=====================
Database transactions
=====================

.. module:: django.db.transaction

Django gives you a few ways to control how database transactions are managed.

Managing database transactions
==============================

Django's default transaction behavior
-------------------------------------

Django's default behavior is to run in autocommit mode. Each query is
immediately committed to the database, unless a transaction is active.
:ref:`See below for details <autocommit-details>`.

Django uses transactions or savepoints automatically to guarantee the
integrity of ORM operations that require multiple queries, especially
:ref:`delete() <topics-db-queries-delete>` and :ref:`update()
<topics-db-queries-update>` queries.

Django's :class:`~django.test.TestCase` class also wraps each test in a
transaction for performance reasons.

.. _tying-transactions-to-http-requests:

Tying transactions to HTTP requests
-----------------------------------

A common way to handle transactions on the web is to wrap each request in a
transaction. Set :setting:`ATOMIC_REQUESTS <DATABASE-ATOMIC_REQUESTS>` to
``True`` in the configuration of each database for which you want to enable
this behavior.
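As a minimal sketch, a settings fragment that enables this for one database
might look like the following. The PostgreSQL engine and the database name
``mydatabase`` are placeholders chosen for illustration; only the
``ATOMIC_REQUESTS`` key is taken from the text above.

```python
# settings.py (fragment). ENGINE and NAME are illustrative placeholders.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "mydatabase",
        # Wrap each view call that uses this database in a transaction.
        "ATOMIC_REQUESTS": True,
    },
}
```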

It works like this. Before calling a view function, Django starts a
transaction. If the response is produced without problems, Django commits the
transaction. If the view produces an exception, Django rolls back the
transaction.
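The commit-or-rollback rule above can be sketched outside of Django with
Python's standard-library ``sqlite3`` module. This only illustrates the control
flow and is not Django's implementation; ``call_view_in_transaction``,
``good_view``, ``bad_view`` and the ``hits`` table are hypothetical names.

```python
import sqlite3


def call_view_in_transaction(conn, view):
    """Run view(conn) in a transaction: commit on success, roll back on error."""
    conn.execute("BEGIN")
    try:
        response = view(conn)
    except Exception:
        conn.execute("ROLLBACK")
        raise
    conn.execute("COMMIT")
    return response


def good_view(conn):
    # Hypothetical view that writes a row and returns normally.
    conn.execute("INSERT INTO hits (path) VALUES ('/ok')")
    return "200 OK"


def bad_view(conn):
    # Hypothetical view that writes a row and then fails.
    conn.execute("INSERT INTO hits (path) VALUES ('/broken')")
    raise RuntimeError("view failed")


# isolation_level=None stops sqlite3 from opening transactions implicitly,
# so the BEGIN/COMMIT/ROLLBACK statements above are the only ones issued.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE hits (path TEXT)")

status = call_view_in_transaction(conn, good_view)
try:
    call_view_in_transaction(conn, bad_view)
except RuntimeError:
    pass  # the failed view's INSERT has been rolled back

paths = [row[0] for row in conn.execute("SELECT path FROM hits")]
```

Only the row written by the successful view remains in ``paths``.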

You may perform subtransactions using savepoints in your view code, typically
with the :func:`atomic` context manager. However, at the end of the view,
either all or none of the changes will be committed.

.. warning::

    While the simplicity of this transaction model is appealing, it becomes
    inefficient as traffic increases. Opening a transaction for every view has
    some overhead. The impact on performance depends on the query patterns of your
    application and on how well your database handles locking.

.. admonition:: Per-request transactions and streaming responses

    When a view returns a :class:`~django.http.StreamingHttpResponse`, reading
    the contents of the response will often execute code to generate the
    content. Since the view has already returned, such code runs outside of
    the transaction.

    Generally speaking, it isn't advisable to write to the database while
    generating a streaming response, since there's no sensible way to handle
    errors after starting to send the response.

In practice, this feature simply wraps every view function in the :func:`atomic`
decorator described below.

Note that only the execution of your view is enclosed in the transaction.
Middleware runs outside of the transaction, and so does the rendering of
template responses.

When :setting:`ATOMIC_REQUESTS <DATABASE-ATOMIC_REQUESTS>` is enabled, it's
still possible to prevent views from running in a transaction.

.. function:: non_atomic_requests(using=None)

    This decorator will negate the effect of :setting:`ATOMIC_REQUESTS
    <DATABASE-ATOMIC_REQUESTS>` for a given view::

        from django.db import transaction

        @transaction.non_atomic_requests
        def my_view(request):
            do_stuff()

        @transaction.non_atomic_requests(using='other')
        def my_other_view(request):
            do_stuff_on_the_other_database()

    It only works if it's applied to the view itself.

Controlling transactions explicitly
-----------------------------------

Django provides a single API to control database transactions.

.. function:: atomic(using=None, savepoint=True)

    Atomicity is the defining property of database transactions. ``atomic``
    allows us to create a block of code within which the atomicity on the
    database is guaranteed. If the block of code is successfully completed, the
    changes are committed to the database. If there is an exception, the
    changes are rolled back.

    ``atomic`` blocks can be nested. In this case, when an inner block
    completes successfully, its effects can still be rolled back if an
    exception is raised in the outer block at a later point.

    ``atomic`` is usable both as a :py:term:`decorator`::

        from django.db import transaction

        @transaction.atomic
        def viewfunc(request):
            # This code executes inside a transaction.
            do_stuff()

    and as a :py:term:`context manager`::

        from django.db import transaction

        def viewfunc(request):
            # This code executes in autocommit mode (Django's default).
            do_stuff()

            with transaction.atomic():
                # This code executes inside a transaction.
                do_more_stuff()

    Wrapping ``atomic`` in a try/except block allows for natural handling of
    integrity errors::

        from django.db import IntegrityError, transaction

        @transaction.atomic
        def viewfunc(request):
            create_parent()

            try:
                with transaction.atomic():
                    generate_relationships()
            except IntegrityError:
                handle_exception()

            add_children()

    In this example, even if ``generate_relationships()`` causes a database
    error by breaking an integrity constraint, you can execute queries in
    ``add_children()``, and the changes from ``create_parent()`` are still
    there. Note that any operations attempted in ``generate_relationships()``
    will already have been rolled back safely when ``handle_exception()`` is
    called, so the exception handler can also operate on the database if
    necessary.

    .. admonition:: Avoid catching exceptions inside ``atomic``!

        When exiting an ``atomic`` block, Django looks at whether it's exited
        normally or with an exception to determine whether to commit or roll
        back. If you catch and handle exceptions inside an ``atomic`` block,
        you may hide from Django the fact that a problem has happened. This
        can result in unexpected behavior.

        This is mostly a concern for :exc:`~django.db.DatabaseError` and its
        subclasses such as :exc:`~django.db.IntegrityError`. After such an
        error, the transaction is broken and Django will perform a rollback at
        the end of the ``atomic`` block. If you attempt to run database
        queries before the rollback happens, Django will raise a
        :class:`~django.db.transaction.TransactionManagementError`. You may
        also encounter this behavior when an ORM-related signal handler raises
        an exception.

        The correct way to catch database errors is around an ``atomic`` block
        as shown above. If necessary, add an extra ``atomic`` block for this
        purpose. This pattern has another advantage: it delimits explicitly
        which operations will be rolled back if an exception occurs.

        If you catch exceptions raised by raw SQL queries, Django's behavior
        is unspecified and database-dependent.

    In order to guarantee atomicity, ``atomic`` disables some APIs. Attempting
    to commit, roll back, or change the autocommit state of the database
    connection within an ``atomic`` block will raise an exception.

    ``atomic`` takes a ``using`` argument which should be the name of a
    database. If this argument isn't provided, Django uses the ``"default"``
    database.

    Under the hood, Django's transaction management code:

    - opens a transaction when entering the outermost ``atomic`` block;
    - creates a savepoint when entering an inner ``atomic`` block;
    - releases or rolls back to the savepoint when exiting an inner block;
    - commits or rolls back the transaction when exiting the outermost block.

    You can disable the creation of savepoints for inner blocks by setting the
    ``savepoint`` argument to ``False``. If an exception occurs, Django will
    perform the rollback when exiting the first parent block with a savepoint
    if there is one, and the outermost block otherwise. Atomicity is still
    guaranteed by the outer transaction. This option should only be used if
    the overhead of savepoints is noticeable. It has the drawback of breaking
    the error handling described above.

    You may use ``atomic`` when autocommit is turned off. It will only use
    savepoints, even for the outermost block.

.. admonition:: Performance considerations

    Open transactions have a performance cost for your database server. To
    minimize this overhead, keep your transactions as short as possible. This
    is especially important if you're using :func:`atomic` in long-running
    processes, outside of Django's request / response cycle.
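
The four steps that Django's transaction management code performs for nested
``atomic`` blocks can be illustrated with a toy context manager built on the
standard-library ``sqlite3`` module. This is a simplified sketch, not Django's
implementation: ``ToyAtomic`` and the ``item`` table are hypothetical, and the
``using`` and ``savepoint`` arguments are left out.

```python
import sqlite3
from contextlib import contextmanager


class ToyAtomic:
    """Toy model of nested atomic blocks on a single sqlite3 connection."""

    def __init__(self, conn):
        self.conn = conn
        self.depth = 0   # number of blocks currently entered
        self.log = []    # transaction-control statements issued so far

    def _run(self, sql):
        self.log.append(sql)
        self.conn.execute(sql)

    @contextmanager
    def atomic(self):
        outermost = self.depth == 0
        name = f"s{self.depth}"  # savepoint name, used by inner blocks only
        self._run("BEGIN" if outermost else f"SAVEPOINT {name}")
        self.depth += 1
        try:
            yield
        except Exception:
            self.depth -= 1
            if outermost:
                self._run("ROLLBACK")
            else:
                self._run(f"ROLLBACK TO SAVEPOINT {name}")
                # Toy-model choice: release afterwards so the savepoint
                # doesn't linger on the stack.
                self._run(f"RELEASE SAVEPOINT {name}")
            raise
        else:
            self.depth -= 1
            self._run("COMMIT" if outermost else f"RELEASE SAVEPOINT {name}")


conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE item (name TEXT)")
tx = ToyAtomic(conn)

with tx.atomic():
    conn.execute("INSERT INTO item VALUES ('parent')")
    try:
        with tx.atomic():
            conn.execute("INSERT INTO item VALUES ('child')")
            raise ValueError("inner block fails")
    except ValueError:
        pass  # caught around the inner block, so the outer block continues

names = [row[0] for row in conn.execute("SELECT name FROM item")]
```

Here ``tx.log`` records one ``BEGIN``/``COMMIT`` pair for the outermost block
and savepoint statements for the inner one, and only ``'parent'`` is stored.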

Autocommit
==========

.. _autocommit-details:

Why Django uses autocommit
--------------------------

In the SQL standards, each SQL query starts a transaction, unless one is
already active. Such transactions must then be explicitly committed or rolled
back.

This isn't always convenient for application developers. To alleviate this
problem, most databases provide an autocommit mode. When autocommit is turned
on and no transaction is active, each SQL query gets wrapped in its own
transaction. In other words, not only does each such query start a
transaction, but the transaction also gets automatically committed or rolled
back, depending on whether the query succeeded.
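To see the difference between autocommit and an explicit transaction, here is
a small experiment using the standard-library ``sqlite3`` module with two
connections to the same file. It's a sketch that assumes SQLite's behavior is
representative; the ``note`` table and the ``count_notes`` helper are
hypothetical.

```python
import os
import sqlite3
import tempfile


def count_notes(conn):
    # Hypothetical helper: number of rows visible to this connection.
    return conn.execute("SELECT COUNT(*) FROM note").fetchone()[0]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.sqlite3")
    # isolation_level=None leaves SQLite in its native autocommit mode.
    writer = sqlite3.connect(path, isolation_level=None)
    reader = sqlite3.connect(path, isolation_level=None)
    writer.execute("CREATE TABLE note (body TEXT)")

    # No transaction is active: the INSERT is committed as soon as it
    # succeeds, so the other connection sees the row right away.
    writer.execute("INSERT INTO note VALUES ('autocommitted')")
    seen_after_autocommit = count_notes(reader)

    # Inside an explicit transaction, the new row stays invisible to the
    # other connection until COMMIT.
    writer.execute("BEGIN")
    writer.execute("INSERT INTO note VALUES ('pending')")
    seen_before_commit = count_notes(reader)
    writer.execute("COMMIT")
    seen_after_commit = count_notes(reader)

    writer.close()
    reader.close()
```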

:pep:`249`, the Python Database API Specification v2.0, requires autocommit to
be initially turned off. Django overrides this default and turns autocommit
on.

To avoid this, you can :ref:`deactivate the transaction management
<deactivate-transaction-management>`, but it isn't recommended.

.. _deactivate-transaction-management:

Deactivating transaction management
-----------------------------------

You can totally disable Django's transaction management for a given database
by setting :setting:`AUTOCOMMIT <DATABASE-AUTOCOMMIT>` to ``False`` in its
configuration. If you do this, Django won't enable autocommit, and won't
perform any commits. You'll get the regular behavior of the underlying
database library.

This requires you to commit explicitly every transaction, even those started
by Django or by third-party libraries. Thus, this is best used in situations
where you want to run your own transaction-controlling middleware or do
something really strange.
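For reference, a settings fragment that turns off Django's transaction
management for one database might look like this. The engine and database name
are illustrative placeholders; only the ``AUTOCOMMIT`` key comes from the text
above.

```python
# settings.py (fragment). ENGINE and NAME are illustrative placeholders.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "mydatabase",
        # Django won't enable autocommit and won't perform any commits.
        "AUTOCOMMIT": False,
    },
}
```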

Performing actions after commit
===============================

.. versionadded:: 1.9

Sometimes you need to perform an action related to the current database
transaction, but only if the transaction successfully commits. Examples might
include a `Celery`_ task, an email notification, or a cache invalidation.

.. _Celery: http://www.celeryproject.org/

Django provides the :func:`on_commit` function to register callback functions
that should be executed after a transaction is successfully committed:

.. function:: on_commit(func, using=None)

Pass any function (that takes no arguments) to :func:`on_commit`::

    from django.db import transaction

    def do_something():
        pass  # send a mail, invalidate a cache, fire off a Celery task, etc.

    transaction.on_commit(do_something)

You can also wrap your function in a lambda::

    transaction.on_commit(lambda: some_celery_task.delay('arg1'))

The function you pass in is called as soon as a hypothetical database write
made at the point where ``on_commit()`` is called would have been successfully
committed.

If you call ``on_commit()`` while there isn't an active transaction, the
callback will be executed immediately.

If that hypothetical database write is instead rolled back (typically when an
unhandled exception is raised in an :func:`atomic` block), your function will
be discarded and never called.
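The three rules above (deferred until commit, immediate when no transaction is
active, discarded on rollback) can be modelled in a few lines of plain Python.
``ToyConnection`` is a hypothetical stand-in written for this illustration; it
doesn't talk to a database and isn't how Django implements :func:`on_commit`.

```python
class ToyConnection:
    """Toy model of the on-commit rules; hypothetical, not Django's code."""

    def __init__(self):
        self.in_transaction = False
        self.pending = []  # callbacks waiting for the commit

    def begin(self):
        self.in_transaction = True

    def on_commit(self, func):
        if self.in_transaction:
            self.pending.append(func)  # defer until the commit
        else:
            func()  # no active transaction: run immediately

    def commit(self):
        self.in_transaction = False
        callbacks, self.pending = self.pending, []
        for func in callbacks:
            func()

    def rollback(self):
        self.in_transaction = False
        self.pending = []  # discarded and never called


calls = []
conn = ToyConnection()

conn.on_commit(lambda: calls.append("immediate"))  # runs right away

conn.begin()
conn.on_commit(lambda: calls.append("after commit"))
calls_before_commit = list(calls)  # the deferred callback hasn't run yet
conn.commit()

conn.begin()
conn.on_commit(lambda: calls.append("never"))
conn.rollback()
```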

Savepoints
----------

Savepoints (i.e. nested :func:`atomic` blocks) are handled correctly. That is,
an :func:`on_commit` callable registered after a savepoint (in a nested
:func:`atomic` block) will be called after the outer transaction is committed,
but not if a rollback to that savepoint or any previous savepoint occurred
during the transaction::

    with transaction.atomic():  # Outer atomic, start a new transaction
        transaction.on_commit(foo)

        with transaction.atomic():  # Inner atomic block, create a savepoint
            transaction.on_commit(bar)

    # foo() and then bar() will be called when leaving the outermost block

On the other hand, when a savepoint is rolled back (due to an exception being
raised), the inner callable will not be called::

    with transaction.atomic():  # Outer atomic, start a new transaction
        transaction.on_commit(foo)

        try:
            with transaction.atomic():  # Inner atomic block, create a savepoint
                transaction.on_commit(bar)
                raise SomeError()  # Raising an exception - abort the savepoint
        except SomeError:
            pass

    # foo() will be called, but not bar()
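
One way to picture this bookkeeping is to remember, for each callback, which
savepoints were open when it was registered. The following plain-Python model
is a sketch under that assumption; ``ToyTransaction`` and its methods are
hypothetical and aren't presented as Django's implementation.

```python
import itertools


class ToyTransaction:
    """Toy model of on-commit callbacks combined with savepoints."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.open_savepoints = []  # IDs not yet released or rolled back
        self.pending = []          # (savepoints open at registration, callback)

    def savepoint(self):
        sid = next(self._ids)
        self.open_savepoints.append(sid)
        return sid

    def savepoint_commit(self, sid):
        # Releasing keeps the callbacks registered since the savepoint.
        self.open_savepoints.remove(sid)

    def savepoint_rollback(self, sid):
        # Drop every callback registered while this savepoint was open.
        self.pending = [(sids, func) for sids, func in self.pending if sid not in sids]
        self.open_savepoints.remove(sid)

    def on_commit(self, func):
        self.pending.append((frozenset(self.open_savepoints), func))

    def commit(self):
        callbacks, self.pending = self.pending, []
        for _, func in callbacks:  # registration order
            func()


calls = []
tx = ToyTransaction()
tx.on_commit(lambda: calls.append("foo"))

sid = tx.savepoint()                       # inner block that fails
tx.on_commit(lambda: calls.append("bar"))
tx.savepoint_rollback(sid)

sid = tx.savepoint()                       # inner block that succeeds
tx.on_commit(lambda: calls.append("baz"))
tx.savepoint_commit(sid)

tx.commit()
```

After the commit, ``calls`` contains ``"foo"`` and ``"baz"`` in that order;
``"bar"`` was dropped together with its savepoint.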

Order of execution
------------------

On-commit functions for a given transaction are executed in the order they were
registered.

Exception handling
------------------

If one on-commit function within a given transaction raises an uncaught
exception, no later registered functions in that same transaction will run.
This is, of course, the same behavior as if you'd executed the functions
sequentially yourself without :func:`on_commit`.
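Put differently, the behavior matches a plain loop over the registered
functions. The sketch below uses hypothetical names (``run_callbacks``,
``first``, ``second``, ``third``) and no database at all; it only shows what
"executed sequentially" implies for ordering and for exceptions.

```python
def run_callbacks(callbacks):
    """Call each function in registration order; an exception ends the loop."""
    for func in callbacks:
        func()


calls = []


def first():
    calls.append("first")


def second():
    calls.append("second")
    raise RuntimeError("second failed")


def third():
    calls.append("third")  # never reached in this example


try:
    run_callbacks([first, second, third])
except RuntimeError as exc:
    error = str(exc)
```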

Timing of execution
-------------------

Your callbacks are executed *after* a successful commit, so a failure in a
callback will not cause the transaction to roll back. They are executed
conditionally upon the success of the transaction, but they are not *part* of
the transaction. For the intended use cases (mail notifications, Celery tasks,
etc.), this should be fine. If it's not (if your follow-up action is so
critical that its failure should mean the failure of the transaction itself),
then you don't want to use the :func:`on_commit` hook. Instead, you may want
`two-phase commit`_ such as the `psycopg Two-Phase Commit protocol support`_
and the `optional Two-Phase Commit Extensions in the Python DB-API
specification`_.

Callbacks are not run until autocommit is restored on the connection following
the commit (because otherwise any queries done in a callback would open an
implicit transaction, preventing the connection from going back into autocommit
mode).

When in autocommit mode and outside of an :func:`atomic` block, the function
will run immediately, not on commit.

On-commit functions only work with :ref:`autocommit mode <managing-autocommit>`
and the :func:`atomic` (or :setting:`ATOMIC_REQUESTS
<DATABASE-ATOMIC_REQUESTS>`) transaction API. Calling :func:`on_commit` when
autocommit is disabled and you are not within an atomic block will result in an
error.

.. _two-phase commit: https://en.wikipedia.org/wiki/Two-phase_commit_protocol
.. _psycopg Two-Phase Commit protocol support: http://initd.org/psycopg/docs/usage.html#tpc
.. _optional Two-Phase Commit Extensions in the Python DB-API specification: https://www.python.org/dev/peps/pep-0249/#optional-two-phase-commit-extensions

Use in tests
------------

Django's :class:`~django.test.TestCase` class wraps each test in a transaction
and rolls back that transaction after each test, in order to provide test
isolation. This means that no transaction is ever actually committed, thus your
:func:`on_commit` callbacks will never be run. If you need to test the results
of an :func:`on_commit` callback, use a
:class:`~django.test.TransactionTestCase` instead.

Why no rollback hook?
---------------------

A rollback hook is harder to implement robustly than a commit hook, since a
variety of things can cause an implicit rollback.

For instance, if your database connection is dropped because your process was
killed without a chance to shut down gracefully, your rollback hook will never
run.

The solution is simple: instead of doing something during the atomic block
(transaction) and then undoing it if the transaction fails, use
:func:`on_commit` to delay doing it in the first place until after the
transaction succeeds. It's a lot easier to undo something you never did in the
first place!

Low-level APIs
==============

.. warning::

    Always prefer :func:`atomic` if possible at all. It accounts for the
    idiosyncrasies of each database and prevents invalid operations.

    The low-level APIs are only useful if you're implementing your own
    transaction management.

.. _managing-autocommit:

Autocommit
----------

Django provides a straightforward API in the :mod:`django.db.transaction`
module to manage the autocommit state of each database connection.

.. function:: get_autocommit(using=None)

.. function:: set_autocommit(autocommit, using=None)

These functions take a ``using`` argument which should be the name of a
database. If it isn't provided, Django uses the ``"default"`` database.

Autocommit is initially turned on. If you turn it off, it's your
responsibility to restore it.

Once you turn autocommit off, you get the default behavior of your database
adapter, and Django won't help you. Although that behavior is specified in
:pep:`249`, implementations of adapters aren't always consistent with one
another. Review the documentation of the adapter you're using carefully.

You must ensure that no transaction is active, usually by issuing a
:func:`commit` or a :func:`rollback`, before turning autocommit back on.

Django will refuse to turn autocommit off when an :func:`atomic` block is
active, because that would break atomicity.

Transactions
------------

A transaction is an atomic set of database queries. Even if your program
crashes, the database guarantees that either all the changes will be applied,
or none of them.

Django doesn't provide an API to start a transaction. The expected way to
start a transaction is to disable autocommit with :func:`set_autocommit`.

Once you're in a transaction, you can choose either to apply the changes
you've performed until this point with :func:`commit`, or to cancel them with
:func:`rollback`. These functions are defined in :mod:`django.db.transaction`.

.. function:: commit(using=None)

.. function:: rollback(using=None)

These functions take a ``using`` argument which should be the name of a
database. If it isn't provided, Django uses the ``"default"`` database.

Django will refuse to commit or to roll back when an :func:`atomic` block is
active, because that would break atomicity.
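The choice between committing and rolling back can be tried at the adapter
level with the standard-library ``sqlite3`` module, which opens a transaction
implicitly when autocommit is off. This is a sketch of the underlying adapter
behavior, not of Django's :func:`commit` and :func:`rollback`; the ``account``
table is hypothetical.

```python
import sqlite3

# A non-None isolation_level makes sqlite3 open a transaction implicitly
# before data-modifying statements such as INSERT.
conn = sqlite3.connect(":memory:", isolation_level="DEFERRED")
conn.execute("CREATE TABLE account (name TEXT)")
conn.commit()

conn.execute("INSERT INTO account VALUES ('kept')")
opened_implicitly = conn.in_transaction  # a transaction is now active
conn.commit()    # apply the changes made so far

conn.execute("INSERT INTO account VALUES ('discarded')")
conn.rollback()  # cancel the changes made since the last commit

names = [row[0] for row in conn.execute("SELECT name FROM account")]
```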

.. _topics-db-transactions-savepoints:

Savepoints
----------

A savepoint is a marker within a transaction that enables you to roll back
part of a transaction, rather than the full transaction. Savepoints are
available with the SQLite (≥ 3.6.8), PostgreSQL, Oracle and MySQL (when using
the InnoDB storage engine) backends. Other backends provide the savepoint
functions, but they're empty operations -- they don't actually do anything.

Savepoints aren't especially useful if you are using autocommit, the default
behavior of Django. However, once you open a transaction with :func:`atomic`,
you build up a series of database operations awaiting a commit or rollback. If
you issue a rollback, the entire transaction is rolled back. Savepoints
provide the ability to perform a fine-grained rollback, rather than the full
rollback that would be performed by ``transaction.rollback()``.

When the :func:`atomic` decorator is nested, it creates a savepoint to allow
partial commit or rollback. You're strongly encouraged to use :func:`atomic`
rather than the functions described below, but they're still part of the
public API, and there's no plan to deprecate them.

Each of these functions takes a ``using`` argument which should be the name of
a database for which the behavior applies. If no ``using`` argument is
provided then the ``"default"`` database is used.

Savepoints are controlled by three functions in :mod:`django.db.transaction`:

.. function:: savepoint(using=None)

    Creates a new savepoint. This marks a point in the transaction that is
    known to be in a "good" state. Returns the savepoint ID (``sid``).

.. function:: savepoint_commit(sid, using=None)

    Releases savepoint ``sid``. The changes performed since the savepoint was
    created become part of the transaction.

.. function:: savepoint_rollback(sid, using=None)

    Rolls back the transaction to savepoint ``sid``.

These functions do nothing if savepoints aren't supported or if the database
is in autocommit mode.
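At the SQL level, the three functions correspond roughly to ``SAVEPOINT``,
``RELEASE SAVEPOINT`` and ``ROLLBACK TO SAVEPOINT``. The sketch below mirrors
their names on top of the standard-library ``sqlite3`` module. The helpers take
a connection instead of a ``using`` argument and are hypothetical stand-ins,
not Django's functions.

```python
import itertools
import sqlite3

_counter = itertools.count(1)  # generates unique savepoint IDs


def savepoint(conn):
    """Create a savepoint and return its ID."""
    sid = f"sp_{next(_counter)}"
    conn.execute(f"SAVEPOINT {sid}")
    return sid


def savepoint_commit(conn, sid):
    """Release the savepoint; its changes stay in the transaction."""
    conn.execute(f"RELEASE SAVEPOINT {sid}")


def savepoint_rollback(conn, sid):
    """Undo everything done since the savepoint was created."""
    conn.execute(f"ROLLBACK TO SAVEPOINT {sid}")


conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE entry (label TEXT)")

conn.execute("BEGIN")  # savepoints are used inside a transaction here
conn.execute("INSERT INTO entry VALUES ('a')")

sid = savepoint(conn)
conn.execute("INSERT INTO entry VALUES ('b')")
savepoint_rollback(conn, sid)  # undoes b only

sid = savepoint(conn)
conn.execute("INSERT INTO entry VALUES ('c')")
savepoint_commit(conn, sid)    # c becomes part of the transaction

conn.execute("COMMIT")
labels = [row[0] for row in conn.execute("SELECT label FROM entry ORDER BY label")]
```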

In addition, there's a utility function:

.. function:: clean_savepoints(using=None)

    Resets the counter used to generate unique savepoint IDs.

The following example demonstrates the use of savepoints::

    from django.db import transaction

    # open a transaction
    @transaction.atomic
    def viewfunc(request):

        a.save()
        # transaction now contains a.save()

        sid = transaction.savepoint()

        b.save()
        # transaction now contains a.save() and b.save()

        if want_to_keep_b:
            transaction.savepoint_commit(sid)
            # open transaction still contains a.save() and b.save()
        else:
            transaction.savepoint_rollback(sid)
            # open transaction now contains only a.save()

Savepoints may be used to recover from a database error by performing a partial
rollback. If you're doing this inside an :func:`atomic` block, the entire block
will still be rolled back, because it doesn't know you've handled the situation
at a lower level! To prevent this, you can control the rollback behavior with
the following functions.

.. function:: get_rollback(using=None)

.. function:: set_rollback(rollback, using=None)

Setting the rollback flag to ``True`` forces a rollback when exiting the
innermost atomic block. This may be useful to trigger a rollback without
raising an exception.

Setting it to ``False`` prevents such a rollback. Before doing that, make sure
you've rolled back the transaction to a known-good savepoint within the current
atomic block! Otherwise you're breaking atomicity and data corruption may
occur.

Database-specific notes
=======================

.. _savepoints-in-sqlite:

Savepoints in SQLite
--------------------

While SQLite ≥ 3.6.8 supports savepoints, a flaw in the design of the
:mod:`sqlite3` module makes them hardly usable.

When autocommit is enabled, savepoints don't make sense. When it's disabled,
:mod:`sqlite3` commits implicitly before savepoint statements. (In fact, it
commits before any statement other than ``SELECT``, ``INSERT``, ``UPDATE``,
``DELETE`` and ``REPLACE``.) This bug has two consequences:

- The low-level APIs for savepoints are only usable inside a transaction, i.e.
  inside an :func:`atomic` block.
- It's impossible to use :func:`atomic` when autocommit is turned off.

Transactions in MySQL
---------------------

If you're using MySQL, your tables may or may not support transactions; it
depends on your MySQL version and the table types you're using. (By
"table types," we mean something like "InnoDB" or "MyISAM".) MySQL transaction
peculiarities are outside the scope of this article, but the MySQL site has
`information on MySQL transactions`_.

If your MySQL setup does *not* support transactions, then Django will always
function in autocommit mode: statements will be executed and committed as soon
as they're called. If your MySQL setup *does* support transactions, Django
will handle transactions as explained in this document.

.. _information on MySQL transactions: https://dev.mysql.com/doc/refman/5.6/en/sql-syntax-transactions.html

Handling exceptions within PostgreSQL transactions
--------------------------------------------------

.. note::

    This section is relevant only if you're implementing your own transaction
    management. This problem cannot occur in Django's default mode and
    :func:`atomic` handles it automatically.

Inside a transaction, when a call to a PostgreSQL cursor raises an exception
(typically ``IntegrityError``), all subsequent SQL in the same transaction
will fail with the error "current transaction is aborted, queries ignored
until end of transaction block". While simple use of ``save()`` is unlikely
to raise an exception in PostgreSQL, there are more advanced usage patterns
which might, such as saving objects with unique fields, saving using the
``force_insert``/``force_update`` flags, or invoking custom SQL.

There are several ways to recover from this sort of error.

Transaction rollback
~~~~~~~~~~~~~~~~~~~~

The first option is to roll back the entire transaction. For example::

    from django.db import IntegrityError, transaction

    a.save()  # Succeeds, but may be undone by transaction rollback
    try:
        b.save()  # Could throw exception
    except IntegrityError:
        transaction.rollback()
    c.save()  # Succeeds, but a.save() may have been undone

Calling ``transaction.rollback()`` rolls back the entire transaction. Any
uncommitted database operations will be lost. In this example, the changes
made by ``a.save()`` would be lost, even though that operation raised no error
itself.

Savepoint rollback
~~~~~~~~~~~~~~~~~~

You can use :ref:`savepoints <topics-db-transactions-savepoints>` to control
the extent of a rollback. Before performing a database operation that could
fail, you can set or update the savepoint; that way, if the operation fails,
you can roll back the single offending operation, rather than the entire
transaction. For example::

    from django.db import IntegrityError, transaction

    a.save()  # Succeeds, and never undone by savepoint rollback
    sid = transaction.savepoint()
    try:
        b.save()  # Could throw exception
        transaction.savepoint_commit(sid)
    except IntegrityError:
        transaction.savepoint_rollback(sid)
    c.save()  # Succeeds, and a.save() is never undone

In this example, ``a.save()`` will not be undone in the case where
``b.save()`` raises an exception.
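To compare the extent of the two rollbacks side by side, here is a sketch using
the standard-library ``sqlite3`` module. SQLite, unlike PostgreSQL, doesn't
abort the whole transaction after a failed statement, so this shows only what
each strategy undoes, not the "current transaction is aborted" error. The
``run`` function and the ``tag`` table are hypothetical.

```python
import sqlite3


def run(strategy):
    """Return the stored names after recovering with the given strategy."""
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE tag (name TEXT UNIQUE)")
    conn.execute("INSERT INTO tag VALUES ('taken')")

    conn.execute("BEGIN")
    conn.execute("INSERT INTO tag VALUES ('a')")          # plays a.save()
    if strategy == "savepoint":
        conn.execute("SAVEPOINT before_b")
    try:
        conn.execute("INSERT INTO tag VALUES ('taken')")  # plays b.save(); violates UNIQUE
    except sqlite3.IntegrityError:
        if strategy == "savepoint":
            conn.execute("ROLLBACK TO SAVEPOINT before_b")  # undoes b only
        else:
            conn.execute("ROLLBACK")                        # undoes a as well
    conn.execute("INSERT INTO tag VALUES ('c')")          # plays c.save()
    if conn.in_transaction:
        conn.execute("COMMIT")

    names = sorted(row[0] for row in conn.execute("SELECT name FROM tag"))
    conn.close()
    return names


after_full_rollback = run("transaction")
after_savepoint_rollback = run("savepoint")
```

With the full rollback, ``'a'`` is lost; with the savepoint rollback, it
survives alongside ``'c'``.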