summaryrefslogtreecommitdiff
path: root/doc/build/orm/versioning.rst
blob: 87865917cdf46861063c0375936482aacd8f54ea (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
.. _mapper_version_counter:

Configuring a Version Counter
=============================

The :class:`_orm.Mapper` supports management of a :term:`version id column`, which
is a single table column that increments or otherwise updates its value
each time an ``UPDATE`` to the mapped table occurs.  This value is checked each
time the ORM emits an ``UPDATE`` or ``DELETE`` against the row to ensure that
the value held in memory matches the database value.

.. warning::

    Because the versioning feature relies upon comparison of the **in memory**
    record of an object, the feature only applies to the :meth:`.Session.flush`
    process, where the ORM flushes individual in-memory rows to the database.
    It does **not** take effect when performing
    a multirow UPDATE or DELETE using :meth:`_query.Query.update` or :meth:`_query.Query.delete`
    methods, as these methods only emit an UPDATE or DELETE statement but otherwise
    do not have direct access to the contents of those rows being affected.

The purpose of this feature is to detect when two concurrent transactions
are modifying the same row at roughly the same time, or alternatively to provide
a guard against the usage of a "stale" row in a system that might be re-using
data from a previous transaction without refreshing (e.g. if one sets ``expire_on_commit=False``
with a :class:`.Session`, it is possible to re-use the data from a previous
transaction).

.. topic:: Concurrent transaction updates

    When detecting concurrent updates within transactions, it is typically the
    case that the database's transaction isolation level is below the level of
    :term:`repeatable read`; otherwise, the transaction will not be exposed
    to a new row value created by a concurrent update which conflicts with
    the locally updated value.  In this case, the SQLAlchemy versioning
    feature will typically not be useful for in-transaction conflict detection,
    though it still can be used for cross-transaction staleness detection.

    The database that enforces repeatable reads will typically either have locked the
    target row against a concurrent update, or is employing some form
    of multi version concurrency control such that it will emit an error
    when the transaction is committed.  SQLAlchemy's version_id_col is an alternative
    which allows version tracking to occur for specific tables within a transaction
    that otherwise might not have this isolation level set.

    .. seealso::

        `Repeatable Read Isolation Level <https://www.postgresql.org/docs/current/static/transaction-iso.html#XACT-REPEATABLE-READ>`_ - PostgreSQL's implementation of repeatable read, including a description of the error condition.

Simple Version Counting
-----------------------

The most straightforward way to track versions is to add an integer column
to the mapped table, then establish it as the ``version_id_col`` within the
mapper options::

    class User(Base):
        __tablename__ = "user"

        id = mapped_column(Integer, primary_key=True)
        version_id = mapped_column(Integer, nullable=False)
        name = mapped_column(String(50), nullable=False)

        __mapper_args__ = {"version_id_col": version_id}

.. note::  It is **strongly recommended** that the ``version_id`` column
   be made NOT NULL.  The versioning feature **does not support** a NULL
   value in the versioning column.

Above, the ``User`` mapping tracks integer versions using the column
``version_id``.   When an object of type ``User`` is first flushed, the
``version_id`` column will be given a value of "1".   Then, an UPDATE
of the table later on will always be emitted in a manner similar to the
following:

.. sourcecode:: sql

    UPDATE user SET version_id=:version_id, name=:name
    WHERE user.id = :user_id AND user.version_id = :user_version_id
    -- {"name": "new name", "version_id": 2, "user_id": 1, "user_version_id": 1}

The above UPDATE statement is updating the row that not only matches
``user.id = 1``, it also is requiring that ``user.version_id = 1``, where "1"
is the last version identifier we've been known to use on this object.
If a transaction elsewhere has modified the row independently, this version id
will no longer match, and the UPDATE statement will report that no rows matched;
this is the condition that SQLAlchemy tests, that exactly one row matched our
UPDATE (or DELETE) statement.  If zero rows match, that indicates our version
of the data is stale, and a :exc:`.StaleDataError` is raised.

.. _custom_version_counter:

Custom Version Counters / Types
-------------------------------

Other kinds of values or counters can be used for versioning.  Common types include
dates and GUIDs.   When using an alternate type or counter scheme, SQLAlchemy
provides a hook for this scheme using the ``version_id_generator`` argument,
which accepts a version generation callable.  This callable is passed the value of the current
known version, and is expected to return the subsequent version.

For example, if we wanted to track the versioning of our ``User`` class
using a randomly generated GUID, we could do this (note that some backends
support a native GUID type, but we illustrate here using a simple string)::

    import uuid


    class User(Base):
        __tablename__ = "user"

        id = mapped_column(Integer, primary_key=True)
        version_uuid = mapped_column(String(32), nullable=False)
        name = mapped_column(String(50), nullable=False)

        __mapper_args__ = {
            "version_id_col": version_uuid,
            "version_id_generator": lambda version: uuid.uuid4().hex,
        }

The persistence engine will call upon ``uuid.uuid4()`` each time a
``User`` object is subject to an INSERT or an UPDATE.  In this case, our
version generation function can disregard the incoming value of ``version``,
as the ``uuid4()`` function
generates identifiers without any prerequisite value.  If we were using
a sequential versioning scheme such as numeric or a special character system,
we could make use of the given ``version`` in order to help determine the
subsequent value.

.. seealso::

    :ref:`custom_guid_type`

.. _server_side_version_counter:

Server Side Version Counters
----------------------------

The ``version_id_generator`` can also be configured to rely upon a value
that is generated by the database.  In this case, the database would need
some means of generating new identifiers when a row is subject to an INSERT
as well as with an UPDATE.   For the UPDATE case, typically an update trigger
is needed, unless the database in question supports some other native
version identifier.  The PostgreSQL database in particular supports a system
column called `xmin <https://www.postgresql.org/docs/current/static/ddl-system-columns.html>`_
which provides UPDATE versioning.  We can make use
of the PostgreSQL ``xmin`` column to version our ``User``
class as follows::

    from sqlalchemy import FetchedValue


    class User(Base):
        __tablename__ = "user"

        id = mapped_column(Integer, primary_key=True)
        name = mapped_column(String(50), nullable=False)
        xmin = mapped_column("xmin", String, system=True, server_default=FetchedValue())

        __mapper_args__ = {"version_id_col": xmin, "version_id_generator": False}

With the above mapping, the ORM will rely upon the ``xmin`` column for
automatically providing the new value of the version id counter.

.. topic:: creating tables that refer to system columns

    In the above scenario, as ``xmin`` is a system column provided by PostgreSQL,
    we use the ``system=True`` argument to mark it as a system-provided
    column, omitted from the ``CREATE TABLE`` statement.   The datatype of this
    column is an internal PostgreSQL type called ``xid`` which acts mostly
    like a string, so we use the :class:`_types.String` datatype.


The ORM typically does not actively fetch the values of database-generated
values when it emits an INSERT or UPDATE, instead leaving these columns as
"expired" and to be fetched when they are next accessed, unless the ``eager_defaults``
:class:`_orm.Mapper` flag is set.  However, when a
server side version column is used, the ORM needs to actively fetch the newly
generated value.  This is so that the version counter is set up *before*
any concurrent transaction may update it again.   This fetching is also
best done simultaneously within the INSERT or UPDATE statement using :term:`RETURNING`,
otherwise if emitting a SELECT statement afterwards, there is still a potential
race condition where the version counter may change before it can be fetched.

When the target database supports RETURNING, an INSERT statement for our ``User`` class will look
like this:

.. sourcecode:: sql

    INSERT INTO "user" (name) VALUES (%(name)s) RETURNING "user".id, "user".xmin
    -- {'name': 'ed'}

Where above, the ORM can acquire any newly generated primary key values along
with server-generated version identifiers in one statement.   When the backend
does not support RETURNING, an additional SELECT must be emitted for **every**
INSERT and UPDATE, which is much less efficient, and also introduces the possibility of
missed version counters:

.. sourcecode:: sql

    INSERT INTO "user" (name) VALUES (%(name)s)
    -- {'name': 'ed'}

    SELECT "user".version_id AS user_version_id FROM "user" where
    "user".id = :param_1
    -- {"param_1": 1}

It is *strongly recommended* that server side version counters only be used
when absolutely necessary and only on backends that support :term:`RETURNING`,
currently PostgreSQL, Oracle, MariaDB 10.5, SQLite 3.35, and SQL Server.


Programmatic or Conditional Version Counters
--------------------------------------------

When ``version_id_generator`` is set to False, we can also programmatically
(and conditionally) set the version identifier on our object in the same way
we assign any other mapped attribute.  Such as if we used our UUID example, but
set ``version_id_generator`` to ``False``, we can set the version identifier
at our choosing::

    import uuid


    class User(Base):
        __tablename__ = "user"

        id = mapped_column(Integer, primary_key=True)
        version_uuid = mapped_column(String(32), nullable=False)
        name = mapped_column(String(50), nullable=False)

        __mapper_args__ = {"version_id_col": version_uuid, "version_id_generator": False}


    u1 = User(name="u1", version_uuid=uuid.uuid4())

    session.add(u1)

    session.commit()

    u1.name = "u2"
    u1.version_uuid = uuid.uuid4()

    session.commit()

We can update our ``User`` object without incrementing the version counter
as well; the value of the counter will remain unchanged, and the UPDATE
statement will still check against the previous value.  This may be useful
for schemes where only certain classes of UPDATE are sensitive to concurrency
issues::

    # will leave version_uuid unchanged
    u1.name = "u3"
    session.commit()