summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorMike Bayer <mike_mp@zzzcomputing.com>2014-06-13 13:11:09 -0400
committerMike Bayer <mike_mp@zzzcomputing.com>2014-06-13 13:11:09 -0400
commit120f417b8bd067a2d6f85a833ec3f1458208d837 (patch)
treec0c06ea0c45f31a00ae17e7185fc2cc88e127369
parente4c8e36a2c00129fb141cafc215296e098c1606d (diff)
downloaddogpile-cache-120f417b8bd067a2d6f85a833ec3f1458208d837.tar.gz
- edit the recipe for async orm updates
-rw-r--r--docs/build/conf.py3
-rw-r--r--docs/build/requirements.txt2
-rw-r--r--docs/build/usage.rst89
3 files changed, 42 insertions, 52 deletions
diff --git a/docs/build/conf.py b/docs/build/conf.py
index 1f5aef8..196bff3 100644
--- a/docs/build/conf.py
+++ b/docs/build/conf.py
@@ -29,7 +29,8 @@ import dogpile.cache
# Add any Sphinx extension module names here, as strings. They can be extensions
# coming with Sphinx (named 'sphinx.ext.*') or your custom ones.
-extensions = ['sphinx.ext.autodoc', 'sphinx.ext.intersphinx', 'changelog']
+extensions = ['sphinx.ext.autodoc', 'sphinx.ext.intersphinx',
+ 'changelog', 'sphinx_paramlinks']
changelog_sections = ["feature", "bug"]
diff --git a/docs/build/requirements.txt b/docs/build/requirements.txt
index 9ac766c..eb64044 100644
--- a/docs/build/requirements.txt
+++ b/docs/build/requirements.txt
@@ -1,4 +1,4 @@
mako
dogpile
changelog
-
+sphinx-paramlinks
diff --git a/docs/build/usage.rst b/docs/build/usage.rst
index 0c41554..dadfa92 100644
--- a/docs/build/usage.rst
+++ b/docs/build/usage.rst
@@ -349,15 +349,14 @@ in order to invalidate everything having to do with that id.
print user_fn_two(7)
-Refreshing After Database Update with SQLAlchemy
-------------------------------------------------
+Asynchronous Data Updates with ORM Events
+-----------------------------------------
-If you use SQLAlchemy and cache query results (for example after they are
-processed) you may find yourself in need of invalidating the cache after the
-database has been altered to provide the most up-to-date results. For example,
-if you have a decorated function like this:
+This recipe presents one technique of optimistically pushing new data
+into the cache when an update is sent to a database.
-.. code-block:: python
+Using SQLAlchemy for database querying, suppose a simple cache-decorated
+function returns the results of a database query::
@region.cache_on_arguments()
def get_some_data(argument):
@@ -365,59 +364,49 @@ if you have a decorated function like this:
data = Session().query(DBClass).filter(DBClass.argument == argument).all()
return data
-If you change the database and want this function to return fresh data, you
-could just call ``get_some_data.invalidate(argument)``. However, this only
-leads to the deletion of the old value. A new value is not generated until the
-next call. A common usage for a cache is if fetching the data is time-intense,
-so every once in a while a client has to wait for his new data. Another
-alternative would be to recreate the cache on the function adding new data. For
-example, say this function updates the database:
-
-.. code-block:: python
-
- def add_new_data(argument):
- Session().add(DBClass(argument=argument))
-
-Inside this class we could call the ``refresh`` function to generate new data
-or the ``invalidate`` function to just remove the old data:
-
-.. code-block:: python
-
- def add_new_data(argument):
- # ...
- # Either refresh the data
- get_some_data.refresh(argument)
- # Or invalidate it
- get_some_data.invalidate(argument)
-
-However, this leads to a new issue: Now the client adding new data has to wait.
-Instead, you can use the folloing recipe to regenerate the value after the data
-has been saved to the database. Since you start a new thread, no client has to
-wait. And because of dogpile-locking the cache will return fresh data as soon
-as it is done and return old data before.
-
-.. code-block:: python
-
- def cache_refresh(refresher, *args, **kwargs):
+We would like this particular function to be re-queried when the data
+has changed. We could call ``get_some_data.invalidate(argument, hard=False)``
+at the point at which the data changes, however this only
+leads to the invalidation of the old value; a new value is not generated until
+the next call, and also means at least one client has to block while the
+new value is generated. We could also call
+``get_some_data.refresh(argument)``, which would perform the data refresh
+at that moment, but then the writer is delayed by the re-query.
+
+A third variant is to instead offload the work of refreshing for this query
+into a background thread or process. This can be acheived using
+a system such as the :paramref:`.CacheRegion.async_creation_runner`.
+However, an expedient approach for smaller use cases is to link cache refresh
+operations to the ORM session's commit, as below::
+
+ from sqlalchemy import event
+ from sqlalchemy.orm.Session
+
+ def cache_refresh(session, refresher, *args, **kwargs):
"""
Refresh the functions cache data in a new thread. Starts refreshing only
after the session was committed so all database data is available.
"""
+ assert isinstance(session, Session), \
+ "Need a session, not a sessionmaker or scoped_session"
+ @event.listens_for(session, "after_commit")
def do_refresh(session):
t = Thread(target=refresher, args=args, kwargs=kwargs)
t.daemon = True
t.start()
- event.listen(Session(), "after_commit", do_refresh)
-And its usage:
+Within a sequence of data persistence, ``cache_refresh`` can be called
+given a particular SQLAlchemy ``Session`` and a callable to do the work::
-.. code-block:: python
+ def add_new_data(session, argument):
+ # add some data
+ session.add(something_new(argument))
- def add_new_data(argument):
- # ...
- cache_refresh(get_some_data.refresh, argument)
+ # add a hook to refresh after the Session is committed.
+ cache_refresh(session, get_some_data.refresh, argument)
-This example assumes that you have a threadlocal session factory as is common
-with web applications. Otherwise you might need to pass in your session to this
-function as well.
+Note that the event to refresh the data is associated with the ``Session``
+being used for persistence; however, the actual refresh operation is called
+with a **different** ``Session``, typically one that is local to the refresh
+operation, either through a thread-local registry or via direct instantiation. \ No newline at end of file