delta/python-packages/sqlalchemy.git/lib/sqlalchemy/sql/annotation.py, branch review/mike_bayer/tutorial20

Turn on caching everywhere, add logging

2020-06-10T19:29:01+00:00

A variety of caching issues found by running
all tests with statement caching turned on.

The cache system now has a more conservative approach where
any subclass of a SQL element will by default invalidate
the cache key unless it adds the flag inherit_cache=True
at the class level, or if it implements its own caching.

Add working caching to a few elements that were
omitted previously; fix some caching implementations
to suit lesser used edge cases such as json casts
and array slices.

Refine the way BaseCursorResult and CursorMetaData
interact with caching; to suit cases like Alembic
modifying table structures, don't cache the
cursor metadata if it were created against a
cursor.description using non-positional matching,
e.g. "select *".   if a table re-ordered its columns
or added/removed, now that data is obsolete.

Additionally we have to adapt the cursor metadata
_keymap regardless of if we just processed
cursor.description, because if we ran against
a cached SQLCompiler we won't have the right
columns in _keymap.

Other refinements to how and when we do this
adaption as some weird cases
were exposed in the Postgresql dialect,
a text() construct that names just one column that
is not actually in the statement.   Fixed that
also as it looks like a cut-and-paste artifact
that doesn't actually affect anything.

Various issues with re-use of compiled result maps
and cursor metadata in conjunction with tables being
changed, such as change in order of columns.

mappers can be cleared but the class remains, meaning
a mapper has to use itself as the cache key not the class.

lots of bound parameter / literal issues, due to Alembic
creating a straight subclass of bindparam that renders
inline directly.   While we can update Alembic to not
do this, we have to assume other people might be doing
this, so bindparam() implements the inherit_cache=True
logic as well that was a bit involved.

turn on cache stats in logging.

Includes a fix to subqueryloader which moves all setup to
the create_row_processor() phase and elminates any storage
within the compiled context.   This includes some changes
to create_row_processor() signature and a revising of the
technique used to determine if the loader can participate
in polymorphic queries, which is also applied to
selectinloading.

DML update.values() and ordered_values() now coerces the
keys as we have tests that pass an arbitrary class here
which only includes __clause_element__(), so the
key can't be cached unless it is coerced.  this in turn
changed how composite attributes support bulk update
to use the standard approach of ClauseElement with
annotations that are parsed in the ORM context.

memory profiling successfully caught that the Session
from Query was getting passed into _statement_20()
so that was a big win for that test suite.

Apparently Compiler had .execute() and .scalar() methods
stuck on it, these date back to version 0.4 and there
was a single test in the PostgreSQL dialect tests
that exercised it for no apparent reason.   Removed
these methods as well as the concept of a Compiler
holding onto a "bind".

Fixes: #5386

Change-Id: I990b43aab96b42665af1b2187ad6020bee778784

Convert execution to move through Session

2020-05-25T17:56:37+00:00

This patch replaces the ORM execution flow with a
single pathway through Session.execute() for all queries,
including Core and ORM.

Currently included is full support for ORM Query,
Query.from_statement(), select(), as well as the
baked query and horizontal shard systems.  Initial
changes have also been made to the dogpile caching
example, which like baked query makes use of a
new ORM-specific execution hook that replaces the
use of both QueryEvents.before_compile() as well
as Query._execute_and_instances() as the central
ORM interception hooks.

select() and Query() constructs alike can be passed to
Session.execute() where they will return ORM
results in a Results object.   This API is currently
used internally by Query.   Full support for
Session.execute()->results to behave in a fully
2.0 fashion will be in later changesets.

bulk update/delete with ORM support will also
be delivered via the update() and delete()
constructs, however these have not yet been adapted
to the new system and may follow in a subsequent
update.

Performance is also beginning to lag as of this
commit and some previous ones.   It is hoped that
a few central functions such as the coercions
functions can be rewritten in C to re-gain
performance.  Additionally, query caching
is now available and some subsequent patches
will attempt to cache more of the per-execution
work from the ORM layer, e.g. column getters
and adapters.

This patch also contains initial "turn on" of the
caching system enginewide via the query_cache_size
parameter to create_engine(). Still defaulting at
zero for "no caching".   The caching system still
needs adjustments in order to gain adequate performance.

Change-Id: I047a7ebb26aa85dc01f6789fac2bff561dcd555d

Unify Query and select() , move all processing to compile phase

2020-05-24T15:54:08+00:00

Convert Query to do virtually all compile state computation
in the _compile_context() phase, and organize it all
such that a plain select() construct may also be used as the
source of information in order to generate ORM query state.
This makes it such that Query is not needed except for
its additional methods like from_self() which are all to
be deprecated.

The construction of ORM state will occur beyond the
caching boundary when the new execution model is integrated.

future select() gains a working join() and filter_by() method.
as we continue to rebase and merge each commit in the steps,
callcounts continue to bump around.  will have to look at
the final result when it's all in.

References: #5159
References: #4705
References: #4639
References: #4871
References: #5010

Change-Id: I19e05b3424b07114cce6c439b05198ac47f7ac10

Run search and replace of symbolic module names

2020-04-14T17:15:21+00:00

Replaces a wide array of Sphinx-relative doc references
with an abbreviated absolute form now supported by
zzzeeksphinx.

Change-Id: I94bffcc3f37885ffdde6238767224296339698a2

Try to measure new style caching in the ORM, take two

2020-04-01T20:12:23+00:00

Supercedes: If78fbb557c6f2cae637799c3fec2cbc5ac248aaf

Trying to see if by making the cache key memoized, we
still can have the older "identity" form of caching
which is the cheapest of all, at the same time as the
newer "cache key each time" version that is not nearly
as cheap; but still much cheaper than no caching at all.

Also needed is a per-execution update of _keymap when
we invoke from a cached select, so that Column objects
that are anonymous or otherwise adapted will match up.
this is analogous to the adaption of bound parameters
from the cache key.

Adds test coverage for the keymap / construct_params()
 changes related to caching.  Also hones performance
to a large extent for statement construction and
cache key generation.

Also includes a new memoized attribute
approach that vastly simplifies the previous approach
of "group_expirable_memoized_property" and finally
integrates cleanly with _clone(), _generate(), etc.
no more hardcoding of attributes is needed, as well
as that most _reset_memoization() calls are no longer
needed as the reset is inherent in a _generate() call;
this also has dramatic performance improvements.

Change-Id: I95c560ffcbfa30b26644999412fb6a385125f663

Ensure descendants of mixins don't become cacheable

2020-02-22T18:11:20+00:00

HasPrefix / HasSuffixes / SupportsCloneAnnotations exported
a _traverse_internals attribute that does not represent a
complete traversal, meaning non-traversible subclasses would
seem traversible.  rename these attributes so that this
does not occur.  DML is currently not traversible (will be soon).

Change-Id: I2605e61c8c3d49965335e66e09f4aeedc5e73bd3

happy new year

2020-01-01T17:09:47+00:00

Change-Id: I08440dc25e40ea1ccea1778f6ee9e28a00808235

Test for short term reference cycles and resolve as many as possible

2019-12-30T19:07:18+00:00

Added test support and repaired a wide variety of unnecessary reference
cycles created for short-lived objects, mostly in the area of ORM queries.

Fixes: #5056
Change-Id: Ifd93856eba550483f95f9ae63d49f36ab068b85a

Traversal and clause generation performance improvements

2019-12-14T19:28:01+00:00

Added one traversal test, callcounts have been brought from 29754 to
5173 so far.

Change-Id: I164e9831600709ee214c1379bb215fdad73b39aa

Add anonymizing context to cache keys, comparison; convert traversal

2019-11-04T18:22:43+00:00

Created new visitor system called "internal traversal" that
applies a data driven approach to the concept of a class that
defines its own traversal steps, in contrast to the existing
style of traversal now known as "external traversal" where
the visitor class defines the traversal, i.e. the SQLCompiler.

The internal traversal system now implements get_children(),
_copy_internals(), compare() and _cache_key() for most Core elements.
Core elements with special needs like Select still implement
some of these methods directly however most of these methods
are no longer explicitly implemented.

The data-driven system is also applied to ORM elements that
take part in SQL expressions so that these objects, like mappers,
aliasedclass, query options, etc. can all participate in the
cache key process.

Still not considered is that this approach to defining traversibility
will be used to create some kind of generic introspection system
that works across Core / ORM.  It's also not clear if
real statement caching using the _cache_key() method is feasible,
if it is shown that running _cache_key() is nearly as expensive as
compiling in any case.    Because it is data driven, it is more
straightforward to optimize using inlined code, as is the case now,
as well as potentially using C code to speed it up.

In addition, the caching sytem now accommodates for anonymous
name labels, which is essential so that constructs which have
anonymous labels can be cacheable, that is, their position
within a statement in relation to other anonymous names causes
them to generate an integer counter relative to that construct
which will be the same every time.   Gathering of bound parameters
from any cache key generation is also now required as there is
no use case for a cache key that does not extract bound parameter
values.

Applies-to: #4639
Change-Id: I0660584def8627cad566719ee98d3be045db4b8d