author     Mike Bayer <mike_mp@zzzcomputing.com>    2012-10-25 15:20:36 -0400
committer  Mike Bayer <mike_mp@zzzcomputing.com>    2012-10-25 15:20:36 -0400
commit     67f8e6a7fafe4b38e64b53b932b1413639ced401 (patch)
tree       d30a3399f311e57b934404ed04150dc5866b933b /doc/build/changelog/migration_06.rst
parent     7c6bde1c15c97cae34ef9449aa595168910fe87d (diff)
download   sqlalchemy-67f8e6a7fafe4b38e64b53b932b1413639ced401.tar.gz
add changelogs
Diffstat (limited to 'doc/build/changelog/migration_06.rst')
 -rw-r--r--  doc/build/changelog/migration_06.rst | 1228
 1 file changed, 1228 insertions(+), 0 deletions(-)
diff --git a/doc/build/changelog/migration_06.rst b/doc/build/changelog/migration_06.rst
new file mode 100644
index 000000000..d561452e2
--- /dev/null
+++ b/doc/build/changelog/migration_06.rst
@@ -0,0 +1,1228 @@

==============================
What's New in SQLAlchemy 0.6?
==============================

.. admonition:: About this Document

    This document describes changes between SQLAlchemy version 0.5,
    last released January 16, 2010, and SQLAlchemy version 0.6,
    last released May 5, 2012.

    Document date: June 6, 2010

This guide documents API changes which affect users
migrating their applications from the 0.5 series of
SQLAlchemy to 0.6. Note that SQLAlchemy 0.6 removes some
behaviors which were deprecated throughout the span of the
0.5 series, and also deprecates more behaviors specific to
0.5.

Platform Support
================

* cPython versions 2.4 and upwards throughout the 2.xx
  series

* Jython 2.5.1 - using the zxJDBC DBAPI included with
  Jython.

* cPython 3.x - see [source:sqlalchemy/trunk/README.py3k]
  for information on how to build for python3.

New Dialect System
==================

Dialect modules are now broken up into distinct
subcomponents, within the scope of a single database
backend. Dialect implementations are now in the
``sqlalchemy.dialects`` package. The
``sqlalchemy.databases`` package still exists as a
placeholder to provide some level of backwards compatibility
for simple imports.

For each supported database, a sub-package exists within
``sqlalchemy.dialects`` where several files are contained.
Each package contains a module called ``base.py`` which
defines the specific SQL dialect used by that database. It
also contains one or more "driver" modules, each one
corresponding to a specific DBAPI - these files are named
corresponding to the DBAPI itself, such as ``pysqlite``,
``cx_oracle``, or ``pyodbc``. The classes used by
SQLAlchemy dialects are first declared in the ``base.py``
module, which defines all the behavioral characteristics of
the database. These include capability mappings, such as
"supports sequences", "supports returning", etc., type
definitions, and SQL compilation rules. Each "driver"
module in turn provides subclasses of those classes as
needed which override the default behavior to accommodate
the additional features, behaviors, and quirks of that
DBAPI. For DBAPIs that support multiple backends (pyodbc,
zxJDBC, mxODBC), the dialect module will use mixins from the
``sqlalchemy.connectors`` package, which provide
functionality common to that DBAPI across all backends, most
typically dealing with connect arguments. This means that
connecting using pyodbc, zxJDBC or mxODBC (when implemented)
is extremely consistent across supported backends.

The URL format used by ``create_engine()`` has been enhanced
to handle any number of DBAPIs for a particular backend,
using a scheme that is inspired by that of JDBC. The
previous format still works, and will select a "default"
DBAPI implementation, such as the Postgresql URL below that
will use psycopg2:

::

    create_engine('postgresql://scott:tiger@localhost/test')

However, to specify a specific DBAPI such as pg8000, add it
to the "protocol" section of the URL using a plus sign "+":

::

    create_engine('postgresql+pg8000://scott:tiger@localhost/test')

Important Dialect Links:

* Documentation on connect arguments:
  http://www.sqlalchemy.org/docs/06/dbengine.html#create-engine-url-arguments

* Reference documentation for individual dialects:
  http://www.sqlalchemy.org/docs/06/reference/dialects/index.html

* The tips and tricks at DatabaseNotes.


Other notes regarding dialects:

* the type system has been changed dramatically in
  SQLAlchemy 0.6. This has an impact on all dialects
  regarding naming conventions, behaviors, and
  implementations. See the section on "Types" below.

* the ``ResultProxy`` object now offers a 2x speed
  improvement in some cases thanks to some refactorings.

* the ``RowProxy``, i.e. individual result row object, is
  now directly pickleable.

* the setuptools entry point used to locate external dialects
  is now called ``sqlalchemy.dialects``. An external
  dialect written against 0.4 or 0.5 will need to be
  modified to work with 0.6 in any case, so this change does
  not add any additional difficulty.

* dialects now receive an initialize() event on initial
  connection to determine connection properties.

* Functions and operators generated by the compiler now use
  (almost) regular dispatch functions of the form
  "visit_<opname>" and "visit_<funcname>_fn" to provide
  custom processing. This replaces the need to copy the
  "functions" and "operators" dictionaries in compiler
  subclasses with straightforward visitor methods, and also
  gives compiler subclasses complete control over
  rendering, as the full _Function or _BinaryExpression
  object is passed in. A sketch follows this list.
Dialect Imports
---------------

The import structure of dialects has changed. Each dialect
now exports its base "dialect" class as well as the full set
of SQL types supported on that dialect via
``sqlalchemy.dialects.<name>``. For example, to import a
set of PG types:

::

    from sqlalchemy.dialects.postgresql import INTEGER, BIGINT, SMALLINT,\
        VARCHAR, MACADDR, DATE, BYTEA

Above, ``INTEGER`` is actually the plain ``INTEGER`` type
from ``sqlalchemy.types``, but the PG dialect makes it
available in the same way as those types which are specific
to PG, such as ``BYTEA`` and ``MACADDR``.

Expression Language Changes
===========================

An Important Expression Language Gotcha
---------------------------------------

There's one quite significant behavioral change to the
expression language which may affect some applications.
The boolean value of Python boolean expressions, i.e.
``==``, ``!=``, and similar, now evaluates meaningfully
with regard to the two clause objects being compared.

As we know, comparing a ``ClauseElement`` to any other
object returns another ``ClauseElement``:

::

    >>> from sqlalchemy.sql import column
    >>> column('foo') == 5
    <sqlalchemy.sql.expression._BinaryExpression object at 0x1252490>

This is so that Python expressions produce SQL expressions
when converted to strings:

::

    >>> str(column('foo') == 5)
    'foo = :foo_1'

But what happens if we say this?

::

    >>> if column('foo') == 5:
    ...     print "yes"
    ...

In previous versions of SQLAlchemy, the returned
``_BinaryExpression`` was a plain Python object which
evaluated to ``True``. Now it evaluates to whether or not
the actual ``ClauseElement`` should have the same hash value
as the one being compared. Meaning:

::

    >>> bool(column('foo') == 5)
    False
    >>> bool(column('foo') == column('foo'))
    False
    >>> c = column('foo')
    >>> bool(c == c)
    True
    >>>

That means code such as the following:

::

    if expression:
        print "the expression is:", expression

would not evaluate as intended if ``expression`` was a
binary clause. Since the above pattern should never be used,
the base ``ClauseElement`` now raises an exception if called
in a boolean context:

::

    >>> bool(c)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      ...
      raise TypeError("Boolean value of this clause is not defined")
    TypeError: Boolean value of this clause is not defined

Code that wants to check for the presence of a
``ClauseElement`` expression should instead say:

::

    if expression is not None:
        print "the expression is:", expression

Keep in mind, **this applies to Table and Column objects
too**.

The rationale for the change is twofold:

* Comparisons of the form ``if c1 == c2: <do something>``
  can actually be written now

* Support for correct hashing of ``ClauseElement`` objects
  now works on alternate platforms, namely Jython. Up until
  this point SQLAlchemy relied heavily on the specific
  behavior of cPython in this regard (and still had
  occasional problems with it).

Stricter "executemany" Behavior
-------------------------------

An "executemany" in SQLAlchemy corresponds to a call to
``execute()``, passing along a collection of bind parameter
sets:

::

    connection.execute(table.insert(), {'data':'row1'}, {'data':'row2'}, {'data':'row3'})

When the ``Connection`` object sends off the given
``insert()`` construct for compilation, it passes to the
compiler the key names present in the first set of binds
passed along to determine the construction of the
statement's VALUES clause. Users familiar with this
construct will know that additional keys present in the
remaining dictionaries don't have any impact. What's
different now is that all subsequent dictionaries need to
include at least *every* key that is present in the first
dictionary. This means that a call like this no longer
works, because the third parameter set does not specify the
'timestamp' column:

::

    connection.execute(table.insert(),
        {'timestamp':today, 'data':'row1'},
        {'timestamp':today, 'data':'row2'},
        {'data':'row3'})

Previous versions of SQLAlchemy would simply insert NULL for
these missing columns. However, if the ``timestamp`` column
in the above example contained a Python-side default value
or function, it would *not* be used. This is because the
"executemany" operation is optimized for maximum performance
across huge numbers of parameter sets, and does not attempt
to evaluate Python-side defaults for those missing keys.
Because defaults are often implemented either as SQL
expressions embedded inline with the INSERT statement, or as
server-side expressions triggered by the structure of the
INSERT string, and neither of those can fire off
conditionally based on each parameter set, it would be
inconsistent for Python-side defaults to behave differently
from SQL/server-side defaults. (SQL expression based
defaults are embedded inline as of the 0.5 series, again to
minimize the impact of huge numbers of parameter sets).

SQLAlchemy 0.6 therefore establishes predictable consistency
by forbidding any subsequent parameter sets from leaving any
fields blank. That way, there's no more silent failure of
Python-side default values and functions, which additionally
remain consistent in their behavior versus SQL and
server-side defaults.
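To adapt the failing call above, supply the full key set in
every parameter set - a minimal sketch, where ``today``
stands in for whatever value the default would have
produced:

::

    # every dictionary carries the complete key set; the value the
    # Python-side default would have produced is passed explicitly
    connection.execute(table.insert(),
        {'timestamp':today, 'data':'row1'},
        {'timestamp':today, 'data':'row2'},
        {'timestamp':today, 'data':'row3'})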
UNION and other "compound" constructs parenthesize consistently
---------------------------------------------------------------

A rule that was designed to help SQLite has been removed:
that the first compound element within another compound
(such as a ``union()`` inside of an ``except_()``) wouldn't
be parenthesized. This was inconsistent, produced the wrong
results on Postgresql (which has precedence rules regarding
INTERSECT), and was generally a surprise. When using
complex composites with SQLite, you now need to turn the
first element into a subquery (which is also compatible on
PG). A new example is in the SQL expression tutorial at the
end of
http://www.sqlalchemy.org/docs/06/sqlexpression.html#unions-and-other-set-operations.
See :ticket:`1665` and r6690 for more background.
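A hedged sketch of the subquery workaround, assuming ``t1``,
``t2`` and ``t3`` are ``Table`` objects each with a column
``x``:

::

    from sqlalchemy import select, union, except_

    s1 = select([t1.c.x])
    s2 = select([t2.c.x])
    s3 = select([t3.c.x])

    # turning the inner compound into a subquery keeps SQLite happy,
    # and renders with correct precedence on Postgresql as well
    inner = union(s1, s2).alias().select()
    stmt = except_(inner, s3)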
C Extensions for Result Fetching
================================

The ``ResultProxy`` and related elements, including most
common "row processing" functions such as unicode
conversion, numerical/boolean conversions and date parsing,
have been re-implemented as optional C extensions for the
purposes of performance. This represents the beginning of
SQLAlchemy's path to the "dark side" where we hope to
continue improving performance by reimplementing critical
sections in C. The extensions can be built by specifying
``--with-cextensions``, i.e.
``python setup.py --with-cextensions install``.

The extensions have the most dramatic impact on result
fetching using direct ``ResultProxy`` access, i.e. that
which is returned by ``engine.execute()``,
``connection.execute()``, or ``session.execute()``. Within
results returned by an ORM ``Query`` object, result fetching
is not as high a percentage of overhead, so ORM performance
improves more modestly, and mostly in the realm of fetching
large result sets. The performance improvements highly
depend on the DBAPI in use and on the syntax used to access
the columns of each row (e.g. ``row['name']`` is much faster
than ``row.name``). The current extensions have no impact
on the speed of inserts/updates/deletes, nor do they improve
the latency of SQL execution; that is, an application that
spends most of its time executing many statements with very
small result sets will not see much improvement.

Performance has been improved in 0.6 versus 0.5 regardless
of the extensions. A quick overview of what connecting and
fetching 50,000 rows looks like with SQLite, using mostly
direct SQLite access, a ``ResultProxy``, and a simple mapped
ORM object:

::

    sqlite select/native: 0.260s

    0.6 / C extension

    sqlalchemy.sql select: 0.360s
    sqlalchemy.orm fetch: 2.500s

    0.6 / Pure Python

    sqlalchemy.sql select: 0.600s
    sqlalchemy.orm fetch: 3.000s

    0.5 / Pure Python

    sqlalchemy.sql select: 0.790s
    sqlalchemy.orm fetch: 4.030s

Above, the ORM fetches the rows 33% faster than 0.5 due to
in-Python performance enhancements. With the C extensions
we get another 20%. However, ``ResultProxy`` fetches
improve by 67% with the C extension versus without. Other
tests report as much as a 200% speed improvement for some
scenarios, such as those where lots of string conversions
are occurring.
New Schema Capabilities
=======================

The ``sqlalchemy.schema`` package has received some
long-needed attention. The most visible change is the newly
expanded DDL system. In SQLAlchemy, it was possible since
version 0.5 to create custom DDL strings and associate them
with tables or metadata objects:

::

    from sqlalchemy.schema import DDL

    DDL('CREATE TRIGGER users_trigger ...').execute_at('after-create', metadata)

Now the full suite of DDL constructs is available under the
same system, including those for CREATE TABLE, ADD
CONSTRAINT, etc.:

::

    from sqlalchemy.schema import CheckConstraint, AddConstraint

    AddConstraint(CheckConstraint("value > 5")).execute_at('after-create', mytable)

Additionally, all the DDL objects are now regular
``ClauseElement`` objects just like any other SQLAlchemy
expression object:

::

    from sqlalchemy.schema import CreateTable

    create = CreateTable(mytable)

    # dumps the CREATE TABLE as a string
    print create

    # executes the CREATE TABLE statement
    engine.execute(create)

and using the ``sqlalchemy.ext.compiler`` extension you can
make your own:

::

    from sqlalchemy.schema import DDLElement
    from sqlalchemy.ext.compiler import compiles

    class AlterColumn(DDLElement):

        def __init__(self, column, cmd):
            self.column = column
            self.cmd = cmd

    @compiles(AlterColumn)
    def visit_alter_column(element, compiler, **kw):
        return "ALTER TABLE %s ALTER COLUMN %s %s ..." % (
            element.column.table.name,
            element.column.name,
            element.cmd
        )

    engine.execute(AlterColumn(table.c.mycolumn, "SET DEFAULT 'test'"))

Deprecated/Removed Schema Elements
----------------------------------

The schema package has also been greatly streamlined. Many
options and methods which were deprecated throughout 0.5
have been removed. Other little known accessors and methods
have also been removed.

* the "owner" keyword argument is removed from ``Table``.
  Use "schema" to represent any namespaces to be prepended
  to the table name.

* the deprecated ``MetaData.connect()`` and
  ``ThreadLocalMetaData.connect()`` have been removed - use
  the "bind" attribute to bind a metadata.

* the deprecated ``MetaData.table_iterator()`` method has
  been removed (use ``sorted_tables``)

* the "metadata" argument is removed from
  ``DefaultGenerator`` and subclasses, but remains locally
  present on ``Sequence``, which is a standalone construct
  in DDL.

* ``PassiveDefault`` is deprecated - use ``DefaultClause``.

* Removed public mutability from ``Index`` and
  ``Constraint`` objects:

  * ``ForeignKeyConstraint.append_element()``

  * ``Index.append_column()``

  * ``UniqueConstraint.append_column()``

  * ``PrimaryKeyConstraint.add()``

  * ``PrimaryKeyConstraint.remove()``

These should be constructed declaratively (i.e. in one
construction), as in the sketch below.
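A minimal sketch of the declarative style, with illustrative
table, column, and constraint names:

::

    from sqlalchemy import MetaData, Table, Column, Integer, String, \
        UniqueConstraint, Index

    metadata = MetaData()

    # all columns of the constraint are given up front, in one construction
    invoice = Table('invoice', metadata,
        Column('id', Integer, primary_key=True),
        Column('ref_num', Integer),
        Column('description', String(60)),
        UniqueConstraint('ref_num', 'description', name='uq_invoice_ref')
    )

    # an Index likewise receives its columns when constructed
    Index('ix_invoice_description', invoice.c.description)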
* Other removed things:

  * ``Table.key`` (no idea what this was for)

  * ``Column.bind`` (get via column.table.bind)

  * ``Column.metadata`` (get via column.table.metadata)

  * ``Column.sequence`` (use column.default)

Other Behavioral Changes
------------------------

* ``UniqueConstraint``, ``Index``, ``PrimaryKeyConstraint``
  all accept lists of column names or column objects as
  arguments.

* The ``use_alter`` flag on ``ForeignKey`` is now a shortcut
  option for operations that can be hand-constructed using
  the ``DDL()`` event system. A side effect of this refactor
  is that ``ForeignKeyConstraint`` objects with
  ``use_alter=True`` will *not* be emitted on SQLite, which
  does not support ALTER for foreign keys. This has no
  effect on SQLite's behavior since SQLite does not actually
  honor FOREIGN KEY constraints.

* ``Table.primary_key`` is not assignable - use
  ``table.append_constraint(PrimaryKeyConstraint(...))``

* A ``Column`` definition with a ``ForeignKey`` and no type,
  e.g. ``Column(name, ForeignKey(sometable.c.somecol))``,
  used to get the type of the referenced column. Now support
  for that automatic type inference is partial and may not
  work in all cases.

Logging opened up
=================

At the expense of a few extra method calls here and there,
you can set log levels for INFO and DEBUG after an engine,
pool, or mapper has been created, and logging will commence.
The ``isEnabledFor(INFO)`` method is now called
per-``Connection`` and ``isEnabledFor(DEBUG)``
per-``ResultProxy``, if already enabled on the parent
connection. Pool logging sends to ``log.info()`` and
``log.debug()`` with no check - note that pool
checkout/checkin is typically once per transaction.
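For example, a sketch using the standard logger names
SQLAlchemy logs under - the levels can now be raised on an
engine that has already been created:

::

    import logging

    logging.basicConfig()

    # takes effect even though the engine was created earlier
    logging.getLogger('sqlalchemy.engine').setLevel(logging.INFO)  # SQL statements
    logging.getLogger('sqlalchemy.pool').setLevel(logging.DEBUG)   # checkouts/checkins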
Reflection/Inspector API
========================

The reflection system, which allows reflection of table
columns via ``Table('sometable', metadata, autoload=True)``,
has been opened up into its own fine-grained API, which
allows direct inspection of database elements such as
tables, columns, constraints, indexes, and more. This API
expresses return values as simple lists of strings,
dictionaries, and ``TypeEngine`` objects. The internals of
``autoload=True`` now build upon this system such that the
translation of raw database information into
``sqlalchemy.schema`` constructs is centralized and the
contract of individual dialects greatly simplified, vastly
reducing bugs and inconsistencies across different backends.

To use an inspector:

::

    from sqlalchemy.engine.reflection import Inspector
    insp = Inspector.from_engine(my_engine)

    print insp.get_schema_names()

The ``from_engine()`` method will in some cases provide a
backend-specific inspector with additional capabilities,
such as that of Postgresql which provides a
``get_table_oid()`` method:

::

    my_engine = create_engine('postgresql://...')
    pg_insp = Inspector.from_engine(my_engine)

    print pg_insp.get_table_oid('my_table')
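As a further illustration of the simple return types, a
sketch that walks the columns of a hypothetical table,
reusing the ``insp`` from above ('my_table' is
illustrative); each column is described by a plain
dictionary carrying a ``TypeEngine`` object:

::

    for col in insp.get_columns('my_table'):
        # 'name' and 'nullable' are plain values; 'type' is a TypeEngine
        print col['name'], col['type'], col['nullable']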
RETURNING Support
=================

The ``insert()``, ``update()`` and ``delete()`` constructs
now support a ``returning()`` method, which corresponds to
the SQL RETURNING clause as supported by Postgresql, Oracle,
MS-SQL, and Firebird. It is not supported for any other
backend at this time.

Given a list of column expressions in the same manner as
that of a ``select()`` construct, the values of these
columns will be returned as a regular result set:

::

    result = connection.execute(
        table.insert().values(data='some data').returning(table.c.id, table.c.timestamp)
    )
    row = result.first()
    print "ID:", row['id'], "Timestamp:", row['timestamp']

The implementation of RETURNING across the four supported
backends varies wildly, in the case of Oracle requiring an
intricate usage of OUT parameters which are re-routed into a
"mock" result set, and in the case of MS-SQL using an
awkward SQL syntax. The usage of RETURNING is subject to
limitations:

* it does not work for any "executemany()" style of
  execution. This is a limitation of all supported DBAPIs.

* Some backends, such as Oracle, only support RETURNING that
  returns a single row - this includes UPDATE and DELETE
  statements, meaning the update() or delete() construct
  must match only a single row, or an error is raised (by
  Oracle, not SQLAlchemy).

RETURNING is also used automatically by SQLAlchemy, when
available and when not otherwise specified by an explicit
``returning()`` call, to fetch the value of newly generated
primary key values for single-row INSERT statements. This
means there's no more "SELECT nextval(sequence)"
pre-execution for insert statements where the primary key
value is required. Truth be told, the implicit RETURNING
feature does incur more method overhead than the old "select
nextval()" system, which used a quick and dirty
cursor.execute() to get at the sequence value, and in the
case of Oracle requires additional binding of out
parameters. So if method/protocol overhead is proving to be
more expensive than additional database round trips, the
feature can be disabled by specifying
``implicit_returning=False`` to ``create_engine()``.
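A one-line sketch of disabling the feature (the URL is
illustrative):

::

    # fall back to the pre-0.6 "SELECT nextval(...)" pre-execution approach
    engine = create_engine('postgresql://scott:tiger@localhost/test',
                           implicit_returning=False)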
Type System Changes
===================

New Architecture
----------------

The type system has been completely reworked behind the
scenes to provide two goals:

* Separate the handling of bind parameters and result row
  values, typically a DBAPI requirement, from the SQL
  specification of the type itself, which is a database
  requirement. This is consistent with the overall dialect
  refactor that separates database SQL behavior from DBAPI.

* Establish a clear and consistent contract for generating
  DDL from a ``TypeEngine`` object and for constructing
  ``TypeEngine`` objects based on column reflection.

Highlights of these changes include:

* The construction of types within dialects has been totally
  overhauled. Dialects now define publicly available types
  as UPPERCASE names exclusively, and internal
  implementation types using underscore identifiers (i.e.
  are private). The system by which types are expressed in
  SQL and DDL has been moved to the compiler system. This
  has the effect that there are far fewer type objects
  within most dialects. A detailed document on this
  architecture for dialect authors is in
  [source:/lib/sqlalchemy/dialects/type_migration_guidelines.txt].

* Reflection of types now returns the exact UPPERCASE type
  within types.py, or the UPPERCASE type within the dialect
  itself if the type is not a standard SQL type. This means
  reflection now returns more accurate information about
  reflected types.

* User defined types that subclass ``TypeEngine`` and wish
  to provide ``get_col_spec()`` should now subclass
  ``UserDefinedType`` (see the sketch after this list).

* The ``result_processor()`` method on all type classes now
  accepts an additional argument ``coltype``. This is the
  DBAPI type object attached to cursor.description, and
  should be used when applicable to make better decisions on
  what kind of result-processing callable should be
  returned. Ideally result processor functions would never
  need to use ``isinstance()``, which is an expensive call
  at this level.
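A hedged sketch of a ``UserDefinedType`` subclass; the type
name and DDL string are purely illustrative:

::

    import sqlalchemy.types as types

    class MyGeometry(types.UserDefinedType):

        def get_col_spec(self):
            # the DDL emitted for this column in CREATE TABLE
            return "GEOMETRY"

        def bind_processor(self, dialect):
            def process(value):
                # convert the Python value to its database form here
                return value
            return process

        def result_processor(self, dialect, coltype):
            # 'coltype' is the DBAPI type from cursor.description, new in 0.6
            def process(value):
                return value
            return process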
Native Unicode Mode
-------------------

As more DBAPIs support returning Python unicode objects
directly, the base dialect now performs a check upon the
first connection which establishes whether or not the DBAPI
returns a Python unicode object for a basic select of a
VARCHAR value. If so, the ``String`` type and all
subclasses (i.e. ``Text``, ``Unicode``, etc.) will skip the
"unicode" check/conversion step when result rows are
received. This offers a dramatic performance increase for
large result sets. The "unicode mode" currently is known to
work with:

* sqlite3 / pysqlite

* psycopg2 - SQLA 0.6 now uses the "UNICODE" type extension
  by default on each psycopg2 connection object

* pg8000

* cx_oracle (we use an output processor - nice feature!)

Other types may choose to disable unicode processing as
needed, such as the ``NVARCHAR`` type when used with MS-SQL.

In particular, if porting an application based on a DBAPI
that formerly returned non-unicode strings, the "native
unicode" mode has a plainly different default behavior -
columns that are declared as ``String`` or ``VARCHAR`` now
return unicode by default whereas they would return strings
before. This can break code which expects non-unicode
strings. The psycopg2 "native unicode" mode can be
disabled by passing ``use_native_unicode=False`` to
``create_engine()``.

A more general solution for string columns that explicitly
do not want a unicode object is to use a ``TypeDecorator``
that converts unicode back to utf-8, or whatever is desired:

::

    import sqlalchemy as sa
    from sqlalchemy.types import TypeDecorator

    class UTF8Encoded(TypeDecorator):
        """Unicode type which coerces to utf-8."""

        impl = sa.VARCHAR

        def process_result_value(self, value, dialect):
            # encode unicode results back to a utf-8 bytestring
            if isinstance(value, unicode):
                value = value.encode('utf-8')
            return value

Note that the ``assert_unicode`` flag is now deprecated.
SQLAlchemy allows the DBAPI and backend database in use to
handle Unicode parameters when available, and does not add
operational overhead by checking the incoming type; modern
systems like sqlite and Postgresql will raise an encoding
error on their end if invalid data is passed. In those
cases where SQLAlchemy does need to coerce a bind parameter
from Python Unicode to an encoded string, or when the
Unicode type is used explicitly, a warning is raised if the
object is a bytestring. This warning can be suppressed or
converted to an exception using the Python warnings filter
documented at http://docs.python.org/library/warnings.html

Generic Enum Type
-----------------

We now have an ``Enum`` in the ``types`` module. This is a
string type that is given a collection of "labels" which
constrain the possible values to those labels. By default,
this type generates a ``VARCHAR`` using the size of the
largest label, and applies a CHECK constraint to the table
within the CREATE TABLE statement. When using MySQL, the
type by default uses MySQL's ENUM type, and when using
Postgresql the type will generate a user defined type using
``CREATE TYPE <mytype> AS ENUM``. In order to create the
type using Postgresql, the ``name`` parameter must be
specified to the constructor. The type also accepts a
``native_enum=False`` option which will issue the
VARCHAR/CHECK strategy for all databases. Note that
Postgresql ENUM types currently don't work with pg8000 or
zxjdbc.
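A short sketch (table, column and type names are
illustrative; the ``name`` parameter is what Postgresql uses
for its ``CREATE TYPE``):

::

    from sqlalchemy import MetaData, Table, Column, Integer, Enum

    metadata = MetaData()

    # VARCHAR + CHECK on most backends; native ENUM on MySQL and Postgresql
    post = Table('post', metadata,
        Column('id', Integer, primary_key=True),
        Column('status', Enum('draft', 'published', 'archived',
                              name='post_status'))
    )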
Reflection Returns Dialect-Specific Types
-----------------------------------------

Reflection now returns the most specific type possible from
the database. That is, if you create a table using
``String``, then reflect it back, the reflected column will
likely be ``VARCHAR``. For dialects that support a more
specific form of the type, that's what you'll get. So a
``Text`` type would come back as ``oracle.CLOB`` on Oracle,
a ``LargeBinary`` might be a ``mysql.MEDIUMBLOB``, etc. The
obvious advantage here is that reflection preserves as much
information as possible from what the database had to say.

Some applications that deal heavily in table metadata may
wish to compare types across reflected tables and/or
non-reflected tables. There's a semi-private accessor
available on ``TypeEngine`` called ``_type_affinity`` and an
associated comparison helper ``_compare_type_affinity``.
This accessor returns the "generic" ``types`` class which
the type corresponds to:

::

    >>> String(50)._compare_type_affinity(postgresql.VARCHAR(50))
    True
    >>> Integer()._compare_type_affinity(mysql.REAL)
    False

Miscellaneous API Changes
-------------------------

The usual "generic" types are still the general system in
use, i.e. ``String``, ``Float``, ``DateTime``. There are a
few changes there:

* Types no longer make any guesses as to default parameters.
  In particular, ``Numeric``, ``Float``, as well as
  subclasses NUMERIC, FLOAT, DECIMAL don't generate any
  length or scale unless specified. This also continues to
  include the controversial ``String`` and ``VARCHAR`` types
  (although the MySQL dialect will preemptively raise when
  asked to render VARCHAR with no length). No defaults are
  assumed, and if they are used in a CREATE TABLE statement,
  an error will be raised if the underlying database does
  not allow non-lengthed versions of these types.

* the ``Binary`` type has been renamed to ``LargeBinary``,
  for BLOB/BYTEA/similar types. For ``BINARY`` and
  ``VARBINARY``, those are present directly as
  ``types.BINARY``, ``types.VARBINARY``, as well as in the
  MySQL and MS-SQL dialects.

* ``PickleType`` now uses == for comparison of values when
  mutable=True, unless the "comparator" argument with a
  comparison function is specified to the type. If you are
  pickling a custom object you should implement an
  ``__eq__()`` method so that value-based comparisons are
  accurate.

* The default "precision" and "scale" arguments of Numeric
  and Float have been removed and now default to None.
  NUMERIC and FLOAT will be rendered with no numeric
  arguments by default unless these values are provided.

* DATE, TIME and DATETIME types on SQLite can now take
  optional "storage_format" and "regexp" arguments.
  "storage_format" can be used to store those types using a
  custom string format. "regexp" allows using a custom
  regular expression to match string values from the
  database.

* ``__legacy_microseconds__`` on SQLite ``Time`` and
  ``DateTime`` types is not supported anymore. You should
  use the new "storage_format" argument instead.

* ``DateTime`` types on SQLite now use by default a
  stricter regular expression to match strings from the
  database. Use the new "regexp" argument if you are using
  data stored in a legacy format.

ORM Changes
===========

Upgrading an ORM application from 0.5 to 0.6 should require
little to no changes, as the ORM's behavior remains almost
identical. There are some default argument and name
changes, and some loading behaviors have been improved.

New Unit of Work
----------------

The internals for the unit of work, primarily
``topological.py`` and ``unitofwork.py``, have been
completely rewritten and are vastly simplified. This
should have no impact on usage, as all existing behavior
during flush has been maintained exactly (or at least, as
far as it is exercised by our testsuite and the handful of
production environments which have tested it heavily). The
performance of flush() now uses 20-30% fewer method calls
and should also use less memory. The intent and flow of the
source code should now be reasonably easy to follow, and the
architecture of the flush is fairly open-ended at this
point, creating room for potential new areas of
sophistication. The flush process no longer has any
reliance on recursion, so flush plans of arbitrary size and
complexity can be flushed. Additionally, the mapper's
"save" process, which issues INSERT and UPDATE statements,
now caches the "compiled" form of the two statements so that
call counts are further dramatically reduced with very large
flushes.

Any changes in behavior observed with flush versus earlier
versions of 0.6 or 0.5 should be reported to us ASAP - we'll
make sure no functionality is lost.

Changes to ``query.update()`` and ``query.delete()``
----------------------------------------------------

* the 'expire' option on query.update() has been renamed to
  'fetch', thus matching that of query.delete()

* ``query.update()`` and ``query.delete()`` both default to
  'evaluate' for the synchronize strategy.

* the 'synchronize' strategy for update() and delete()
  raises an error on failure. There is no implicit fallback
  onto "fetch". Failure of evaluation is based on the
  structure of criteria, so success/failure is deterministic
  based on code structure.
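A sketch of the new spellings (the ``User`` model is
illustrative):

::

    # 'evaluate' is now the default synchronize strategy
    session.query(User).filter(User.name == 'jack').\
        update({'name': 'ed'})

    # the strategy formerly called 'expire' is now 'fetch'
    session.query(User).filter(User.id > 10).\
        delete(synchronize_session='fetch')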
``relation()`` is officially named ``relationship()``
-----------------------------------------------------

This is to solve the long running issue that "relation"
means a "table or derived table" in relational algebra
terms. The ``relation()`` name, which is less typing, will
hang around for the foreseeable future so this change should
be entirely painless.

Subquery eager loading
----------------------

A new kind of eager loading is added called "subquery"
loading. This is a load that emits a second SQL query
immediately after the first which loads full collections for
all the parents in the first query, joining upwards to the
parent using INNER JOIN. Subquery loading is used similarly
to the current joined-eager loading, using the
``subqueryload()`` and ``subqueryload_all()`` options as
well as the ``lazy='subquery'`` setting on
``relationship()``. The subquery load is usually much more
efficient for loading many larger collections as it uses
INNER JOIN unconditionally and also doesn't re-load parent
rows.
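A sketch of both spellings, assuming illustrative
``Parent``/``Child`` mappings:

::

    from sqlalchemy.orm import mapper, relationship, subqueryload

    # per-query:
    session.query(Parent).options(subqueryload(Parent.children)).all()

    # or as the default strategy on the relationship itself:
    mapper(Parent, parent_table, properties={
        'children': relationship(Child, lazy='subquery')
    })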
``eagerload()``, ``eagerload_all()`` is now ``joinedload()``, ``joinedload_all()``
----------------------------------------------------------------------------------

To make room for the new subquery load feature, the existing
``eagerload()``/``eagerload_all()`` options are now
superseded by ``joinedload()`` and ``joinedload_all()``.
The old names will hang around for the foreseeable future
just like ``relation()``.

``lazy=False|None|True|'dynamic'`` now accepts ``lazy='noload'|'joined'|'subquery'|'select'|'dynamic'``
-------------------------------------------------------------------------------------------------------

Continuing on the theme of loader strategies opened up, the
standard keywords for the ``lazy`` option on
``relationship()`` are now ``select`` for lazy loading (via
a SELECT issued on attribute access), ``joined`` for
joined-eager loading, ``subquery`` for subquery-eager
loading, ``noload`` for no loading, and ``dynamic`` for a
"dynamic" relationship. The old ``True``, ``False``,
``None`` arguments are still accepted with the identical
behavior as before.

innerjoin=True on relation, joinedload
--------------------------------------

Joined-eagerly loaded scalars and collections can now be
instructed to use INNER JOIN instead of OUTER JOIN. On
Postgresql this is observed to provide a 300-600% speedup on
some queries. Set this flag for any many-to-one which is
on a NOT NULLable foreign key, and similarly for any
collection where related items are guaranteed to exist.

At mapper level:

::

    mapper(Child, child)
    mapper(Parent, parent, properties={
        'child':relationship(Child, lazy='joined', innerjoin=True)
    })

At query time level:

::

    session.query(Parent).options(joinedload(Parent.child, innerjoin=True)).all()

The ``innerjoin=True`` flag at the ``relationship()`` level
will also take effect for any ``joinedload()`` option which
does not override the value.

Many-to-one Enhancements
------------------------

* many-to-one relations now fire off a lazyload in fewer
  cases, and in most cases will not fetch the "old" value
  when a new one is replaced.

* many-to-one relation to a joined-table subclass now uses
  get() for a simple load (known as the "use_get"
  condition), i.e. ``Related``->``Sub(Base)``, without the
  need to redefine the primaryjoin condition in terms of the
  base table. [ticket:1186]

* specifying a foreign key with a declarative column, i.e.
  ``ForeignKey(MyRelatedClass.id)``, doesn't break the
  "use_get" condition from taking place [ticket:1492]

* relationship(), joinedload(), and joinedload_all() now
  feature an option called "innerjoin". Specify ``True`` or
  ``False`` to control whether an eager join is constructed
  as an INNER or OUTER join. Default is ``False`` as always.
  The query-time options override the setting specified on
  relationship(). Should generally be set for many-to-one,
  not nullable foreign key relations to allow improved join
  performance. [ticket:1544]

* the behavior of joined eager loading such that the main
  query is wrapped in a subquery when LIMIT/OFFSET are
  present now makes an exception for the case when all eager
  loads are many-to-one joins. In those cases, the eager
  joins are against the parent table directly along with the
  limit/offset, without the extra overhead of a subquery,
  since a many-to-one join does not add rows to the result.

  For example, in 0.5 this query:

  ::

      session.query(Address).options(eagerload(Address.user)).limit(10)

  would produce SQL like:

  ::

      SELECT * FROM
        (SELECT * FROM addresses LIMIT 10) AS anon_1
        LEFT OUTER JOIN users AS users_1 ON users_1.id = anon_1.addresses_user_id

  This is because the presence of any eager loaders suggests
  that some or all of them may relate to multi-row
  collections, which would necessitate wrapping any kind of
  rowcount-sensitive modifiers like LIMIT inside of a
  subquery.

  In 0.6, that logic is more sensitive and can detect if all
  eager loaders represent many-to-ones, in which case the
  eager joins don't affect the rowcount:

  ::

      SELECT * FROM addresses LEFT OUTER JOIN users AS users_1 ON users_1.id = addresses.user_id LIMIT 10

Mutable Primary Keys with Joined Table Inheritance
--------------------------------------------------

A joined table inheritance config where the child table has
a PK that foreign keys to the parent PK can now be updated
on a CASCADE-capable database like Postgresql.
``mapper()`` now has an option ``passive_updates=True``
which indicates this foreign key is updated automatically.
If on a non-cascading database like SQLite or MySQL/MyISAM,
set this flag to ``False``. A future feature enhancement
will try to get this flag to be auto-configuring based on
dialect/table style in use.

Beaker Caching
--------------

A promising new example of Beaker integration is in
``examples/beaker_caching``. This is a straightforward
recipe which applies a Beaker cache within the
result-generation engine of ``Query``. Cache parameters are
provided via ``query.options()``, and allow full control
over the contents of the cache. SQLAlchemy 0.6 includes
improvements to the ``Session.merge()`` method to support
this and similar recipes, as well as to provide
significantly improved performance in most scenarios.

Other Changes
-------------

* the "row tuple" object returned by ``Query`` when multiple
  column/entities are selected is now picklable as well as
  higher performing.

* ``query.join()`` has been reworked to provide more
  consistent behavior and more flexibility (includes
  [ticket:1537])

* ``query.select_from()`` accepts multiple clauses to
  produce multiple comma separated entries within the FROM
  clause. Useful when selecting from multiple-homed join()
  clauses.

* the "dont_load=True" flag on ``Session.merge()`` is
  deprecated and is now "load=False".

* added "make_transient()" helper function which transforms
  a persistent/detached instance into a transient one (i.e.
  deletes the instance_key and removes the object from any
  session) [ticket:1052]

* the allow_null_pks flag on mapper() is deprecated and has
  been renamed to allow_partial_pks. It is turned "on" by
  default. This means that a row which has a non-null value
  for any of its primary key columns will be considered an
  identity. The need for this scenario typically only occurs
  when mapping to an outer join. When set to False, a PK
  that has NULLs in it will not be considered a valid
  identity - in particular this means a result row will come
  back as None (or not be filled into a collection), and new
  in 0.6 also indicates that session.merge() won't issue a
  round trip to the database for such a PK value.
  [ticket:1680]

* the mechanics of "backref" have been fully merged into the
  finer grained "back_populates" system, and take place
  entirely within the ``_generate_backref()`` method of
  ``RelationProperty``. This makes the initialization
  procedure of ``RelationProperty`` simpler and allows
  easier propagation of settings (such as from subclasses of
  ``RelationProperty``) into the reverse reference. The
  internal ``BackRef()`` is gone and ``backref()`` returns a
  plain tuple that is understood by ``RelationProperty``.

* the keys attribute of ``ResultProxy`` is now a method, so
  references to it (``result.keys``) must be changed to
  method invocations (``result.keys()``)

* ``ResultProxy.last_inserted_ids`` is now deprecated, use
  ``ResultProxy.inserted_primary_key`` instead.

Deprecated/Removed ORM Elements
-------------------------------

Most elements that were deprecated throughout 0.5 and raised
deprecation warnings have been removed (with a few
exceptions). All elements that were marked "pending
deprecation" are now deprecated and will raise a warning
upon use.

* 'transactional' flag on sessionmaker() and others is
  removed. Use 'autocommit=True' to indicate
  'transactional=False'.

* 'polymorphic_fetch' argument on mapper() is removed.
  Loading can be controlled using the 'with_polymorphic'
  option.

* 'select_table' argument on mapper() is removed. Use
  'with_polymorphic=("*", <some selectable>)' for this
  functionality.

* 'proxy' argument on synonym() is removed. This flag did
  nothing throughout 0.5, as the "proxy generation"
  behavior is now automatic.

* Passing a single list of elements to joinedload(),
  joinedload_all(), contains_eager(), lazyload(), defer(),
  and undefer() instead of multiple positional \*args is
  deprecated.

* Passing a single list of elements to query.order_by(),
  query.group_by(), query.join(), or query.outerjoin()
  instead of multiple positional \*args is deprecated.

* ``query.iterate_instances()`` is removed. Use
  ``query.instances()``.

* ``Query.query_from_parent()`` is removed. Use the
  sqlalchemy.orm.with_parent() function to produce a
  "parent" clause, or alternatively ``query.with_parent()``.

* ``query._from_self()`` is removed, use
  ``query.from_self()`` instead.

* the "comparator" argument to composite() is removed. Use
  "comparator_factory".

* ``RelationProperty._get_join()`` is removed.

* the 'echo_uow' flag on Session is removed. Use logging
  on the "sqlalchemy.orm.unitofwork" name.

* ``session.clear()`` is removed. Use
  ``session.expunge_all()``.

* ``session.save()``, ``session.update()``,
  ``session.save_or_update()`` are removed. Use
  ``session.add()`` and ``session.add_all()``.

* the "objects" flag on session.flush() remains deprecated.

* the "dont_load=True" flag on session.merge() is deprecated
  in favor of "load=False".

* ``ScopedSession.mapper`` remains deprecated. See the
  usage recipe at
  http://www.sqlalchemy.org/trac/wiki/UsageRecipes/SessionAwareMapper

* passing an ``InstanceState`` (internal SQLAlchemy state
  object) to ``attributes.init_collection()`` or
  ``attributes.get_history()`` is deprecated. These
  functions are public API and normally expect a regular
  mapped object instance.

* the 'engine' parameter to ``declarative_base()`` is
  removed. Use the 'bind' keyword argument.
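A before/after sketch of the most common renames in this
list (``obj`` stands in for any mapped instance):

::

    from sqlalchemy.orm import sessionmaker

    # 0.5: sessionmaker(transactional=False); the 0.6 spelling:
    Session = sessionmaker(autocommit=True)
    session = Session()

    session.add(obj)                # replaces session.save() / save_or_update()
    session.expunge_all()           # replaces session.clear()
    session.merge(obj, load=False)  # replaces merge(obj, dont_load=True)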
Extensions
==========

SQLSoup
-------

SQLSoup has been modernized and updated to reflect common
0.5/0.6 capabilities, including well defined session
integration. Please read the new docs at
http://www.sqlalchemy.org/docs/06/reference/ext/sqlsoup.html.

Declarative
-----------

The ``DeclarativeMeta`` (default metaclass for
``declarative_base``) previously allowed subclasses to
modify ``dict_`` to add class attributes (e.g. columns).
This no longer works; the ``DeclarativeMeta`` constructor
now ignores ``dict_``. Instead, the class attributes should
be assigned directly, e.g. ``cls.id=Column(...)``, or the
`MixIn class
<http://www.sqlalchemy.org/docs/reference/ext/declarative.html#mix-in-classes>`_
approach should be used instead of the metaclass approach.
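A minimal sketch of the mix-in approach, with illustrative
class names:

::

    from sqlalchemy import Column, Integer, String
    from sqlalchemy.ext.declarative import declarative_base

    Base = declarative_base()

    class IdMixin(object):
        # the mix-in supplies a column to every class that inherits it
        id = Column(Integer, primary_key=True)

    class User(IdMixin, Base):
        __tablename__ = 'user'
        name = Column(String(50))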