Diffstat (limited to 'doc')
| -rw-r--r-- | doc/build/content/adv_datamapping.txt | 48 |
1 files changed, 48 insertions, 0 deletions
diff --git a/doc/build/content/adv_datamapping.txt b/doc/build/content/adv_datamapping.txt
index b16b1853d..50e8980d5 100644
--- a/doc/build/content/adv_datamapping.txt
+++ b/doc/build/content/adv_datamapping.txt
@@ -251,6 +251,54 @@ Deferred columns can be placed into groups so that they load together:
         'photo3' : deferred(book_excerpts.c.photo3, group='photos')
     })
 
+#### Working with Large Collections
+
+(requires some bugfixes released as of version 0.3.3)
+
+SQLAlchemy relations are generally simplistic; the lazy loader loads the full list of child objects when accessed, and the eager loader builds a query that loads the full list of child objects. Additionally, when you delete a parent object, SQLAlchemy ensures that it has loaded the full list of child objects so that it can mark them as deleted as well (or update their parent foreign key to NULL). It does not issue an en-masse "delete from table where parent_id=?" type of statement in such a scenario. This is because the child objects themselves may have further dependencies, and may also exist in the current session, in which case SA needs to know their identity so that their state can be properly updated.
+
+Several techniques can be used individually or in combination to address these issues, in the context of a large collection where you normally would not want to load the full list of related objects:
+
+* Use `lazy=None` to disable child object loading (i.e. noload)
+
+        {python}
+        mapper(MyClass, table, properties={
+            'children':relation(MyObj, lazy=None)
+        })
+
+* To load child objects, just use a query:
+
+        {python}
+        class Organization(object):
+            def __init__(self, name):
+                self.name = name
+            def find_members(self, criterion):
+                """locate a subset of the members associated with this Organization"""
+                return object_session(self).query(Member).select(and_(member_table.c.name.like(criterion), org_table.c.org_id==self.org_id), from_obj=[org_table.join(member_table)])
+
+* Use `passive_deletes=True` to disable child object loading on a DELETE operation (noload also accomplishes this)
+* Use "ON DELETE (CASCADE|SET NULL)" on your database to automatically cascade deletes to child objects (this is the best way to do it)
+
+        {python}
+        mytable = Table('sometable', meta,
+            Column('id', Integer, primary_key=True),
+            Column('parent_id', Integer),
+            ForeignKeyConstraint(['parent_id'],['parenttable.id'], ondelete="CASCADE"),
+        )
+
+* Alternatively, you can create a simple `MapperExtension` that will issue a DELETE for child objects:
+
+        {python}
+        class DeleteMemberExt(MapperExtension):
+            def before_delete(self, mapper, connection, instance):
+                connection.execute(member_table.delete(member_table.c.org_id==instance.org_id))
+
+        mapper(Organization, org_table, extension=DeleteMemberExt(), properties = {
+            'members' : relation(Member, lazy=None, passive_deletes=True, cascade="all, delete-orphan")
+        })
+
+The latest distribution includes an example `examples/collection/large_collection.py` which illustrates most of these techniques.
+
 #### Relation Options {@name=relationoptions}
 
 Keyword options to the `relation` function include:
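The "ON DELETE CASCADE" technique described in the added documentation can be checked at the database level, independent of any mapper. The following is a minimal sketch using only Python's standard `sqlite3` module rather than SQLAlchemy itself. The function name `cascade_delete_demo` and the row count are illustrative assumptions; the table names mirror the ones in the documentation's example. It shows that one DELETE on the parent row removes all child rows without the application ever loading them.

```python
import sqlite3


def cascade_delete_demo(child_count=1000):
    """Return (children_before, children_after) around a parent DELETE."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE parenttable (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE sometable ("
        " id INTEGER PRIMARY KEY,"
        " parent_id INTEGER REFERENCES parenttable(id) ON DELETE CASCADE)"
    )
    conn.execute("INSERT INTO parenttable (id) VALUES (1)")
    conn.executemany(
        "INSERT INTO sometable (parent_id) VALUES (?)",
        [(1,)] * child_count,
    )
    before = conn.execute("SELECT COUNT(*) FROM sometable").fetchone()[0]
    # A single statement on the parent; the database removes the children.
    conn.execute("DELETE FROM parenttable WHERE id = 1")
    after = conn.execute("SELECT COUNT(*) FROM sometable").fetchone()[0]
    conn.close()
    return before, after
```

With the cascade handled by the database, a mapper configured with `passive_deletes=True` has no reason to load the collection before deleting the parent, which is the combination the documentation recommends for large collections.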
