summaryrefslogtreecommitdiff
path: root/doc/source/historical/ProjectProposal.txt
blob: e270800dfcad081f87924f61fededa4b4662cb1e (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
Evan Rosson

Project
---
SQLAlchemy Schema Migration


Synopsis
---
SQLAlchemy is an excellent object-relational database mapper for Python projects. Currently, it does a fine job of creating a database from scratch, but provides no tool to assist the user in modifying an existing database. This project aims to provide such a tool.


Benefits
---
 Application requirements change; a database schema must be able to change with them. It's possible to write SQL scripts that make the proper modifications without any special tools, but this setup quickly becomes difficult to manage - when we need to apply multiple updates to a database, organize old migration scripts, or have a single application support more than one DBMS, a tool to support database changes becomes necessary. This tool will aid the creation of organizing migration scripts, applying multiple updates or removing updates to revert to an old version, and creating DBMS-independent migration scripts.

 Writing one's schema migration scripts by hand often results in problems when dealing with multiple obsolete database instances - we must figure out what scripts are necessary to bring the database up-to-date. Database versioning tools are helpful for this task; this project will track the version of a particular database to determine what scripts are necessary to update an old schema.


Description
---
 The migration system used by Ruby on Rails has had much success, and for good reason - the system is easy to understand, generally database-independent, as powerful as the application itself, and capable of dealing nicely with a schema with multiple instances of different versions. A migration system similar to that of Rails is a fine place to begin this project.

 Each instance of the schema will have a version associated with it; this version is tracked using a single table with a single row and a single integer column. A set of changes to the database schema will increment the schema's version number; each migration script will be associated with a schema version.

 A migration script will be written by the user, and consist of two functions:
- upgrade(): brings an old database up-to-date, from version n-1 to version n
- downgrade(): reverts an up-to-date database to the previous schema; an 'undo' for upgrade()

 When applying multiple updates to an old schema instance, migration scripts are applied in sequence: when updating a schema to version n from version n-2, two migration scripts are run; n-2 => n-1 => n.

 A command-line tool will create empty migration scripts (empty upgrade()/downgrade() functions), display the SQL that will be generated by a migration script for a particular DBMS, and apply migration scripts to a specified database.

 This project will implement the command-line tool that manages the above functionality. This project will also extend SQLAlchemy with the functions necessary to construct DBMS-independent migration scripts: in particular, column creation/deletion/alteration and the ability to rename existing tables/indexes/columns will be implemented. We'll also need a way to write raw SQL for a specific DBMS/set of DBMSes for situations where our abstraction doesn't fit a script's requirements. The creation/deletion of existing tables and indexes are operations already provided by SQLAlchemy.


 On DBMS support - I intend to support MySQL, Postgres, SQLite, Oracle, and MS-SQL by the end of the project. (Update: I previously omitted support for Oracle and MS-SQL because I don't have access to the full version of each; I wasn't aware Oracle Lite and MS-SQL Express were available for free.) The system will be abstracted in such a way that adding support for other databases will not be any more difficult than adding support for them in SQLAlchemy.


Schedule
---
This project will be my primary activity this summer. Unfortunately, I am in school when things begin, until June 9, but I can still begin the project during that period. I have no other commitments this summer - I can easily make up any lost time.
I will be spending my spare time this summer further developing my online game (discussed below), but this has no deadline and will not interfere with the project proposed here.


I'll begin by familiarizing myself with the internals of SQLAlchemy and creating a detailed plan for the project. This plan will be reviewed by the current SQLAlchemy developers and other potential users, and will be modified based on their feedback. This will be completed no later than May 30, one week after SoC begins.

Development will follow, in this order:
- The database versioning system. This will manage the creation and application of (initially empty) migration scripts. Complete by June 16.
 - Access the database; read/update the schema's version number
 - Apply a single (empty) script to the database
 - Apply a set of (empty) scripts to upgrade/downgrade the database to a specified version; examine all migration scripts and apply all to update the database to the latest version available
- An API for table/column alterations, to make the above system useful. Complete by August 11.
 - Implement an empty API - does nothing at this point, but written in such a way that syntax for each supported DBMS may be added as a module. Completed June 26-30, the mid-project review deadline.
 - Implement/test the above API for a single DBMS (probably Postgres, as I'm familiar with it). Users should be able to test the 'complete' application with this DBMS.
 - Implement the database modification API for other supported databases

All development will have unit tests written where appropriate. Unit testing the SQL generated for each DBMS will be particularly important.

The project will finish with various wrap-up activities, documentation, and some final tests, to be completed by the project deadline.


About me
---
I am a 3rd year BS Computer Science student; Cal Poly, San Luis Obispo, California, USA; currently applying for a Master's degree in CS from the same school. I've taken several classes dealing with databases, though much of what I know on the subject is self-taught. Outside of class, I've developed a browser-based online game, Zeal, at http://zealgame.com ; it has been running for well over a year and gone through many changes. It has taught me firsthand the importance of using appropriate tools and designing one's application well early on (largely through the pain that follows when you don't); I've learned a great many other things from the experience as well.

One recurring problem I've had with this project is dealing with changes to the database schema. I've thought much about how I'd like to see this solved, but hadn't done much to implement it.

I'm now working on another project that will be making use of SQLAlchemy: it fits many of my project's requirements, but lacks a migration tool that will be much needed. This presents an opportunity for me to make my first contribution to open source - I've long been interested in open source software and use it regularly, but haven't contributed to any until now. I'm particularly interested in the application of this tool with the TurboGears framework, as this project was inspired by a suggestion the TurboGears mailing list and I'm working on a project using TurboGears - but there is no reason to couple an SQLAlchemy enhancement with TurboGears; this project may be used by anyone who uses SQLAlchemy.


Further information: 
http://evan.zealgame.com/soc