Before we install SQLAlchemy, let's make sure you have the latest version of setuptools, which can make your life a lot easier. At a command prompt, run:
easy_install --help
If you get a list of commands and options, great: setuptools is installed and
working on your system, and you can skip to the next paragraph. If you get an
error message like "command not found", then you'll need to install
setuptools. Download ez_setup.py and run it. (If you're on Linux or OS X,
you may need to run it as root by doing "sudo python ez_setup.py
").
Now that you have setuptools installed, you can install SQLAlchemy by running:
easy_install SQLAlchemy
On Linux and OS X, you may need to run this as root ("sudo easy_install
SQLAlchemy
"). This will automatically connect to the Python Package
Index, find the latest version of SQLAlchemy, download it, and set it
up in your site-packages directory.
You'll also need the latest version of pysqlite installed. Run "easy_install
pysqlite
" (with sudo
if necessary) to fetch it. On Windows, that's all you
need; on Mac OS X or Linux, you'll also need the sqlite3 package installed.
The exact details of installing a package will vary from system to system:
just make sure it's sqlite3 you're installing, and not sqlite or sqlite2.
Once you have everything installed, you're ready to begin!
Let's start with a simple SQLAlchemy program. Copy this into a text editor and
save it as "firststeps.py
":
from sqlalchemy import *
db = create_engine('sqlite', opts={'filename': 'tutorial.db'})
db.echo = False # Try changing this to True and see what happens
users = Table('users', db,
Column('user_id', Integer, primary_key=True),
Column('name', String(40)),
Column('age', Integer),
Column('password', String),
)
users.create()
i = users.insert()
i.execute(name='Mary', age=30, password='secret')
i.execute({'name': 'John', 'age': 42},
{'name': 'Susan', 'age': 57},
{'name': 'Carl', 'age': 33})
s = users.select()
rs = s.execute()
row = rs.fetchone()
print 'Id:', row[0]
print 'Name:', row['name']
print 'Age:', row.age
print 'Password:', row[users.c.password]
for row in rs:
print row.name, 'is', row.age, 'years old'
This code sample shows off a lot of the features of SQLAlchemy. Let's go through it step-by-step.
from sqlalchemy import *
db = create_engine('sqlite', opts={'filename': 'tutorial.db'})
The first step in writing SQLAlchemy code is to open a connection to the
database you'll be using. In SQLAlchemy, this is done by creating an
SQLEngine
object, which knows how to talk to one particular type of database
(SQLite, PostgreSQL, Firebird, MySQL, Oracle...). The SQLEngine
object also
doubles as a connection object. Behind the scenes, it will create a pool of
database connections and re-use them automatically as needed, to keep your
application running quickly. But most of the time, you don't even need to
think about the database connections; just use the SQLEngine
object that
create_engine()
returns, and let SQLAlchemy handle the details for you.
users = Table('users', db,
Column('user_id', Integer, primary_key=True),
Column('name', String(40)),
Column('age', Integer),
Column('password', String),
)
users.create()
This should be pretty self-explanatory: we've just created a users
table
in our database, with four columns. If the users
table already existed, we
could instead have done:
users = Table('users', db, autoload=True)
and SQLAlchemy would have automatically figured out the table's structure from the database.
i = users.insert()
i.execute(name='Mary', age=30, password='secret')
i.execute({'name': 'John', 'age': 42},
{'name': 'Susan', 'age': 57},
{'name': 'Carl', 'age': 33})
SQLAlchemy's SQL construction methods are beautiful. You'll almost never need
to write SQL by hand. Instead, you create an "SQL statement object", build the
SQL query you want, and call its execute()
method. Here we ask for an
INSERT
statement referencing the users
table:
i = users.insert()
Now when we do i.execute()
, SQLAlchemy will generate the appropriate "INSERT
INTO users VALUES (...)
" statement for the values we pass into execute()
.
Notice the two different ways of executing an INSERT
statement. We can
either pass it keyword parameters, to insert a single object:
i.execute(name='Mary', age=30, password='secret')
or else we can pass it multiple dictionaries, to insert multiple objects:
i.execute({'name': 'John', 'age': 42},
{'name': 'Susan', 'age': 57},
{'name': 'Carl', 'age': 33})
If you have any "special" characters (such as semicolons or apostrophes) in your data, they will be automatically quoted for you by the SQLEngine object, so you don't have to worry about quoting. This also means that unless you deliberately bypass SQLAlchemy's quoting mechanisms, SQL-injection attacks are basically impossible.
You may have also noticed that we didn't have to specify all the columns of
the database. Any columns we didn't specify will get filled with NULL
,
except for the primary key, which will automatically get a unique value.
s = users.select()
rs = s.execute()
Like INSERT
statements, SELECT
statements are also done by creating a
statement object and calling its execute()
method. This particular statement
is the basic "SELECT * FROM users
" with no WHERE
clause. Later on we'll
see how to do WHERE
clauses, or only return data from certain columns, or
JOIN
two (or more) tables together.
Calling execute()
on a SELECT
statement object will return a result set,
which has fetchone()
and fetchall()
methods. As you'd expect, fetchone()
returns a single row, while fetchall()
returns a list of rows. The rows
returned aren't simple tuples or dictionaries, but intelligent row objects, as
can be seen below:
row = rs.fetchone()
print 'Id:', row[0]
print 'Name:', row['name']
print 'Age:', row.age
print 'Password:', row[users.c.password]
Here we see some of the various ways you can access the data in a row object.
First, you can pretend it's a tuple and access its columns by position.
SQLAlchemy guarantees that the column order returned from a "SELECT * FROM
(table)
" statement will be the same as the order in which the columns were
declared in that table, so here we know that row[0]
will be the user_id
column. We can also access the row as if it were a dictionary (row['name']
).
Next, my favorite: SQLAlchemy lets us access the columns as if they were
attributes of the row object. (Some simple __getattr__()
magic behind the
scenes makes this work). And finally, we can even use the actual Column
objects themselves as keys to lookup results from a row. You probably won't
use this very often, but it can be extremely useful in some circumstances.
for row in rs:
print row.name, 'is', row.age, 'years old'
Finally, we see that we can also iterate through the result set via a simple
for
loop. This is especially useful when you expect your SELECT
query to
return a huge result set that would be too large to load into memory: the
for
loop will only fetch one row at a time from the database.
Now let's take a little time to examine some of the various ways in which we
can select rows from our database. There are lots of conditions you might want
to put in the WHERE
clause of a SELECT
statement, and SQLAlchemy makes
most of those easy.
Copy the following code into "selectdemo.py
":
from sqlalchemy import *
# Let's re-use the same database as before
db = create_engine('sqlite', opts={'filename': 'tutorial.db'})
db.echo = True # We want to see the SQL we're creating
# The users table already exists, so no need to redefine it. Just
# load it from the database using the "autoload" feature.
users = Table('users', db, autoload=True)
def run(stmt):
rs = stmt.execute()
for row in rs:
print row
# Most WHERE clauses can be constructed via normal comparisons
s = users.select(users.c.name == 'John')
run(s)
s = users.select(users.c.age < 40)
run(s)
# Python keywords like "and", "or", and "not" can't be overloaded, so
# SQLAlchemy uses functions instead
s = users.select(and_(users.c.age < 40, users.c.name != 'Mary'))
run(s)
s = users.select(or_(users.c.age < 40, users.c.name != 'Mary'))
run(s)
s = users.select(not_(users.c.name == 'Susan'))
run(s)
# Or you could use &, | and ~ -- but watch out for priority!
s = users.select((users.c.age < 40) & (users.c.name != 'Mary'))
run(s)
s = users.select((users.c.age < 40) | (users.c.name != 'Mary'))
run(s)
s = users.select(~(users.c.name == 'Susan'))
run(s)
# There's other functions too, such as "like", "startswith", "endswith"
s = users.select(users.c.name.startswith('M'))
run(s)
s = users.select(users.c.name.like('%a%'))
run(s)
s = users.select(users.c.name.endswith('n'))
run(s)
# The "in" and "between" operations are also available
s = users.select(users.c.name.in_('Mary', 'Susan'))
run(s)
s = users.select(users.c.age.between(30,39)) # This might fail
run(s)
s = users.select(between_(users.c.age, 30, 39)) # This will work
run(s)
# If you want to call an SQL function, use "func"
s = users.select(func.substr(users.c.name, 2, 1) == 'a')
run(s)
# You don't have to call select() on a table; it's got a bare form
s = select([users], users.c.name != 'Carl')
run(s)
# This can be handy for things like count()
s = select([func.count(users.c.user_id)])
run(s)
Most of this should be pretty self-explanatory, and should give you an idea of what's possible. A few notes:
# Or you could use &, | and ~ -- but watch out for priority!
s = users.select((users.c.age < 40) & (users.c.name != 'Mary'))
run(s)
s = users.select((users.c.age < 40) | (users.c.name != 'Mary'))
run(s)
s = users.select(~(users.c.name == 'Susan'))
run(s)
In Python, the &
(and), |
(or), and ~
(not) operators have a higher
priority than comparison operators like ==
and !=
. So if you want to use
the &
, |
and ~
operators, you have to be careful to wrap the other
clauses in parentheses. If you forget the parentheses, you'll be surprised by
the results:
s = users.select(users.c.age < 40 & users.c.name != 'Mary')
will be interpreted as:
s = users.select(users.c.age < (40 & users.c.name) != 'Mary')
which will almost certainly not return the results you were expecting.
Another note:
s = users.select(users.c.age.between(30,39)) # This might fail
run(s)
s = users.select(between_(users.c.age, 30, 39)) # This will work
run(s)
The latest SQLAlchemy release as I write this is 0.1.6, in which the first
statement fails because the between()
function is not available on table
columns. This is an oversight, and the next SQLAlchemy release will fix this.
If you're running an SVN checkout of SQLAlchemy, this bugfix is also
available. But if you're running the 0.1.6 release or earlier, you won't be
able to use between()
on table columns, and you'll need to use the second
form shown above. (With the final underscore at the end of between_()
.)
Finally:
# This can be handy for things like count()
s = select([func.count(users.c.user_id)])
run(s)
There doesn't seem to be a way to do COUNT(*)
-- but doing
COUNT(some_column) will return the same results under most circumstances.
At this point, you're probably wondering about using multiple tables in a
single select()
statement. Wonder no more. Copy the following into
"joindemo.py
":
from sqlalchemy import *
db = create_engine('sqlite', opts={'filename': 'joindemo.db'})
db.echo = True
users = Table('users', db,
Column('user_id', Integer, primary_key=True),
Column('name', String(40)),
Column('age', Integer),
)
users.create()
emails = Table('emails', db,
Column('email_id', Integer, primary_key=True),
Column('address', String),
Column('user_id', Integer, ForeignKey('users.user_id')),
)
emails.create()
i = users.insert()
i.execute(
{'name': 'Mary', 'age': 30},
{'name': 'John', 'age': 42},
{'name': 'Susan', 'age': 57},
{'name': 'Carl', 'age': 33}
)
i = emails.insert()
i.execute(
# There's a better way to do this, but we haven't gotten there yet
{'address': 'mary@example.com', 'user_id': 1},
{'address': 'john@nowhere.net', 'user_id': 2},
{'address': 'john@example.org', 'user_id': 2},
{'address': 'carl@nospam.net', 'user_id': 4},
)
def run(stmt):
rs = stmt.execute()
for row in rs:
print row
# This will return more results than you are probably expecting.
s = select([users, emails])
run(s)
# The reason is because you specified no WHERE clause, so a full join was
# performed, which returns every possible combination of records from
# tables A and B. With an appropriate WHERE clause, you'll get the
# restricted record set you really wanted.
s = select([users, emails], emails.c.user_id == users.c.user_id)
run(s)
# If you're interested in only a few columns, then specify them explicitly
s = select([users.c.name, emails.c.address],
emails.c.user_id == users.c.user_id)
run(s)
# There are also "smart" join objects that can figure out the correct join
# conditions based on the tables' foreign keys
s = join(users, emails).select()
run(s)
# If you want all the users, whether or not they have an email address,
# then you want an "outer" join.
s = outerjoin(users, emails).select()
run(s)
# Order of outer joins is important! Default is a "left outer join", which
# means "all records from the left-hand table, plus their corresponding
# values from the right-hand table, if any". Notice how this time, Susan's
# name will *not* appear in the results.
s = outerjoin(emails, users).select()
run(s)
That's enough for a taste. More information can be found in the SQL Construction section of the SQLAlchemy documentation. Now let's move on to the really interesting part: mapping your data objects to SQL database rows.
Now for the really interesting part: mapping your objects onto the database.
In other object-relational mappers such as SQLObject, the table definition
also doubles as the class whose instances are rows of data from the table.
SQLAlchemy, on the other hand, makes a strict distinction between the table
definition and the data class. You first create the table definition, then
create an (empty) class definition that will hold your data objects, and then
create a mapper that will map that class onto the database. It's perhaps
easier to show how this works than to explain it. Copy the following into
"mapper1.py
":
from sqlalchemy import *
db = create_engine('sqlite', opts={'filename': 'joindemo.db'})
db.echo = True
users = Table('users', db, autoload=True)
emails = Table('emails', db, autoload=True)
# These are the empty classes that will become our data classes
class User(object):
pass
class Email(object):
pass
usermapper = mapper(User, users)
emailmapper = mapper(Email, emails)
mary = usermapper.get_by(name='Mary')
mary.age += 1
objectstore.commit()
Let's break this down piece-by-piece to see what's going on here.
from sqlalchemy import *
db = create_engine('sqlite', opts={'filename': 'joindemo.db'})
db.echo = True
Here we're using the same SQLite database that we created in the join demo, which contains users and email addresses. Unless you've deleted that file, the data (four users and four email addresses) should still be there as well. We set echo to True so that the SQL will be printed out at each step.
users = Table('users', db, autoload=True)
emails = Table('emails', db, autoload=True)
Because the users
and emails
tables are already in the database, we don't
have to specify them again; we can just let SQLAlchemy fetch their definitions
from the database.
# These are the empty classes that will become our data classes
class User(object):
pass
class Email(object):
pass
Note that your data classes must be new-style classes (e.g., derived from
the base class object
). If they aren't, SQLAlchemy will raise an
ArgumentError
exception when you try to create the mapper. If you don't know
what the difference between new-style classes and old-style classes is, don't
worry about it; just get in the habit of deriving all your classes either from
object
or from another base class that descends from object
. Most of
Python's object-oriented features work much better on classes that ultimately
derive from object
, and SQLAlchemy makes heavy use of those features.
Moving on:
usermapper = mapper(User, users)
emailmapper = mapper(Email, emails)
This is where all the magic happens. The mapper()
function takes a minimum
of two parameters: first the data class to modify, and then the table object
onto which that data class should be mapped. The data class will have
attributes automatically added to it that correspond to the columns of your
database table. Thus, the User class now has User.user_id
, User.name
,
User.age
, and User.password
. The Email class now has Email.email_id
,
Email.address
, and Email.user_id
.
mary = usermapper.get_by(name='Mary')
In addition to creating attributes on the data classes, the mapper()
function call also returns a mapper object, which has some features of its
own. For example, it has select_by()
and get_by()
methods, which are
convenience methods for doing a SELECT
statement. select_by()
will return
a list of results, while get_by()
will return exactly one result.
mary.age += 1
Now that we have an instance of the data class (in the object "mary
"), we
can manipulate its attributes just like a normal object. SQLAlchemy will keep
track of the changes we make, but won't actually send them to the database
right away. To send our changes to the database, we need to ...
objectstore.commit()
... call the objectstore.commit()
method. This will take all the changes
we've made to our data objects, and "flush" those changes out to the database.
If we've made multiple changes, some of which depend on other changes (e.g.,
adding a new user and several email addresses for that user), SQLAlchemy is
smart enough to write the SQL statements in the correct order.
Note that the objectstore.commit()
method should not be confused with a
database transaction. Despite the "commit()
" name, this function call does
not start a transaction, nor will it commit a transaction if you've already
opened one. In a future release of SQLAlchemy, this method's name will be
changed to "flush()
" instead of "commit()
" to reduce confusion. Meanwhile,
you just need to remember that objectstore.commit()
has nothing to do with
transactions.
If you want to use a transaction, by the way, you should call the begin()
method on your SQLEngine
object. In our case, we called that object db
, so
to start a transaction, we'd call db.begin()
. Then db.rollback()
and
db.commit()
would either rollback or commit the transaction, respectively.
There's more information on transactions in the Database Engines section
of the SQLAlchemy documentation. Right now, let's move on to the next section
of the tutorial, where we'll examine more features of SQLAlchemy's mapper()
function.
Remember how, in the joindemo.py
tutorial, one of the comments said:
# There's a better way to do this, but we haven't gotten there yet
Well, now's the time to look at the better way to handle one-to-many and
many-to-many relations. Copy the following into "mapper2.py
":
from sqlalchemy import *
db = create_engine('sqlite', opts={'filename': 'joindemo.db'})
db.echo = True
users = Table('users', db, autoload=True)
emails = Table('emails', db, autoload=True)
# Let's give our User and Email classes a little more smarts
class User(object):
def __init__(self, name=None, age=None, password=None):
self.name = name
self.age = age
self.password = password
def __repr__(self):
return self.name
class Email(object):
def __init__(self, address=None):
self.address = address
def __repr__(self):
return self.address
# Here we look at several alternate ways to do the same thing. Try
# running this program multiple times, enabling a different one of
# these code blocks each time.
if False:
# We create the Email mapper first...
Email.mapper = mapper(Email, emails)
# ... so that we can use it in the User mapper
User.mapper = mapper(User, users, properties={
'emails': relation(Email.mapper),
})
mary = User.mapper.get_by(name='Mary')
print mary.emails
elif False:
# What if we also want a "user" property on the Email class?
# Create the Email mapper first
Email.mapper = mapper(Email, emails)
# Then reference it in the User mapper
User.mapper = mapper(User, users, properties={
'emails': relation(Email.mapper),
})
# And now add a property to the original Email mapper
Email.mapper.add_property('user', relation(User.mapper))
john = User.mapper.get_by(name='John')
print john.emails
carl_address = Email.mapper.get_by(address='carl@nospam.net')
print carl_address.user
elif False:
# There's a handy "backref" feature that will do the above for you
Email.mapper = mapper(Email, emails)
User.mapper = mapper(User, users, properties={
'emails': relation(Email.mapper, backref='user'),
})
# No need to call add_property(), it's already been done
john = User.mapper.get_by(name='John')
print john.emails
carl_address = Email.mapper.get_by(address='carl@nospam.net')
print carl_address.user
elif False:
# Order doesn't actually matter; relation() is quite clever
User.mapper = mapper(User, users)
Email.mapper = mapper(Email, emails, properties={
'user': relation(User.mapper, backref='emails'),
})
john = User.mapper.get_by(name='John')
print john.emails
carl_address = Email.mapper.get_by(address='carl@nospam.net')
print carl_address.user
elif True:
# Finally, let's demonstrate some other clever features
Email.mapper = mapper(Email, emails)
User.mapper = mapper(User, users, properties={
'emails': relation(Email.mapper, backref='user'),
})
# ForeignKey references are automatically mapped to the appropriate
# data class
harry = User(name='Harry', age=47)
em1 = Email('harry@nowhere.com')
em2 = Email('harry@example.org')
em1.user = harry
harry.emails.append(em2)
print em2.user
print harry.emails
# If a relation has been defined, then get_by and select_by calls
# can do the correct joins automatically
print User.mapper.get_by(address='mary@example.com')
print Email.mapper.select_by(age=42)
# This will only work if the column you're looking for is *not*
# present in the "original" class, but is present in one of its
# relations. For example, the following does *not* do a join to the
# User table, but gets the user_id value from the Email table. Notice
# the difference in the SQL that's printed.
print Email.mapper.select_by(user_id=2)
Most of this is pretty self-explanatory as well. A couple items are worth noting:
Email.mapper = mapper(Email, emails)
Instead of using names like usermapper
and emailmapper
as we did in
mapper1.py
, this time we follow SQLAlchemy convention and store the mapper
object in the data class, as User.mapper
and Email.mapper
respectively.
That way every data object will have a reference to the mapper object that
controls it. It doesn't make much difference in this example, but it can make
a lot of difference in certain circumstances.
There's another helper function that I didn't show in the example above: the
assign_mapper()
function. It creates the mapper
attribute as per
convention, and also takes the get_by()
, select_by()
, and other
mapper-based functions and makes them attributes of the data class. This lets
you do:
assign_mapper(Email, emails)
print Email.select_by(name='John')
# Equivalent to:
print Email.mapper.select_by(name='John')
assign_mapper()
takes all the same parameters that mapper()
takes, so our
mapper-assignment code could have looked like:
assign_mapper(Email, emails)
assign_mapper(User, users, properties={
'emails': relation(Email.mapper, backref='user'),
})
There's one more item to cover. We've looked at one-to-many relationships, but
we also need to look at many-to-many relationships. As you probably know
already, many-to-many relationships in databases are handled by a third table
that holds information about the relation. E.g., if you have an "articles
"
table and a "keywords
" table, and you want to be able to associate keywords
with articles, you'd need a many-to-many relationship. One-to-many wouldn't
work, because one article might need to be tagged with several different
keywords, and the same keyword might be used to tag several articles; so this
is a classic many-to-many relationship. Thus, you'd use a third table with
just two columns, "article_id
" and "keyword_id
", to keep track of the
associations. By convention, such a table is usually named with the names of
the two tables it references, separated by an underscore. Thus, the table
structure in the following example (call it "manytomany.py
"):
from sqlalchemy import *
db = create_engine('sqlite', {'filename': 'keywords.db'})
db.echo = True
articles = Table('articles', db,
Column('article_id', Integer, primary_key = True),
Column('headline', String(150)),
Column('body', String),
)
articles.create()
keywords = Table('keywords', db,
Column('keyword_id', Integer, primary_key = True),
Column('keyword_name', String(50)),
)
keywords.create()
association = Table('articles_keywords', db,
Column('keyword_id', Integer, ForeignKey('articles.article_id')),
Column('article_id', Integer, ForeignKey('keywords.keyword_id')),
)
association.create()
class Article(object):
def __init__(self, headline=None, body=None):
self.headline = headline
self.body = body
def __repr__(self):
return 'Article %d: "%s"' % (self.article_id, self.headline)
class Keyword(object):
def __init__(self, name=None):
self.keyword_name = name
def __repr__(self):
return self.keyword_name
# To create a many-to-many relation, specify the association table as
# the second parameter to mapper()
Article.mapper = mapper(Article, articles)
Keyword.mapper = mapper(Keyword, keywords, properties = {
'articles': relation(Article.mapper, association, backref='keywords'),
})
a1 = Article(headline="Python is cool!", body="(to be written)")
a2 = Article(headline="SQLAlchemy Tutorial", body="You're reading it")
k_tutorial = Keyword('tutorial')
k_cool = Keyword('cool')
k_unfinished = Keyword('unfinished')
a1.keywords.append(k_unfinished)
k_cool.articles.append(a1)
k_cool.articles.append(a2)
# Or:
k_cool.articles = [a1, a2] # This works as well!
a2.keywords.append(k_tutorial)
# Now we write all this out to the database in one single step, and
# SQLAlchemy automatically figures out the correct order for the SQL
# statements.
objectstore.commit()
print a1, a1.keywords
print a2, a2.keywords
print k_tutorial, k_tutorial.articles
print k_cool, k_cool.articles
print k_unfinished, k_unfinished.articles
Again, the code pretty much speaks for itself. If you pass the association
table as the second parameter to mapping()
, and SQLAlchemy sees that it's in
the correct format (it has only two columns, each of which is a foreign key to
one of the tables involved in the relation), it will automatically set up a
many-to-many relationship for you.
If you want to hold more data in the association object (for example, maybe you want to record the exact date and time when the article was tagged with any given keyword), it's slightly more complicated. The SQLAlchemy documentation has a good explanation of the process at the bottom of the Data Mapping page. That whole page, in fact, is well worth reading, since there are several other features of data mapping that I glossed over or left out entirely, in order to keep the size of this tutorial manageable. And once you've grasped those concepts, you can move on to the Advanced Data Mapping section, which covers subjects like mapping a table onto itself (useful for tracking manager/subordinate relationships in an Employees table, for example).
Hopefully this has given you a taste of what using SQLAlchemy feels like. The examples in this tutorial should be enough to give you a head start in writing your own applications. When you need more advanced features beyond what we've covered here, check out the extensive SQLAlchemy documentation. Just about everything you'll need is there; and if you still don't find what you're looking for, join the SQLAlchemy mailing list and ask!