sqlalchemy made simpler
sqlalchemy is a popular object relational mapper (orm) for python; it provides a tutorial with many details, but might be difficult to follow as a beginner; so i write this article to make that even simpler, hopefully get novices started in a few minutes;
what is sqlalchemy
citing the first sentence in the tutorial:
The SQLAlchemy Object Relational Mapper presents a method of associating user-defined Python classes with database tables, and instances of those classes (objects) with rows in their corresponding tables. It includes a system that transparently synchronizes all changes in state between objects and their related rows, called a unit of work, as well as a system for expressing database queries in terms of the user defined classes and their defined relationships between each other.
so sqlalchemy does 2 jobs:
-
provide a mapping between python elements and database elements;
-
synchronize changes between python elements and database elements;
with sqlalchemy, you no longer need sql for database query; instead, you call python methods, and sqlalchemy does the rest for you;
install
sqlalchemy can be installed via pip:
pip install SQLAlchemy
example 0: a single table
our first example is very simple: create a python class User
, mapped to rdbms
table user
; then create a User
object user0
, and add it as a row into the
rdbms table user
;
the code (with comments) speaks for itself:
#!/usr/bin/env python3
## (boilerplate) imports;
from sqlalchemy import Column
from sqlalchemy import Integer
from sqlalchemy import String
from sqlalchemy import create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
## (boilerplate) create database engine; here we create an in-memory sqlite
## database, and echo sql statements issued to the rdbms for easy debugging;
## you can use another rdbms (such as mysql) here;
engine = create_engine('sqlite:///:memory:', echo=True)
## (boilerplate) create declarative base; this is the base class of all classes
## mapped to rdbms tables, when using the *declarative* system; alternatively,
## you can use *classical mappings*, but that wont be covered in this example;
Base = declarative_base()
## declare a class `User` mapped to rdbms table `user`;
class User(Base):
## rdbms table name;
__tablename__ = 'user'
## column spec; in this example, the table has 2 columns: an integer `id`
## which is the primay key, and a string `name`; if multiple columns have
## `primary_key=True`, they form a compound primary key;
id = Column(Integer, primary_key=True)
name = Column(String)
## (optional) repr function; for easy debugging;
def __repr__(self):
return '<User:%r:%r>' % (self.id, self.name)
## (boilerplate) create declared rdbms tables; this issues `CREATE TABLE`
## statements to rdbms;
Base.metadata.create_all(engine)
## now our table is created; then we create a row object;
user0 = User(name='Alice')
print(user0) ## output: <User:None:'Alice'>
## (boilerplate) create session factory; a session factory provides sessions;
Session = sessionmaker(bind=engine)
## create a session; a session is the place to manipulate row objects; you may
## think a session as a cache on the rows (though it isnt really a cache); you
## may think a session as the staged area if you are familiar with git; you may
## also think a thread as the eater, a session as the plate, and row objects as
## the food; the session is not thread-safe; so each thread should get its own
## session for database operation;
session = Session()
## add the object to session (similar to `git add`);
session.add(user0)
## flush the session; this implicitly begins a database transaction and issues
## `INSERT INTO` statements; flush is optional per default, as sqlalchemy will
## auto flush when necessary;
session.flush()
print(user0) ## now the row gets an id; output: <User:1:'Alice'>
## commit the session (similar to `git commit`); this will first flush pending
## operations to database then commit the transaction; if you want a rollback
## instead of a commit, change this line to `session.rollback()`;
session.commit()
print(user0) ## output: <User:1:'Alice'>
## query all the rows in rdbms table `user`;
rows = session.query(User).all()
print(rows) ## output: [<User:1:'Alice'>]
# close the session;
session.close()
example 1: object states
the above example shows how an object works in a session; now it is good to know objects have states, depending on where they live; there are 5 object states, if you are using the latest version of sqlalchemy:
-
transient:
an instance that is not in a session, and is not saved to the database; that is to say, it has no database identity;
-
pending:
when you
session.add()
a transient instance, it becomes pending; it hasnt been flushed to the database yet, but will be when the next flush occurs; -
persistent:
an instance that is present in the session and has a record in the database; you get persistent instances by:
-
flushing, so that the pending instances become persistent, or
-
querying the database for existing instances;
-
-
deleted:
an instance that has been deleted in a flush, but the transaction has not yet completed; objects in this state are essentially in the opposite of pending state; objects in this state have 2 possible fates:
-
if the transaction is committed, the object will move to detached state;
-
if the transaction is rolled back, the object will move back to persistent state;
-
-
detached:
an instance which corresponds, or previously corresponded, to a record in the database, but is not currently in any session; the detached object will contain a database identity marker, however because it is not associated with a session, it is unknown whether or not this database identity actually exists in a target database;
we can inspect an object state using the inspect()
function:
#!/usr/bin/env python3
from sqlalchemy import Column
from sqlalchemy import Integer
from sqlalchemy import String
from sqlalchemy import create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
from sqlalchemy import inspect
engine = create_engine('sqlite:///:memory:', echo=False)
Base = declarative_base()
class User(Base):
__tablename__ = 'user'
id = Column(Integer, primary_key=True)
name = Column(String)
def __repr__(self):
return '<User:%r:%r>' % (self.id, self.name)
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)
session = Session()
user0 = User(name='Alice')
print(
inspect(user0).transient,
inspect(user0).pending,
inspect(user0).persistent,
inspect(user0).deleted,
inspect(user0).detached,
) ## transient;
session.add(user0)
print(
inspect(user0).transient,
inspect(user0).pending,
inspect(user0).persistent,
inspect(user0).deleted,
inspect(user0).detached,
) ## pending;
session.flush()
print(
inspect(user0).transient,
inspect(user0).pending,
inspect(user0).persistent,
inspect(user0).deleted,
inspect(user0).detached,
) ## persistent;
session.delete(user0)
session.flush()
print(
inspect(user0).transient,
inspect(user0).pending,
inspect(user0).persistent,
inspect(user0).deleted,
inspect(user0).detached,
) ## deleted;
session.commit()
print(
inspect(user0).transient,
inspect(user0).pending,
inspect(user0).persistent,
inspect(user0).deleted,
inspect(user0).detached,
) ## detached;
session.close()
should output:
True False False False False
False True False False False
False False True False False
False False False True False
False False False False True
example 2: link two tables at database level
now that we have seen how one table works, we will add another table and build relationship between them; traditionally, to link two rdbms tables, we use a foreign key; this is still true with sqlalchemy;
the following example adds a second table address
, which represents an email
address; this is a one-to-many relationship: a user may have multiple email
addresses, and each email address must have a user; the relationship is
reflected by a foreign key user_id
defined in table address
, pointing to the
column id
in table user
;
here is the code:
#!/usr/bin/env python3
from sqlalchemy import Column
from sqlalchemy import ForeignKey
from sqlalchemy import Integer
from sqlalchemy import String
from sqlalchemy import create_engine
from sqlalchemy import inspect
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
engine = create_engine('sqlite:///:memory:', echo=False)
Base = declarative_base()
class User(Base):
__tablename__ = 'user'
id = Column(Integer, primary_key=True)
name = Column(String)
def __repr__(self):
return '<User:%r:%r>' % (self.id, self.name)
class Address(Base):
__tablename__ = 'address'
id = Column(Integer, primary_key=True)
email = Column(String)
user_id = Column(Integer, ForeignKey('user.id'))
def __repr__(self):
return '<Address:%r:%r:%r>' % (self.id, self.email, self.user_id)
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)
session = Session()
user0 = User(name='Alice')
session.add(user0)
session.flush() ## necessary for getting an auto-assigned id;
addr0 = Address(email='[email protected]', user_id=user0.id)
session.add(addr0)
session.flush()
addr1 = Address(email='[email protected]', user_id=user0.id)
session.add(addr1)
session.flush()
session.commit()
users = session.query(User).all()
print(users)
addrs = session.query(Address).all()
print(addrs)
matched_rows = session.query(User, Address).filter(User.id == Address.user_id).all()
print(matched_rows)
session.close()
output:
[<User:1:'Alice'>]
[<Address:1:'[email protected]':1>, <Address:2:'[email protected]':1>]
[(<User:1:'Alice'>, <Address:1:'[email protected]':1>), (<User:1:'Alice'>, <Address:2:'[email protected]':1>)]
example 3: link two tables at orm level
the above example shows how to build relationship at database level; however,
sqlalchemy also provides its own relationship model at orm level, using
sqlalchemy.orm.relationship
; the orm-level relationship tracks relationship
between objects in a session, which can help users manipulate objects in a
smarter way; for example, when users add an object to the database, sqlalchemy
also auto adds its linked objects; note that we still need the foreign key when
using the orm-level relationship;
an example demonstrating orm-level relationship:
#!/usr/bin/env python3
from sqlalchemy import Column
from sqlalchemy import ForeignKey
from sqlalchemy import Integer
from sqlalchemy import String
from sqlalchemy import create_engine
from sqlalchemy import inspect
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship
from sqlalchemy.orm import sessionmaker
engine = create_engine('sqlite:///:memory:', echo=False)
Base = declarative_base()
class User(Base):
__tablename__ = 'user'
id = Column(Integer, primary_key=True)
name = Column(String)
## relationship: a user has a list of addresses; sqlalchemy is clever
## enough to infer this is a list by inspecting the foreign key;
##
## the address object should keep a back reference in its field `user`;
addrs = relationship('Address', back_populates='user')
def __repr__(self):
return '<User:%r:%r>' % (self.id, self.name)
class Address(Base):
__tablename__ = 'address'
id = Column(Integer, primary_key=True)
email = Column(String)
user_id = Column(Integer, ForeignKey('user.id'))
## relationship: an address has a user; sqlalchemy is clever enough to
## infer this is not a list by inspecting the foreign key;
##
## the user object should keep a back reference in its field `addrs`;
user = relationship('User', back_populates='addrs')
def __repr__(self):
return '<Address:%r:%r:%r>' % (self.id, self.email, self.user_id)
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)
session = Session()
user0 = User(name='Alice')
addr0 = Address(email='[email protected]')
addr1 = Address(email='[email protected]')
## build object relationship at orm level;
user0.addrs.append(addr0)
user0.addrs.append(addr1)
## note that we dont need to explicitly add `addr0` and `addr1`; they are added
## automatically when we add `user0` because they are linked by `user0`;
session.add(user0)
session.commit()
users = session.query(User).all()
print(users)
addrs = session.query(Address).all()
print(addrs)
matched_rows = session.query(User, Address).filter(User.id == Address.user_id).all()
print(matched_rows)
session.close()
output (should be the same as the previous example):
[<User:1:'Alice'>]
[<Address:1:'[email protected]':1>, <Address:2:'[email protected]':1>]
[(<User:1:'Alice'>, <Address:1:'[email protected]':1>), (<User:1:'Alice'>, <Address:2:'[email protected]':1>)]