Making changes to model code¶
Guidelines for writing model code¶
No length limits on database columns¶
Don’t put any length limits on your database columns (for example
sqlalchemy.Column(sqlalchemy.Unicode(30), ...)). These can cause painful
sqlalchemy.UnicodeText() with no length limit as the type for
text columns in the database (you can also use
sqlalchemy.Text() if you’re
sure the column will never receive non-ASCII characters).
When necessary validate the lengths of strings in Python code instead. This can be done using SQLAlchemy validators in model code.
View callables for HTML forms should also use Colander schemas to validate user input, in addition to any validation done in the model code, because Colander supports returning per-field errors to the user.
Creating a database migration script¶
If you’ve made any changes to the database schema (for example: added or
removed a SQLAlchemy ORM class, or added, removed or modified a
sqlalchemy.Column on an ORM class) then you need to create a database
migration script that can be used to upgrade the production database from the
previous to your new schema.
We use Alembic to create and run migration scripts. See the Alembic docs (and look at existing scripts in h/migrations/versions) for details, but the basic steps to create a new migration script for h are:
Create the revision script by running
alembic revision, for example:
alembic -c conf/alembic.ini revision -m "add the foobar table"
This will create a new script in
Edit the generated script, fill in the
See https://alembic.readthedocs.io/en/latest/ops.html#ops for details.
Not every migration should have a
downgrade()method. For example if the upgrade removes a max length constraint on a text field, so that values longer than the previous max length can now be entered, then a downgrade that adds the constraint back may not work with data created using the updated schema.
Stamp your database.
Before running any upgrades or downgrades you need to stamp the database with its current revision, so Alembic knows which migration scripts to run:
alembic -c conf/alembic.ini stamp <revision_id>
<revision_id>should be the revision corresponding to the version of the code that was present when the current database was created. The will usually be the
down_revisionfrom the migration script that you’ve just generated.
upgrade()function by upgrading your database to the most recent revision. This will run all migration scripts newer than the revision that your db is currently stamped with, which usually means just your new revision script:
alembic -c conf/alembic.ini upgrade head
After running this command inspect your database’s schema to check that it’s as expected, and run h to check that everything is working.
You should make sure that there’s some repesentative data in the relevant columns of the database before testing upgrading and downgrading it. Some migration script crashes will only happen when there’s data present.
alembic -c conf/alembic.ini downgrade -1
After running this command inspect your database’s schema to check that it’s as expected. You can then upgrade it again:
alembic -c conf/alembic.ini upgrade +1
Batch deletes and updates in migration scripts¶
It’s important that migration scripts don’t lock database tables for too long, so that when the script is run on the production database concurrent database transactions from web requests aren’t held up.
DELETE command acquires a
FOR UPDATE row-level lock on the
rows that it selects to delete. An
UPDATE acquires a
FOR UPDATE lock on
the selected rows if the update modifies any columns that have a unique index
on them that can be used in a foreign key. While held this
FOR UPDATE lock
prevents any concurrent transactions from modifying or deleting the selected
So if your migration script is going to
UPDATE a large number
of rows at once and committing that transaction is going to take a long time
(longer than 100ms) then you should instead do multiple
UPDATEs of smaller numbers of rows, committing each as a separate
transaction. This will allow concurrent transactions to be sequenced in-between
your migration script’s transactions.
For example, here’s some Python code that deletes all the rows that match a query in batches of 25:
query = <some sqlalchemy query> query = query.limit(25) while True: if query.count() == 0: break for row in query: session.delete(row) session.commit()
Separate data and schema migrations¶
It’s easier for deployment if you do data migrations (code that creates, updates or deletes rows) and schema migrations (code that modifies the database schema, for example adding a new column to a table) in separate migration scripts instead of combining them into one script. If you have a single migration that needs to modify some data and then make a schema change, implement it as two consecutive migration scripts instead.
Don’t import model classes into migration scripts¶
Don’t import model classes, for example
from memex.models import Annotation, in migration scripts.
Instead copy and paste the
Annotation class into your migration script.
This is because the script needs the schema of the
as it was at a particular point in time, which may be different from the
memex.models.Annotation when the script is run in the future.
The script’s copy of the class usually only needs to contain the definitions of the primary key column(s) and any other columns that the script uses, and only needs the name and type attributes of these columns. Other attributes of the columns, columns that the script doesn’t use, and methods can usually be left out of the script’s copy of the model class.
Troubleshooting migration scripts¶
(sqlite3.OperationalError) near “ALTER”¶
SQLite doesn’t support
ALTER TABLE. To get around this, use
Alembic’s batch mode.
Cannot add a NOT NULL column with default value NULL¶
If you’re adding a column to the model with
nullable=False then when the
database is upgraded it needs to insert values into this column for each of
the already existing rows in the table, and it can’t just insert
NULL as it
normally would. So you need to tell the database what default value to insert
default= isn’t enough (that’s only used when the application is creating
data, not when migration scripts are running), you need to add a
server_default= argument to your
See the existing migration scripts for examples.