============================ Making changes to model code ============================ --------------------------------- Guidelines for writing model code --------------------------------- No length limits on database columns ==================================== Don't put any length limits on your database columns (for example ``sqlalchemy.Column(sqlalchemy.Unicode(30), ...)``). These can cause painful database migrations. Always use ``sqlalchemy.UnicodeText()`` with no length limit as the type for text columns in the database (you can also use ``sqlalchemy.Text()`` if you're sure the column will never receive non-ASCII characters). When necessary validate the lengths of strings in Python code instead. This can be done using `SQLAlchemy validators `_ in model code. View callables for HTML forms should also use Colander schemas to validate user input, in addition to any validation done in the model code, because Colander supports returning per-field errors to the user. ------------------------------------ Creating a database migration script ------------------------------------ If you've made any changes to the database schema (for example: added or removed a SQLAlchemy ORM class, or added, removed or modified a ``sqlalchemy.Column`` on an ORM class) then you need to create a database migration script that can be used to upgrade the production database from the previous to your new schema. We use `Alembic `_ to create and run migration scripts. See the Alembic docs (and look at existing scripts in `h/migrations/versions `_) for details. The ``make db`` command is a wrapper around Alembic. The steps to create a new migration script for h are: 1. Create the revision script by running ``makge db args=revision``, for example: .. code-block:: bash make db args='revision -m "Add the foobar table"' This will create a new script in ``h/migrations/versions/``. 2. Edit the generated script, fill in the ``upgrade()`` and ``downgrade()`` methods. See https://alembic.readthedocs.io/en/latest/ops.html#ops for details. .. note:: Not every migration should have a ``downgrade()`` method. For example if the upgrade removes a max length constraint on a text field, so that values longer than the previous max length can now be entered, then a downgrade that adds the constraint back may not work with data created using the updated schema. 3. Test your ``upgrade()`` function by upgrading your database to the most recent revision. This will run all migration scripts newer than the revision that your db is currently stamped with, which usually means just your new revision script: .. code-block:: bash make db After running this command inspect your database's schema to check that it's as expected, and run h to check that everything is working. .. note:: You should make sure that there's some repesentative data in the relevant columns of the database before testing upgrading and downgrading it. Some migration script crashes will only happen when there's data present. 4. Test your ``downgrade()`` function: .. code-block:: bash make db args='downgrade -1' After running this command inspect your database's schema to check that it's as expected. You can then upgrade it again: .. code-block:: bash make db args='upgrade +1' Batch deletes and updates in migration scripts ============================================== It's important that migration scripts don't lock database tables for too long, so that when the script is run on the production database concurrent database transactions from web requests aren't held up. An SQL ``DELETE`` command acquires a ``FOR UPDATE`` row-level lock on the rows that it selects to delete. An ``UPDATE`` acquires a ``FOR UPDATE`` lock on the selected rows *if the update modifies any columns that have a unique index on them that can be used in a foreign key*. While held this ``FOR UPDATE`` lock prevents any concurrent transactions from modifying or deleting the selected rows. So if your migration script is going to ``DELETE`` or ``UPDATE`` a large number of rows at once and committing that transaction is going to take a long time (longer than 100ms) then you should instead do multiple ``DELETE``\s or ``UPDATE``\s of smaller numbers of rows, committing each as a separate transaction. This will allow concurrent transactions to be sequenced in-between your migration script's transactions. For example, here's some Python code that deletes all the rows that match a query in batches of 25: .. code-block:: python query = query = query.limit(25) while True: if query.count() == 0: break for row in query: session.delete(row) session.commit() Separate data and schema migrations =================================== It's easier for deployment if you do *data migrations* (code that creates, updates or deletes rows) and *schema migrations* (code that modifies the database *schema*, for example adding a new column to a table) in separate migration scripts instead of combining them into one script. If you have a single migration that needs to modify some data and then make a schema change, implement it as two consecutive migration scripts instead. Don't import model classes into migration scripts ================================================= Don't import model classes, for example ``from h.models import Annotation``, in migration scripts. Instead copy and paste the ``Annotation`` class into your migration script. This is because the script needs the schema of the ``Annotation`` class as it was at a particular point in time, which may be different from the schema in ``h.models.Annotation`` when the script is run in the future. The script's copy of the class usually only needs to contain the definitions of the primary key column(s) and any other columns that the script uses, and only needs the name and type attributes of these columns. Other attributes of the columns, columns that the script doesn't use, and methods can usually be left out of the script's copy of the model class. Troubleshooting migration scripts ================================= (sqlite3.OperationalError) near "ALTER" --------------------------------------- SQLite doesn't support ``ALTER TABLE``. To get around this, use `Alembic's batch mode `_. Cannot add a NOT NULL column with default value NULL ---------------------------------------------------- If you're adding a column to the model with ``nullable=False`` then when the database is upgraded it needs to insert values into this column for each of the already existing rows in the table, and it can't just insert ``NULL`` as it normally would. So you need to tell the database what default value to insert here. ``default=`` isn't enough (that's only used when the application is creating data, not when migration scripts are running), you need to add a ``server_default=`` argument to your ``add_column()`` call. See the existing migration scripts for examples.