In-memory SQLite database and Flask: a threading trap

In a Flask development environment people love to use SQLAlchemy with Python’s built-in sqlite backend. When configured with

app.config['SQLALCHEMY_DATABASE_URI'] = "sqlite://"

the database, created via db = SQLAlchemy(app) is stored in memory instead of being persisted to disk. This is a nice feature for development and testing.

With some model classes being defined, at some point the tables should actually be created within the database, which is what the db.create_all() call is usually used for. But where should this be invoked and what difference does it make? Let’s look at two possible places:

  • In the app’s bootstrap code, before app.run() is called (such as in the __init__.py file of your application package.
  • After the development server has been started via app.run(), in a route handler.

What difference does it make? A big one, in certain cases: the call happens in different threads (you can easily convince yourself of this by calling threading.current_thread() in the two places, the result is different).

You might think this should be an implementation detail of how Flask’s development server works under the hood. I agree, and usually this detail is not important to be aware of. However, in case of an in-memory SQLite database this implementation detail makes a significant difference: the two threads see independent databases. And the mean thing is: the same db object represents different databases, depending on the thread from which it is being used.

Example: say you call db.create_all() and pre-populate the database in the main thread, before invoking app.run(). When you then access the database via the same db object (equivalent id()) from within a route handler (which does not run in the main thread), the interaction takes place with a different database, which of course is still empty, the tables are not created. A simple read/query would yield unexpected results.

Again, in other words: although you push around and interact the same db object in your modules, you might still access different databases, depending on from which thread the interaction takes place.

This is not well-documented and leads to nasty errors. I guess most often people run into OperationalError: (OperationalError) no such table:, although they think they have already created the corresponding table.

There are two reasonable solutions:

  • Bootstrap the database from within a route handler. All route handlers run within the same thread, so all route handlers interact with the same database.
  • Do not use the in-memory feature. Then db represents the same physical database across threads.

People have been bitten by this, as can be seen in these StackOverflow threads:

This has been observed with Python 2.7.5 and Flask 0.10.1.

Leave a Reply

Your email address will not be published. Required fields are marked *

Human? Please fill this out: * Time limit is exhausted. Please reload CAPTCHA.

  1. Tagir Magomedov Avatar

    Thanks you. Very helpful.