In this part we’re going to set up a Postgres database to store the results of our word counts as well as SQLAlchemy, an Object Relational Mapper, and Alembic to handle database migrations.
- 03/22/2016: Upgraded to Python version 3.5.1 as well as the latest versions of Psycopg2, Flask-SQLAlchemy, and Flask-Migrate. See below for details.
- 02/22/2015: Added Python 3 support.
Remember: Here’s what we’re building – A Flask app that calculates word-frequency pairs based on the text from a given URL.
- Part One: Set up a local development environment and then deploy both a staging and a production environment on Heroku.
- Part Two: Set up a PostgreSQL database along with SQLAlchemy and Alembic to handle migrations. (current)
- Part Three: Add in the back-end logic to scrape and then process the word counts from a webpage using the requests, BeautifulSoup, and Natural Language Toolkit (NLTK) libraries.
- Part Four: Implement a Redis task queue to handle the text processing.
- Part Five: Set up Angular on the front-end to continuously poll the back-end to see if the request is done processing.
- Part Six: Push to the staging server on Heroku – setting up Redis and detailing how to run two processes (web and worker) on a single Dyno.
- Part Seven: Update the front-end to make it more user-friendly.
- Part Eight: Add the D3 library into the mix to graph a frequency distribution and histogram.
Need the code? Grab it from the repo.
Tools used in this part:
- PostgreSQL (9.4)
- Psycopg2 (2.6.1) – a Python adapter for Postgres
- Flask-SQLAlchemy (2.1) – Flask extension that provides SQLAlchemy support
- Flask-Migrate (1.8.0) – extension that supports SQLAlchemy database migrations via Alembic
To get started, install Postgres on your local computer, if you don’t have it already. Since Heroku uses Postgres, it will be good for us to develop locally on the same database. If you don’t have Postgres installed, Postgres.app is an easy way to get up and running for Mac OS X users. Consult the download page for more info.
Once you have Postgres installed and running, create a database called
wordcount_dev to use as our local development database:
1 2 3 4
In order to use our newly created database within the Flask app we to need to install a few things:
If you’re on OS X and having trouble installing psycopg2 check out this Stack Overflow article.
SQLALCHEMY_DATABASE_URI field to the
Config() class in your config.py file to set your app to use the newly created database in development (local), staging, and production:
1 2 3 4 5
Your config.py file should now look like this:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28
Now when our config is loaded into our app the appropriate database will be connected to it as well.
Similar to how we added an environment variable in the last post, we are going to add a
DATABASE_URL variable. Run this in the terminal:
And then add that line into your .env file.
In your app.py file import SQLAlchemy and connect to the database:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
Set up a basic model by adding a models.py file:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
Here we created a table to store the results of the word counts.
We first import the database connection that we created in our app.py file as well as JSON from SQLAlchemy’s PostgreSQL dialects. JSON columns are fairly new to Postgres and are not available in every database supported by SQLAlchemy so we need to import it specifically.
Next we created a
Result() class and assigned it a table name of
results. We then set the attributes that we want to store for a result-
idof the result we stored
urlthat we counted the words from
- a full list of words that we counted
- a list of words that we counted minus stop words (more on this later)
We then created an
__init__() method that will run the first time we create a new result and, finally, a
__repr__() method to represent the object when we query for it.
Create a new file called manage.py:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
In order to use Flask-Migrate we imported
Manager as well as
MigrateCommand to our manage.py file. We also imported
db so we have access to them from within the script.
First, we set our config to get our environment – based on the environment variable – created a migrate instance, with
db as the arguments, and set up a
manager command to initialize a
Manager instance for our app. Finally, we added the
db command to the
manager so that we can run the migrations from the command line.
In order to run the migrations initialize Alembic:
1 2 3 4 5 6 7 8 9
After you run the database initialization you will see a new folder called “migrations” in the project. This holds the setup necessary for Alembic to run migrations against the project. Inside of “migrations” you will see that it has a folder called “versions”, which will contain the migration scripts as they are created.
Let’s create our first migration by running the
1 2 3 4 5 6
Now you’ll notice in your “versions” folder there is a migration file. This file is auto-generated by Alembic based on the model. You could generate (or edit) this file yourself; however, for most cases the auto-generated file will do.
Now we’ll apply the upgrades to the database using the
db upgrade command:
1 2 3 4
The database is now ready for us to use in our app:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
Finally, let’s apply the migrations to the databases on Heroku. First, though, we need to add the details of the staging and production databases to the config.py file.
To check if we have a database set up on the staging server run:
1 2 3
Make sure to replace
wordcount-stagewith the name of your staging app.
Since we don’t see a database environment variable, we need to add the Postgres addon to the staging server. To do so, run the following command:
1 2 3 4 5 6 7 8
hobby-devis the free tier of the Heroku Postgres addon.
Now when we run
heroku config --app wordcount-stage again we should see the connection settings for the database:
1 2 3
Next we need to commit the changes that you’ve made to git and push to your staging server:
Run the migrations that we created to migrate our staging database by using the
heroku run command:
1 2 3 4 5
Notice how we only ran the
upgrade, not the
migratecommands like before. We already have our migration file set up and ready to go; we just need to apply it against the Heroku database.
Let’s now do the same for production.
- Set up a database for your production app on Heroku, just like you did for staging:
heroku addons:create heroku-postgresql:hobby-dev --app wordcount-pro
- Push your changes to your production site:
git push pro masterNotice how you don’t have to make any changes to the config file – it’s setting the database based on the newly created
- Apply the migrations:
heroku run python manage.py db upgrade --app wordcount-pro
Now both our staging and production sites have their databases set up and are migrated – and ready to go!
When you apply a new migration to the production database, there could be down time. If this is an issue, you can set up database replication by adding a “follower” (commonly known as a slave) database. For more on this, check out the official Heroku documentation.
That’s it for part 2. Comment below with questions.
In Part 3 we’re going to build the word counting functionality and have it sent to a task queue to deal with the longer running word count processing.
See you next time. Cheers!
This is a collaboration piece between Cam Linke, co-founder of Startup Edmonton, and the folks at Real Python