00:00 In the previous lesson, I showed you the ORM module in SQLAlchemy. In this lesson, I’ll introduce some more complicated relationships to your data. So far, you’ve seen foreign keys used to create one-to-many relationships.
00:16 The key on the book row in the database points to the author, allowing many books to be associated with a single author. This kind of relationship is pretty common, but not the only kind. What if you wanted a book to have multiple authors? That would require a many-to-many relationship, and this is accomplished with a secondary table.
00:38 This new table contains nothing but mappings between objects and is usually made up of only sets of foreign keys. To illustrate this concept, I’m going to introduce some new information into the database: the publisher.
00:54 Each book publisher has a relationship with multiple books and multiple authors. Before showing you a many-to-many relationship, I’m going to take a quick detour into ERDs, or entity relationship diagrams.
These are a way of visualizing the tables and intertable relationships in a database. There are many tools out there that can help you draw these diagrams, most of which will even generate the corresponding SQL
01:33 The book has an ID field, a title, and a foreign key to the author table. The line between the boxes indicates where the book’s foreign key points to. Let’s use an ERD and add the publisher relationship.
On the right-hand side here, you have the same author and book tables as I just showed you, while on the left-hand side, I’ve added the publisher. The
publisher table is where the name of the publisher is stored.
There’s a single row in this table for each publisher. If you were only interested in the publisher-to-book relationship, you could add a foreign key on
book that pointed to the publisher, but then you’d not be able to easily query all the authors that belonged to a publisher.
Instead, I’ve created two secondary tables for the many-to-many relationship. The
author_publisher table represents the relationship between publishers and authors, while the
book_publisher table maps the relationship between the books and the publishers.
02:39 Both these many-to-many tables consist of just the foreign key relationships. Most ORMs will provide additional abstractions with these kinds of relationships to make it easier to go back and forth between your interrelated objects.
02:55 Let’s go see how SQLAlchemy does just that. SQLAlchemy can get a little finicky about having relationships between models in different files. It is possible, but the code needs some extra steps.
03:10 Likewise, it can handle changes to tables, and there are third-party libraries to help you manage with this, but it can get a bit messy. In order to keep the code clearer, as I just want to teach this to you, this example is going to use a new database.
I’m still going to be talking about authors and books, and the objects are going to be similar, but they’re going to be new ones. You won’t be using
models.py in this lesson, but
04:02 You don’t need objects explicitly for the many-to-many relationships, as they’ll described within the ORM object themselves. But you do need to have the tables to be order to store the data. Let me just scroll down here.
The backwards reference for a foreign key uses
backref, while the many-to-many relationship requires two things: the argument named
secondary that specifies the table where the relationship will be stored, and the argument named
back_populates to indicate the name of the back reference on the corresponding object. In this case, it is saying there will be an
.authors attribute on the
Book object, same kind of thing going on here. It has a new publisher’s relationship attribute declared that links the
book_publisher table and defines the name of the relationship for the
Publisher object. Scrolling down a little more.
And finally, this is the
Publisher object. This represents the actual publisher, and so like our other two ORM objects, it has its own key, and this time a name. It then declares a relationship to the author and book objects through the use of the secondary tables defined at the top of the file. There you have it: three ORM objects, a one-to-many relationship between authors and books, and two many-to-many relationships between the publisher and the author and the publisher and the books. Let’s go into the REPL and play with this.
SQLAlchemy will generate the necessary
create statements for your database if you want. You can do that by getting the base class object that is the parent of each of your ORM objects, where everything has been registered when you inherited from them.
Ooh, that was a lot. Let me just back up here. Because everything inherits from
Base, and if you remember, even the table objects took
Base’s metadata as an argument,
Base knows about all the things that need to be created. Using
.create_all() goes through all of the registered objects and issues
create statements for each one. When I scroll back down here, you’ll see the tail end of several
create statements in the debug for the
book_publisher tables. Isn’t that beautiful?
Like adding a book to
king, I add a relationship to the publisher using the
.append() method. This time, though, instead of creating a new object, like I did with
Author, I’m using the actual book that was already created and is listed inside of
king—his first book, well the first book in this database.
10:10 Looking at the publisher’s books, you can now see a scary clown hiding down in the sewer. Running the commit … closes off the transaction, and now when I look at the list of books, the IDs have been set. Let’s add another publisher.
And there you go. Penguin now points to the same book, and
king only has one copy of the book, as it should be. Think back to where all this got started, the CSV file. In the CSV file, this same information was captured through repetition.
11:19 An error in the author’s name would mean changing multiple lines, and there would be no way to distinguish between two authors with the same name. Here, Each thing in the real world gets a single object, whether that’s a person, a book, or a publishing house, and the relationships between them are handled correctly.
As a reference was defined on the
Book object, you can deep dive from
king to his book to that book’s list of publishers. Likewise, you can go the other direction and see the authors associated with a publisher.
11:59 Did you expect that to be empty? Unfortunately, SQL doesn’t know that we want this relationship populated and that it’s cross-referencing something. In your code, you would be responsible for creating both parts of this relationship.
12:22 I’ll leave that to you as an exercise. That’s teacher speak for I was too lazy to do it myself. Wow, you’ve covered a lot. Last up, I’ll summarize the course and point you at some places where you can learn some more.
Become a Member to join the conversation.