00:00 In the previous lesson, I showed you the ORM module in SQLAlchemy. In this lesson, I’ll introduce some more complicated relationships to your data. So far, you’ve seen foreign keys used to create one-to-many relationships.
00:16 The key on the book row in the database points to the author, allowing many books to be associated with a single author. This kind of relationship is pretty common, but not the only kind. What if you wanted a book to have multiple authors? That would require a many-to-many relationship, and this is accomplished with a secondary table.
00:38 This new table contains nothing but mappings between objects and is usually made up of only sets of foreign keys. To illustrate this concept, I’m going to introduce some new information into the database: the publisher.
00:54 Each book publisher has a relationship with multiple books and multiple authors. Before showing you a many-to-many relationship, I’m going to take a quick detour into ERDs, or entity relationship diagrams.
These are a way of visualizing the tables and intertable relationships in a database. There are many tools out there that can help you draw these diagrams, most of which will even generate the corresponding SQL
This diagram illustrates the
book tables being used in the previous lesson. The author has an ID field and the first and last name fields.
01:33 The book has an ID field, a title, and a foreign key to the author table. The line between the boxes indicates where the book’s foreign key points to. Let’s use an ERD and add the publisher relationship.
On the right-hand side here, you have the same author and book tables as I just showed you, while on the left-hand side, I’ve added the publisher. The
publisher table is where the name of the publisher is stored.
There’s a single row in this table for each publisher. If you were only interested in the publisher-to-book relationship, you could add a foreign key on
book that pointed to the publisher, but then you’d not be able to easily query all the authors that belonged to a publisher.
Instead, I’ve created two secondary tables for the many-to-many relationship. The
author_publisher table represents the relationship between publishers and authors, while the
book_publisher table maps the relationship between the books and the publishers.
02:39 Both these many-to-many tables consist of just the foreign key relationships. Most ORMs will provide additional abstractions with these kinds of relationships to make it easier to go back and forth between your interrelated objects.
02:55 Let’s go see how SQLAlchemy does just that. SQLAlchemy can get a little finicky about having relationships between models in different files. It is possible, but the code needs some extra steps.
03:10 Likewise, it can handle changes to tables, and there are third-party libraries to help you manage with this, but it can get a bit messy. In order to keep the code clearer, as I just want to teach this to you, this example is going to use a new database.
I’m still going to be talking about authors and books, and the objects are going to be similar, but they’re going to be new ones. You won’t be using
models.py in this lesson, but
And that’s what’s on the screen now,
import lines are similar to before, and like before, I’m instantiating a base class to be used for inheritance.
The first difference is the manual creation of the
book_publisher tables. Instead of using an object to abstract them, I’m just directly creating a table object.
04:02 You don’t need objects explicitly for the many-to-many relationships, as they’ll described within the ORM object themselves. But you do need to have the tables to be order to store the data. Let me just scroll down here.
This is the new version of the
Author class. The column declarations are the same as before, and you still have the back reference to the
book foreign key.
04:30 What has been added is a new relationship for the many-to-many table with the publisher. Notice that specifying this kind of relationship is slightly different.
The backwards reference for a foreign key uses
backref, while the many-to-many relationship requires two things: the argument named
secondary that specifies the table where the relationship will be stored, and the argument named
back_populates to indicate the name of the back reference on the corresponding object. In this case, it is saying there will be an
.authors attribute on the
05:06 I’ll scroll down a bit more.
Book object, same kind of thing going on here. It has a new publisher’s relationship attribute declared that links the
book_publisher table and defines the name of the relationship for the
Publisher object. Scrolling down a little more.
And finally, this is the
Publisher object. This represents the actual publisher, and so like our other two ORM objects, it has its own key, and this time a name. It then declares a relationship to the author and book objects through the use of the secondary tables defined at the top of the file. There you have it: three ORM objects, a one-to-many relationship between authors and books, and two many-to-many relationships between the publisher and the author and the publisher and the books. Let’s go into the REPL and play with this.
06:15 This should be familiar.
And as a reminder, I’m using a brand-new database here, so the URL is specifying a different file. I’m creating
published.db. And the key part of that last sentence was
SQLAlchemy will generate the necessary
create statements for your database if you want. You can do that by getting the base class object that is the parent of each of your ORM objects, where everything has been registered when you inherited from them.
I’ve imported it from the
published module here so I can create the tables.
Ooh, that was a lot. Let me just back up here. Because everything inherits from
Base, and if you remember, even the table objects took
Base’s metadata as an argument,
Base knows about all the things that need to be created. Using
.create_all() goes through all of the registered objects and issues
create statements for each one. When I scroll back down here, you’ll see the tail end of several
create statements in the debug for the
book_publisher tables. Isn’t that beautiful?
07:44 If you’re careful about your object declarations, you never have to write a single line of SQL, not even to get your tables created. Okay, with the tables in place, let’s do some querying.
08:02 Like before, I still need a session.
08:15 And also like before, I need to reference the ORM objects themselves.
I’ve instantiated a new
Author object. I haven’t put it in the database yet. Calling
.add() adds it to the database.
.commit() makes the addition permanent. Like in the previous lesson, I can use the
.books relationship on
king to see his books.
It’s empty because I haven’t added any yet. You’ll probably recall that you can add a book using the
.append() method of the
.books relationship. Let me do that.
09:08 And of course it still doesn’t have IDs, because I haven’t committed yet. If I want to add a publisher, I do something similar as to how I added an author.
And like with the author relationship, the publisher’s
.books relationship is also empty. Nothing’s been added yet.
Like adding a book to
king, I add a relationship to the publisher using the
.append() method. This time, though, instead of creating a new object, like I did with
Author, I’m using the actual book that was already created and is listed inside of
king—his first book, well the first book in this database.
09:58 You get what I mean. SQLAlchemy takes care of all the keys to make sure everything points to each other correctly and there’s only one instance of the book.
10:10 Looking at the publisher’s books, you can now see a scary clown hiding down in the sewer. Running the commit … closes off the transaction, and now when I look at the list of books, the IDs have been set. Let’s add another publisher.
10:41 Turns out that King’s It was published both by Random House and by one of its subsidiaries, Penguin. I’m capturing that info here by using the many-to-many relationship.
10:53 Let’s finish this off by committing it.
And there you go. Penguin now points to the same book, and
king only has one copy of the book, as it should be. Think back to where all this got started, the CSV file. In the CSV file, this same information was captured through repetition.
11:19 An error in the author’s name would mean changing multiple lines, and there would be no way to distinguish between two authors with the same name. Here, Each thing in the real world gets a single object, whether that’s a person, a book, or a publishing house, and the relationships between them are handled correctly.
As a reference was defined on the
Book object, you can deep dive from
king to his book to that book’s list of publishers. Likewise, you can go the other direction and see the authors associated with a publisher.
11:59 Did you expect that to be empty? Unfortunately, SQL doesn’t know that we want this relationship populated and that it’s cross-referencing something. In your code, you would be responsible for creating both parts of this relationship.
Your best bet would be to have a method on the
Publisher object that took a book to add and would properly tie in the author if they weren’t already there.
12:22 I’ll leave that to you as an exercise. That’s teacher speak for I was too lazy to do it myself. Wow, you’ve covered a lot. Last up, I’ll summarize the course and point you at some places where you can learn some more.
Become a Member to join the conversation.