Many-to-Many Relationships
00:00 In the previous lesson, I showed you the ORM module in SQLAlchemy. In this lesson, I’ll introduce some more complicated relationships to your data. So far, you’ve seen foreign keys used to create one to many relationships.
00:15 The key on the book row in the database points to the author, allowing many books to be associated with a single author. This kind of relationship is pretty common, but not the only kind.
00:27 What if you wanted a book to have multiple authors that would require a many to many relationship and this is accomplished with a secondary table. This new table contains nothing but mappings between objects and is usually made up of only sets of foreign keys.
00:47 To illustrate this concept, I’m going to introduce some new information into the database. The publisher. Each book publisher has a relationship with multiple books and multiple authors.
00:59 Before showing you a many to many relationship, I’m going to take a quick detour into ERDs or entity relationship diagrams. These are a way of visualizing the tables and inter-table relationships in a database.
01:13 There are many tools out there that can help you draw these diagrams, most of which will even generate the corresponding SQL create statements. This diagram illustrates the author and book tables being used in the previous lesson.
01:26 The author has an ID field and the first and last name fields. The book has an ID field, a title, and a foreign key to the author table. The line between the boxes indicates where the book’s foreign key points to.
01:44 Let’s use an ERD and add the publisher relationship.
01:49 On the right-hand side here, you have the same author and book tables as I just showed you. Well, on the left-hand side, I’ve added the publisher. The publisher table is where the name of the publisher is stored.
02:02 There’s a single row in this table for each publisher. If you were only interested in the publisher to book relationship, you could add a foreign key on book that pointed to the publisher, but then you’d not be able to easily query all the authors that belonged to a publisher.
02:19 Instead, I’ve created two secondary tables for the many to many relationship. The author-publisher table represents the relationship between publishers and authors.
02:31 While the book-publisher table maps the relationship between the books and the publishers. Both of these many to many tables consist of just the foreign key relationships.
02:44 Most ORMs will provide additional abstractions with these kinds of relationships to make it easier to go back and forth between your interrelated objects.
02:53 Let’s go see how SQLAlchemy does just that.
02:59 SQLAlchemy can get a little finicky about having relationships between models in different files. It is possible, but the code needs some extra steps. Likewise, it can handle changes to tables and there are third-party libraries to help you manage this, but it can also get a bit messy in order to keep the code clearer.
03:18
In this example, I’m using a new database. I’m still going to be talking about authors and books, but these objects are going to be new ones. You won’t be using models.py
in this lesson, but publish.py
instead.
03:32
And that’s what’s on the screen now, publish.py
. The import lines are similar to before and like before, I’m instantiating a base class to be used for inheritance.
03:43 The first difference here is the manual specification of the author-publisher table. Instead of using an object to abstract it, I’m just directly creating a table.
03:52
Notice the use of the Column
object to specify a column. That’s actually how you used to declare a column in the class mapping before SQLAlchemy got all fancy and added type hint awareness.
04:05 You don’t need objects explicitly for the many to many relationships, as they’ll be described within the ORM objects, but you still have to have the tables to store the relationship data.
04:17 That’s what I’m doing here, creating the author-publisher relationship table. After that one is the book-publisher and I do the same kind of thing. Let me scroll down here a little bit.
04:32
This is the new author class. I copied the version from models.py
, leaving the column declarations the same, including the relationship to the book foreign key.
04:42 What has been added is a new relationship for the many to many table with the publisher. Notice that specifying this kind of relationship is slightly different than with the book.
04:53
In the old version, there is an extra argument. The argument is called secondary
and it specifies that a secondary table is responsible for the relationship.
05:04
The value for secondary
is the author-publisher table specified above. Let me scroll down a bit here. The book object gets the same treatment.
05:15 It also has a relationship through a secondary table. This time it’s the book-publisher table though instead, scrolling down some more.
05:25 And this is the actual publisher object, which represents the actual publisher. Like with the author and book objects, it has its own fields. This time it’s an ID and a name.
05:39
The publisher declares a relationship to author through the use of a secondary table and creates the back relationship named publishers
on the Author object.
05:49 Ditto for the book relationship, and there you have it. Three ORM objects, a one to many relationship between authors and books, and two many to many relationships between publisher and author, and publisher and books.
06:05 Let’s go into the REPL and play with this.
06:13 This should be familiar.
06:21
Remember, I’m using a brand new database here, so the URL is different. I am going to create published_db
06:32
and the key part of that last sentence was Create
. SQLAlchemy will generate the necessary create SQL statements if you want. You can do that by getting the base class object, which is the parent of each of your ORM objects.
06:46
I’ve imported it here from the publish
module. Now I use that to create the tables.
07:00
Wow, that was a lot. Let me just back up here. Because everything inherits from base
, and if you remember, even the table objects took base
’s metadata as an argument, base
knows about all the things that need to be created. Using create_all()
goes through all the registered objects and issues create statements for each of them. When I scroll back down here, you’ll see the tail end of several create
statements in the debug for the book
and book_publisher
tables. Isn’t that beautiful?
07:32 If you’re careful about your object declarations, you never have to write a single line of SQL, not even to get your tables created. As I’ve got tables in place, let’s do some querying.
07:47 Like before, I need a session in the ORM.
08:00
And also like before, I need to import the ORM objects. Here I’ve created an Author
object, and by Calling .add()
on the session it gets written to the database.
08:21
To make it permanent, I need to commit. Like in the previous lesson, I can use the .books
relationship on king
to see his books.
08:35
It’s empty because I haven’t added any yet. Recall that you can add a book using the .append()
method of the .books
relationship.
08:48 And there it is without any ID yet, as I haven’t committed.
08:54 I can add a publisher using the same kinds of steps.
09:06
And like before, the .books
relationship is empty as nothing’s been added yet. Just like adding a book to king
, adding the relationship to the publisher uses the .append()
method.
09:22
This time though, instead of creating a new object inside the call, I’m using the object that was created before and appended to king
. This way all the keys will point correctly once I commit.
09:36 Looking at the publisher’s books, you can now see a scary clown hiding in the sewer. Running commit closes off the transaction,
09:49 Now if I look at the book, it has it’s ID set correctly. Let’s add another publisher.
10:03 It turns out that King’s It was published both by Random House and by its subsidiary, Penguin Books. I’m going to capture that info here by using a many-to-many relationship.
10:19 And finish it off by committing it.
10:25 And there you go. Penguin now points to the same book,
10:34
and king
only has one copy of the book, as it should be. Think back to where all this got started, the CSV file. In the CSV file, this same information was captured through repetition.
10:46 An error in the author’s name would mean changing multiple lines, and there would be no way to distinguish between two authors with the same name. Here, each thing in the real world gets a single object, whether that’s a person, a book, or a publishing house, and the relationships between them are handled correctly.
11:06
As a reference was defined on the Book
object, you can deep dive from king
to his book to that book’s list of publishers.
11:19 Likewise, you can go the other direction and see the authors associated with a publisher.
11:27 Did you expect that to be empty? Unfortunately, SQL doesn’t know that we want this relationship populated and that it’s crossreferencing something. In your code, you would be responsible for creating both parts of this relationship.
11:41
Your best bet would be to have a function on the Publisher
object that took a book to add and would properly tie in the author if they weren’t already there.
11:50 I’ll leave as an exercise. That’s teacher speak for I’m too lazy to do it myself. Wow, you’ve covered a lot. Last up, I’ll summarize the course and point you at some places where you can learn some more.
Become a Member to join the conversation.