SQLAlchemy: Core Text
If you’d like a laugh, then check out xkcd: Exploits of a Mom.
00:00 In the previous lesson, I introduced you to writing SQL to manipulate a relational database. In this lesson, I’ll show you how to use SQLAlchemy to accomplish the same things in your Python code.
00:13 SQLAlchemy is a popular third-party library for accessing and managing databases. It is made up of two parts: Core and the ORM. The core is a Python abstraction of SQL, where the ORM uses object-oriented modeling to represent database rows.
00:31 One of the advantages of using a library like SQLAlchemy is it abstracts away the connection mechanisms. You initialize the SQLAlchemy engine the same way, regardless of what kind of database you’re connecting to.
00:43 This means you can use any number of the supported databases in the same fashion. This includes SQLite, Postgresql, MySQL, Oracle, MS-SQL server, and many more.
00:58 Let’s go into the REPL and turn some lead into gold. The first thing you have to do is connect to your database. SQLAlchemy does this through an abstraction called an engine.
01:10 Let me import in the engines factory … and now I’ll create an engine.
01:31
The URL-style parameter here indicates what I’m connecting to. Instead of http
, you specify the type of database. For me, that’s sqlite
.
01:41
The path part indicates where the database is. For a database that used a server, this would have a hostname and a port. Because SQLite uses files, this is a file reference. I’m connecting to the books.db
file in the data/
subdirectory. The engine factory uses lazy loading, so the connection hasn’t been established yet.
02:03
The echo
parameter turns on SQL echoing. Everything I do through this connection will print out the SQL equivalent. It’ll be a bit noisy, but for our purposes, you’ll be able to see exactly what is going on.
02:17 I mentioned in the first lesson that SQLAlchemy is going through an adaptation phase heading for the 2.0 release. The 1.4 version supports both the old dialect and the new dialect of SQLAlchemy.
02:29
The future=True
argument here says to enforce the use of the new mechanism. That way, when 2.0 is released, your code will work with it with fewer changes. Let’s use this engine to connect.
02:45
Everything from here on will be done through my conn
object. Now I’ll write a query. For this lesson, I’ll be solely using the text
query object.
02:55 This allows you to write raw SQL directly to the engine.
03:07
There you go. The .execute()
method on the connection object runs a query. The text
object takes a string as an argument containing raw SQL.
03:16
And in this case, I’m running a select
statement. The method returns a result
object. The muted logger info here is because I initialized the engine with echo=True
.
03:27
You can see that it created a transaction—that’s the BEGIN
—then did a select
. The database was able to run the query in 0.00024 seconds. Pretty snappy. Let’s look at the results.
03:46
result.all()
returns a list of tuples, and the database I’m using was populated with two authors in the previous lesson, good old Isaac and Pearl. All right, let’s make some changes now. To do an insert, I’ll need to execute a query.
04:14
Like before, I’m using a text object, so I can put in raw SQL. There’s something new here though. Notice that in the VALUES
clause of the INSERT
, I’m using :fn
and :ln
. The colon indicates that these are placeholder values.
04:32
SQLAlchemy sees these and knows to populate them with the parameters that you passed to text
(). This next line is those parameters.
04:49
I’m passing in a list of dictionaries where each dictionary has an "fn"
and "ln"
key. These get mapped to the :fn
and :ln
placeholders in the query.
05:00
By passing in a list of dictionaries, I am able to insert more than one author at a time. Let me just close the call … and you see a few things. First off, you get the INSERT
debug info showing "Tom"
and "Stephen"
—I speak as if they’re close friends—being inserted. Next, because I didn’t capture the result of the .execute()
, The REPL shows a result
object.
05:25 If I were writing real code, I’d check to make sure nothing went wrong here. Let’s query for our authors again.
05:39
And there you go. There are now four authors in the database. Remember that BEGIN
statement several queries ago? I’m still inside of the same transaction.
05:49
These all get grouped together as part of the database. Calling .commit()
closes the transaction off and makes all the changes permanent. So what are transactions good for? Well, if you get partway through a group of queries, you can undo all of them by rolling back the transaction. To demonstrate that, I’ll start by inserting another author.
06:32
Same as you’ve seen before … And now there’s 'Not', 'Anauthor'
cluttering up my data. Calling .rollback()
instead of .commit()
… and everything has been undone to the point of the last commit.
06:55 I can show that by running the query again …
07:05
and 'Not', 'Anauthor'
is gone. Handy that, you can undo. Let’s play with some query results some more.
07:32
but this time, instead of calling .all()
, I’ll iterate over the result
object. Each item in the result
object iterator corresponds to a row in the database, or it does this time because that was the kind of select
I did.
07:47
I can access the columns in a result by using the columns’ names: row.last_name
and row.first_name
, for example. Note that the result
object is temporal. Accessing the items in it consumes them. That’s why calling .all()
now shows an empty list.
08:08 This might take a little getting used to, but it’s actually a good thing because it means you can iterate partway through a result, do something in your code, and then go back and not have to remember where you were. This becomes particularly important when you start to get into things like pagination and larger sets of data.
08:35 Although I like to think of the results of a SQL query as being a row in the database, this is a faulty mental abstraction. Oftentimes that is exactly what you’re getting back, but really you’re creating a dataset with specific names.
08:50 If you use a join, you’ll end up with parts of data from different tables.
09:06 Here, I’ve been specific about the fields I was interested in, so when I iterate through the results, I can treat the columns coming back as parts of a tuple.
09:30 I can also use indexes into those same tuples to get at their values. Okay, one more one more time.
09:52
The .mappings()
method on the result
object returns the rows as dictionaries instead of tuples. This can be useful if you wanted a dictionary format anyhow—for example, if you were going to serialize this data to JSON.
10:07
Just like with the INSERT
statement, you can parameterize a SELECT
statement.
10:27
Here, the WHERE
is saying that I’m looking for authors whose last name is greater than capital "C"
. Because I used a parameter to define this, I could actually have passed in the dictionary "ln":"C"
at runtime.
10:46
There’s the result of the query. Don’t forget that this is doing string comparison. Something that starts with capital "C"
, like "Clancy"
, and is bigger than a single letter, like "Clancy"
, is actually greater than just capital "C"
.
11:02 Capital C, capital C. I’ve got a Nine Inch Nails song going through my head. Anyhow…
11:17
Since you specify the paramaters to a statement as part of the .execute()
method, that means you can pre-prepare a statement and reuse it. Here, I’ve kept the text object in a variable called stmt
.
11:30 This is what it looks like. Not particularly helpful. But I can take this statement and bind parameters to it. Now I can execute …
11:50 and you can see that this was just another way of getting to the same place, but now I can reuse that statement and bind new parameters.
12:09
Capital "B"
it is. I used to stand for something. Trent Reznor’s an angry, angry dude. What was I saying? All done here. So like a good boy, I’ll clean up my connections by calling .close()
.
12:28 I have a very important warning about writing raw SQL. You might be tempted to think, Hey, this is all text. I can just concatenate or use f-strings! And that would be a very bad idea. Doing it this way leaves you open to what’s called an SQL injection attack.
12:45
A clever person might pass you a last_name
that would break your code or worse, do malicious things. You may have noticed that all the SQL calls have ended with a semicolon (;
).
12:55
If I pass in a last_name
here that has a semicolon and then, say, a DELETE
statement, you’d be running both your INSERT
and my DELETE
, and that’s probably not what you want. Parameters solve this problem.
13:08 They properly escape anything you pass in, so you should always use parameters. SQL injection has been around for a long time. I remember finding a vulnerability in a forms product in 1996.
13:21 You’d think us coders would learn, but no, the wrong way is still pretty common. The OWASP Top 10 vulnerabilities list dropped SQL injection to spot three in 2021, which is down from spot one in previous years.
13:36 I’m not sure whether that’s because the coding community is finally waking up to these kinds of exploits or whether it’s just that spots one and two actually just surpassed it.
13:46 Parameters only, no dynamic strings! Got it? Good. And if you want a laugh on this, go look up xkcd and and Little Bobby Tables to see an example.
14:00
SQL’s core has two ways of doing queries. You’ve seen the direct use of SQL through the text
object. Next up, you’ll see the functions and objects that abstract this away.
Christopher Trudeau RP Team on April 25, 2024
Hi Emmanuel,
Sorry to hear you’re having troubles. I’d need to know what exception is happening to be able to help you further. Please paste the create, connection, and full traceback from your REPL into a post so I can see all of it.
Unfortunately, SQLAlchemy’s documentation could be a bit better, and it doesn’t contain a list of what this method can throw, so I’d only be guessing. As it is the connect statement that is having problems, I’d assume it can’t find the database you’re trying to connect to. The most likely problem is the URL for the engine is malformed.
As the create_engine()
call can be used to create databases as well as specify existing ones, you don’t tend to get errors at that call, but at the connection call that follows after.
Become a Member to join the conversation.
Emmanuel Okonya on April 25, 2024
The code started failing for me at the point where I typed:
It returned a Traceback (most recent call last): error which may or may not be related to the version of sqlAlchemy. Not sure how to happily move forward with this one :-)