“Serena” Quasar’s sparcl ID: 719e7ea6-8b79-11ef-be93-525400f334e1
Fetching Quasar Spectral Data
00:00 In the previous lesson, I introduced you to Astro Data Lab’s Sparcl client to search the vast reaches of outer space. In this lesson, I’ll show you how to take a sparcl ID and get the corresponding spectral data.
00:13 The query in the previous lesson let you look for quasars. Once you have one, you can get more information from Sparcl using the sparcl ID that goes with it.
00:23 Using this ID, you call the sparcl client’s retrieve() method to get the wavelength and flux pairs. And once you have it, you can save it to a CSV file, which is how I generated the one I used before.
00:35 Let’s go query space. I am going to use Polars to manage the data, so I have to import that first, and of course I’ll need the client once more to run a query.
00:55 This is the sparcl ID for Serena. Remember, I found it by running the search command in the previous lesson, and when I say I, I mean my astronomer did it for me and picked a good example.
01:06 Doesn’t everyone have an astronomer on call? It’s the latest trend.
01:14 Connected to the client again, and once more got a deprecation warning, which I don’t care about.
01:22 And the retrieve() method is what you use to get the spectrum data. The first argument is a list of sparcl IDs. I’m only after Serena, but you can fetch more than one at a time if you like.
01:38 The include argument says which fields you want from the dataset. This is like the outfields argument from before. Nobody said their interface had to be consistent.
01:47 I’m including the sparcl_ID to make sure the query was for what I wanted. It’s more important if you’re querying several at a time. I’m also asking for the redshift noted in the database, which, as I said before, might or might not be correct.
02:02 And more importantly, the data I’m after, which is the wavelength and flux data that I’m going to stick in the CSV. Like before, this call returns a wrapper with records inside of it.
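Here is a minimal sketch of the retrieve() call described above. The function name fetch_spectrum and the constants are made up for illustration, and the lowercase field name "sparcl_id" is an assumption about how the service spells it. The client is passed in as an argument, so the function works with any object that has a compatible retrieve() method.

```python
SERENA_ID = "719e7ea6-8b79-11ef-be93-525400f334e1"

# Fields requested through the include argument.
# Assumption: the ID field is spelled "sparcl_id" in the service.
FIELDS = ["sparcl_id", "redshift", "wavelength", "flux"]


def fetch_spectrum(client, sparcl_id):
    """Return the single record for one sparcl ID.

    `client` is assumed to be a connected sparcl client, or anything
    else that offers a compatible retrieve() method.
    """
    # The first argument is a list of IDs, even when asking for just one.
    results = client.retrieve([sparcl_id], include=FIELDS)

    # The wrapper holds the records as a list of dictionaries.
    records = results.records
    if len(records) != 1:
        raise ValueError(f"expected 1 record, got {len(records)}")
    return records[0]


# Usage, assuming the sparclclient package is installed and the
# service is reachable:
# from sparcl.client import SparclClient
# record = fetch_spectrum(SparclClient(), SERENA_ID)
# print(record["redshift"])
```

Checking the record count up front is optional, but it catches a mistyped ID before you try to build a DataFrame from nothing.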
02:14 As I only queried one quasar, I’m only getting one record back.
02:21 And similar to the other call, records is a list of dictionaries. They may not be consistent with argument names, but they’re not bad with response structures.
02:29 I want to use Polars to save the appropriate data into a CSV file. So first I need to get the data from the records into a DataFrame.
02:41 Quick little shortcut for the record itself,
02:47 and not that I’ll be using it here, but remember this number. Once the dashboard is complete, you can compare your computed value with what the database thinks the quasar’s redshift is.
02:58 Let me create the DataFrame.
03:02 You can initialize the columns in a DataFrame by passing in a dictionary.
03:12 The first column is the wavelength. A few lines back on the screen where I printed out the record, you can see that what came back for wavelength was an array object.
03:21 That’s actually a NumPy array, which Polars knows how to turn into a column, so I just have to reference it here.
03:32 Same idea for flux, which will be my second column. And there I go. Same 7,781 rows as the CSV file. Imagine that. The last step is to call the DataFrame’s write_csv() method to store this out.
03:55 I’ve been sticking with CSV to keep things simple, but note that CSV doesn’t store data types. Everything in a CSV file is text. That said, Polars automatically converts numeric columns to the right type if it can detect that when reading the CSV.
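To see the "everything is text" point in isolation, here is a small sketch that uses Python's standard library csv module instead of Polars, so that no type detection happens behind the scenes. The values are made up for illustration.

```python
import csv
import io

# Write one header row and one row of floats to an in-memory CSV.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["wavelength", "flux"])
writer.writerow([3600.0, 1.25])

# Read it back without any type inference.
buffer.seek(0)
header, first = list(csv.reader(buffer))

# The reader hands back strings, not numbers.
raw_types = [type(value).__name__ for value in first]

# Converting back to floats is up to whoever reads the file.
wavelength, flux = (float(value) for value in first)
```

A reader like Polars does that conversion for you when it can tell that a column is numeric, which is why the round trip usually feels seamless.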
04:11 For more complex situations, you should consider using a file format that does store the data type. Excel is a common choice. Polars supports it with the addition of a third-party library, which is why I kept it simple for our case. There were already enough libraries to consider for this course.
04:30 You’re probably itching to get to the dashboard, but you’re still missing some data. In the next lesson, I’ll show you how to get the spectral lines that you’ll overlay on the graph.