Grouping the Data to Calculate Final Scores

Using pandas to Make a Gradebook in Python Cesar Aguilar 07:41

00:00 To compute the final grade, let’s first define the weights that we’ll use in the final computation. We’ll use a Series for this, and we’re going to name the keys in the dictionary that we’ll pass in.

00:17 These are going to serve as the index for the Series. We want to name them exactly as we’ve named the columns in the DataFrame that store the exam scores, the homework score, and the quiz score.

00:31 So we’ll have "Exam 1 Score" and the percentage there was 5%. Then we’ll also have "Exam 2 Score", and in this case, it was 10%. Then "Exam 3 Score", this was 15%.

00:54 Then the "Quiz Score",

00:57 this was going to be 30%. Then the "Homework Score"

01:03 was 40%.
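As a rough sketch of what was just described, the weights could be written like this. The variable name `weights` is an assumption here; the column names and percentages are the ones mentioned above.

```python
import pandas as pd

# Keys become the index of the Series. They must match the
# score column names in the DataFrame exactly.
weights = pd.Series(
    {
        "Exam 1 Score": 0.05,
        "Exam 2 Score": 0.10,
        "Exam 3 Score": 0.15,
        "Quiz Score": 0.30,
        "Homework Score": 0.40,
    }
)

print(weights)
```

The five weights add up to 1, so a student with perfect scores in every category ends up with a final score of 1.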

01:06 And so now what we need to do is from the main DataFrame pull out those five columns, multiply them by the weights—each column—and then just add up the columns. All right, so let’s run that.

01:22 And what we’re saying is from the final DataFrame, let’s pull out the columns named in the index of the weights,

01:32 because these are the columns that we need. Exam 1, 2, 3, quiz score, and homework score. Then we simply multiply them by the weights, and so these are just going to be these numbers that are less than 1.

01:48 And we want to sum each row, and so let’s sum with axis=1. And that would be it! So this would give us the final score for each individual student.
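Here is a minimal sketch of that computation on made-up data. The name `final_data`, the two students, the `Section` column, and all score values are assumptions for illustration; only the column names and weights come from the steps above.

```python
import pandas as pd

# Hypothetical DataFrame: scores are stored as fractions between 0 and 1.
final_data = pd.DataFrame(
    {
        "Section": [1, 2],  # extra column that should not be weighted
        "Exam 1 Score": [0.80, 0.60],
        "Exam 2 Score": [0.80, 0.70],
        "Exam 3 Score": [0.80, 0.50],
        "Quiz Score": [0.80, 0.90],
        "Homework Score": [0.80, 1.00],
    },
    index=["student_a", "student_b"],
)

weights = pd.Series(
    {
        "Exam 1 Score": 0.05,
        "Exam 2 Score": 0.10,
        "Exam 3 Score": 0.15,
        "Quiz Score": 0.30,
        "Homework Score": 0.40,
    }
)

# Select only the weighted columns, multiply each column by its weight
# (pandas aligns the Series index with the column names), then sum each row.
final_data["Final Score"] = (final_data[weights.index] * weights).sum(axis=1)

print(final_data["Final Score"])
```

Because the multiplication aligns on labels, the order of the columns in the DataFrame does not need to match the order of the weights.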

02:05 Let’s store this in the final DataFrame. We’ll introduce a new column and this will simply be, say, 'Final Score'. All right. Now, likely we’re going to want to enter these scores as whole numbers and not as decimal numbers less than one.

02:26 So what we’ll do is we’ll take this 'Final Score' column, multiply it by 100—so we’ll take each of those scores and multiply by 100—and then finally we get to use the NumPy module.

02:41 We’ll be rounding those up, and so we’ll have 75.0, 80.0, and so on. Let’s call this the 'Ceiling Score'.

02:54 This is the rounded score.
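The transcript doesn’t name the NumPy function, so this sketch assumes np.ceil(), which matches "rounding those up" and the 'Ceiling Score' name. The sample values are made up.

```python
import numpy as np
import pandas as pd

# Hypothetical final scores as fractions between 0 and 1.
final_data = pd.DataFrame({"Final Score": [0.7421, 0.5, 0.9055]})

# Scale to a 0-100 range, then round up to the next whole number.
# np.ceil() returns floats, which is why the results look like 75.0.
final_data["Ceiling Score"] = np.ceil(final_data["Final Score"] * 100)

print(final_data)
```

One thing to watch for: floating-point arithmetic can leave a value a hair above a whole number, and np.ceil() would then push it up to the next integer.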

02:58 Now, it’s likely that once we have these ceiling scores, or these rounded scores, we need to compute a letter grade, which will actually be the final data that we’re going to have to input in some sort of system at, say, the registrar’s office.

03:12 What we’ll do is we’ll simply define a function that will compute the letter grade based on the actual numerical grade. Let’s define a utility function. We’ll call it, say, get_letter_grade().

03:27 It’ll take in a score or a grade and then simply if the score is greater than or equal to, say, 90, then this is going to be an 'A'.

03:40 If it’s greater than or equal to, say, 80, then we’ll return a 'B',

03:49 and then we’ll sort of keep doing this filtering down of the score. If it’s at least 70, then this will be a 'C',

04:00 greater than or equal to 60, this will be a 'D', and then lastly, anything under a 60 is going to be an 'F'.

04:12 Hopefully, we don’t have too many F’s in our course. All right, so score comes in. If it’s greater than or equal to 90, that’s an 'A'.

04:20 If it’s greater than or equal to 80, that’s a 'B', and so on all the way down to an 'F'. So, we’ll define this utility function.
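A sketch of that utility function, using the cutoffs just listed, might look like this. The parameter name `score` is an assumption.

```python
def get_letter_grade(score):
    """Return a letter grade for a numerical score on a 0-100 scale."""
    if score >= 90:
        return "A"
    elif score >= 80:
        return "B"
    elif score >= 70:
        return "C"
    elif score >= 60:
        return "D"
    else:
        return "F"


print(get_letter_grade(85))
```

The checks run from the highest cutoff down, so each branch only needs a lower bound.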

04:27 Then in the 'Ceiling Score' column,

04:33 we’re going to use the .map() method. We can apply this function to every value in this Series. If we take a look at this, we’re going to be getting the letter grades for each individual student.
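To see .map() in action, here is a small sketch on made-up ceiling scores. The helper is repeated so the snippet runs on its own, and the name `letter_grades` and the sample values are assumptions.

```python
import pandas as pd


def get_letter_grade(score):
    """Return a letter grade for a numerical score on a 0-100 scale."""
    if score >= 90:
        return "A"
    elif score >= 80:
        return "B"
    elif score >= 70:
        return "C"
    elif score >= 60:
        return "D"
    else:
        return "F"


# Hypothetical 'Ceiling Score' column.
ceiling_score = pd.Series([91.0, 85.0, 70.0, 59.0], name="Ceiling Score")

# .map() calls the function once per value and returns a new Series.
letter_grades = ceiling_score.map(get_letter_grade)

print(letter_grades)
```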

05:46 The data for this is this newly computed letter_grade Series, and the categories are just the letters. The lowest grade is going to be 'F' and then the next is 'D', 'C', 'B', and 'A'.

06:31 There is a keyword argument called ordered, and the default is False. In this case, we actually want it to be True because we want the values of this Series to be interpreted as ordered.

06:47 So let’s run that, and then if we just take a look at the Series

06:54 with all of the grades, we see that the data type is category, we’ve got all the letter grades, and the lowest grade is 'F', and then 'D', 'C', 'B', and 'A'. So for the purposes of computing the final grade for each student, this pretty much does it.
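The ordered categories described here can be sketched with pd.Categorical. The sample letter grades and the name `grades` are assumptions; the category order and ordered=True come from the steps above.

```python
import pandas as pd

# Hypothetical letter grades, as produced by mapping a grading function.
letter_grades = pd.Series(["B", "A", "F", "C", "D", "A"])

# Categories are listed from lowest to highest. ordered=True tells
# pandas that this order is meaningful (F < D < C < B < A).
grades = pd.Series(
    pd.Categorical(
        letter_grades,
        categories=["F", "D", "C", "B", "A"],
        ordered=True,
    )
)

print(grades)

# With an ordered categorical, comparisons and min/max follow grade order.
print(grades.min(), grades.max())
print((grades >= "C").sum())
```

Without ordered=True, the values would still be stored as categories, but comparisons like `grades >= "C"` would not be allowed.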

07:12 And then maybe one last thing that we want to do, you know, once we’ve computed the final grades, we’re probably going to have to upload the data somewhere.

07:22 We’ll have to do this, possibly, via section. We have all of these students and there are three sections, and so maybe the last thing that we want to do is to create CSV files containing the grades for each of the individual sections.

07:37 We’ll do that next.
