Simple Polynomial Regression: Code

Starting With Linear Regression in Python Cesar Aguilar 06:35

Here’s the data that you’ll need to implement Simple Polynomial Regression in Python:

import numpy as np

x = np.array([5, 15, 25, 35, 45, 55]).reshape((-1, 1))
y = np.array([15, 11, 2, 8, 25, 32])

00:33 and the class is called PolynomialFeatures.

00:38 And the main purpose of the class PolynomialFeatures is that it’s going to help us generate that feature matrix, which consists of all the polynomial combinations of the input variables up to the degree that we’ll specify.

00:56 So it’s going to generate all of the mixed terms and all of the pure terms that involve just each of the individual components in the input variable.
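To make the mixed and pure terms concrete, here is a minimal sketch with two input variables. The sample row `[2, 3]` and the names `X_two` and `features` are made up for illustration and are not part of the lesson's data.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Hypothetical sample with two input variables, x1 = 2 and x2 = 3.
X_two = np.array([[2, 3]])

# Degree 2 without the bias column produces, in order:
# x1, x2 (the original inputs), x1^2 (pure), x1*x2 (mixed), x2^2 (pure).
features = PolynomialFeatures(degree=2, include_bias=False).fit_transform(X_two)
print(features)
```

With a single input column, as in this lesson, there is nothing to mix, so only the pure terms remain.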

01:05 Let’s first run that. All right. It looks like this time, I forgot an s. So it’s called PolynomialFeatures.

01:14 And so now I’m going to paste some test data. As before, you can use the data that’s provided in the notes that accompany the video lesson.

01:21 So now that extra step: we want to create an instance of this PolynomialFeatures class. We’ll call it transformer. And this is from the PolynomialFeatures class, and we’re going to be using a quadratic regression model.

01:39 So we’ll pass in the first argument. It’s called degree, and it defaults to 2 if you leave it out. And then there is a keyword argument called include_bias.

01:50 Now include_bias is True by default, and we want to set it to False. Setting it to False generates that x matrix so that it doesn’t contain that column of all ones.

02:08 And sometimes we want to have that, depending on how we just formulate the regression model. So there we’ve got this transformer. Let me just run, and then you’ll see what this does.
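As a rough sketch of the step just described, the transformer can be created and its settings inspected with `get_params()`, which every scikit-learn estimator provides. The variable names `params` and `defaults` are only illustrative.

```python
from sklearn.preprocessing import PolynomialFeatures

# Quadratic transformer without the column of ones, as in the lesson.
transformer = PolynomialFeatures(degree=2, include_bias=False)

# get_params() reports the settings the instance was created with.
params = transformer.get_params()
print(params["degree"], params["include_bias"])

# With no arguments, scikit-learn uses degree=2 and include_bias=True.
defaults = PolynomialFeatures().get_params()
print(defaults["degree"], defaults["include_bias"])
```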

02:19 So we want to generate that feature matrix. It’s that matrix with the quadratic terms, all the way up to the quadratic terms. So let me just run this so you can see what this is. All right.

02:30 So I forgot my r this time. Run that again, and let’s run this again. So there you go. What this is is it’s taking our input array, right? Let’s print this out. This is two-dimensional. We’ve got our inputs, the observations. We’ve got six of them.

02:52 And then what this first column is, is exactly that x. And then these are just the quadratic terms, so this is five squared, fifteen squared, twenty-five squared, and so on. Now, we only have one input, one scalar, so the input is scalar, only has one column, one dimension.

03:11 We’ve got six different observations. We’re only going to have here x squared. We’ll take a look at how, when we have multiple inputs, then we’re going to have those mixed terms as well.
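The transformation described here can be sketched as follows, using the lesson's data. The name `x_` for the transformed array is an assumption made for this example.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# The six observations from the lesson, as a two-dimensional column.
x = np.array([5, 15, 25, 35, 45, 55]).reshape((-1, 1))

transformer = PolynomialFeatures(degree=2, include_bias=False)
x_ = transformer.fit_transform(x)  # x_ is an assumed name for the result

# Six rows (observations) and two columns: x and x squared.
print(x_.shape)
print(x_)
```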

03:22 But for now that is our x matrix. So what we can do is let’s just call this x̄. And when we don’t include the bias, then we don’t get that column of ones.

03:33 So if I keep this as its default value, True—and let’s print it—

03:40 then you see that we get all the ones here. Now, depending on how you set things up, you may not want those ones there. So we’re going to set this to False, and then we’ll just keep it like we had before. So now we’ve got everything.
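For comparison, here is a small sketch of what the default `include_bias=True` produces on the same data. The name `x_with_bias` is illustrative only.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

x = np.array([5, 15, 25, 35, 45, 55]).reshape((-1, 1))

# Default include_bias=True: the columns are 1, x, and x squared.
x_with_bias = PolynomialFeatures(degree=2, include_bias=True).fit_transform(x)
print(x_with_bias)

# The first column is the constant (bias) column of ones.
print(x_with_bias[:, 0])
```

One reason to leave the ones out here is that `LinearRegression` fits its own intercept by default (`fit_intercept=True`), so the extra constant column would be redundant.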

03:56 So the idea with the quadratic term and doing a polynomial regression is that we’re doing a preprocessing of the observations, of the input observations: 5, 15, 25, and so on. These were all the actual observations.

04:10 So if you’re modeling some phenomenon, maybe you’re wanting to predict the values of home prices, and your input is maybe just, say, distance to schools, right?

04:21 And if you also wanted to incorporate a quadratic term, you’d otherwise have to manually create these values. And that’s exactly what the .fit_transform() method on this instance of transformer will do for you. So now we’ve got everything we need to build the model just like before.
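To illustrate what "manually create these values" would mean, this sketch builds the squared column by hand with NumPy and checks it against `.fit_transform()`. The names `x_manual` and `x_auto` are assumptions for the example.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

x = np.array([5, 15, 25, 35, 45, 55]).reshape((-1, 1))

# Manual preprocessing: place x and x squared side by side.
x_manual = np.hstack([x, x ** 2])

# The same preprocessing done by the transformer.
x_auto = PolynomialFeatures(degree=2, include_bias=False).fit_transform(x)

# Both approaches should produce the same feature matrix.
print(np.allclose(x_manual, x_auto))
```

For one input and degree 2 the manual version is short, but the transformer scales to more inputs and higher degrees without extra code.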

04:40 So we’re going to be calling LinearRegression, and we’re going to fit the model using this transformed x that contains up to the quadratic terms.

04:51 And then we can take a look at all of the

04:55 coefficients. So we’ve got the model.intercept_,

04:59 and we also have the model.coef_ as before.
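Putting the steps so far together, a minimal sketch of fitting the model on the transformed inputs might look like this. The name `x_` for the transformed array is an assumption.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

x = np.array([5, 15, 25, 35, 45, 55]).reshape((-1, 1))
y = np.array([15, 11, 2, 8, 25, 32])

# Preprocess the inputs so they contain x and x squared.
x_ = PolynomialFeatures(degree=2, include_bias=False).fit_transform(x)

# Fit an ordinary linear regression on the two transformed features.
model = LinearRegression().fit(x_, y)

print(model.intercept_)  # scalar intercept b0
print(model.coef_)       # array with two entries: b1 (for x) and b2 (for x squared)
```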

05:09 And we can also take a look at the R² value. So we call model.score(), and here we have to pass in the transformed x along with y. All right, so that’s a pretty good R² value. And then we can also take a look at, say, the estimated y values for our observed inputs.
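As a sketch of what `model.score()` reports, the R² value can also be computed by hand as one minus the ratio of the residual sum of squares to the total sum of squares. The names `r_sq`, `r_sq_manual`, `ss_res`, and `ss_tot` are illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

x = np.array([5, 15, 25, 35, 45, 55]).reshape((-1, 1))
y = np.array([15, 11, 2, 8, 25, 32])

x_ = PolynomialFeatures(degree=2, include_bias=False).fit_transform(x)
model = LinearRegression().fit(x_, y)

# R squared as reported by scikit-learn, using the transformed inputs.
r_sq = model.score(x_, y)

# The same quantity computed from its definition.
y_pred = model.predict(x_)
ss_res = np.sum((y - y_pred) ** 2)    # residual sum of squares
ss_tot = np.sum((y - y.mean()) ** 2)  # total sum of squares
r_sq_manual = 1 - ss_res / ss_tot

print(r_sq, r_sq_manual)
```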

05:30 So let’s take a look at the

05:34 estimated values for x̄. Now here, we do have to pass in the transformed x values, because remember, we’re using a linear model that contains two features, two components, for the input. We’ve converted the quadratic term into an extra independent variable, and so we really have two features for our input.
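The point about two features can be sketched as follows: predictions from the transformed inputs match the explicit formula b0 + b1·x + b2·x², while the raw one-column `x` is rejected by the fitted model. The names `x_`, `y_pred`, and `y_manual` are assumptions for this example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

x = np.array([5, 15, 25, 35, 45, 55]).reshape((-1, 1))
y = np.array([15, 11, 2, 8, 25, 32])

x_ = PolynomialFeatures(degree=2, include_bias=False).fit_transform(x)
model = LinearRegression().fit(x_, y)

# Predictions require the transformed inputs (two features per row).
y_pred = model.predict(x_)

# Equivalent computation: intercept plus the weighted sum of x and x squared.
y_manual = model.intercept_ + x_ @ model.coef_
print(np.allclose(y_pred, y_manual))

# Passing the raw single-column x does not match the two fitted features.
try:
    model.predict(x)
except ValueError as err:
    print("Raw x rejected:", err)
```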
