Simple Polynomial Regression: Code
Here’s the data that you’ll need to implement Simple Polynomial Regression in Python:
import numpy as np
x = np.array([5, 15, 25, 35, 45, 55]).reshape((-1, 1))
y = np.array([15, 11, 2, 8, 25, 32])
00:00
To implement polynomial regression in Python using the sklearn module, we’ll start off as we’ve done before. We’re going to import NumPy, and then we’re going to import the LinearRegression class from the sklearn.linear_model module. Then, to generate the polynomial terms that we’ll need to fit the model, we’re going to import a new class from sklearn, more particularly from the preprocessing submodule,
00:33
and the class is called PolynomialFeatures.
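The imports described so far can be sketched as follows. This is a minimal sketch that assumes NumPy and scikit-learn are installed; the np alias is the usual convention and matches the data snippet in the notes.

```python
# NumPy for the arrays, scikit-learn for the model and the feature expansion
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Confirm that the class name is spelled with a trailing "s"
print(PolynomialFeatures.__name__)
```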
00:38
And the main purpose of the PolynomialFeatures class is to generate the feature matrix that consists of all polynomial combinations of the input variables, up to the degree that we specify.
00:56 So it’s going to generate all of the mixed terms and all of the pure terms that involve just each of the individual components of the input variable.
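To make the mixed and pure terms concrete, here is a small sketch with a made-up observation that has two inputs. The values 2 and 3 and the names sample and expanded are illustrative assumptions, not part of the lesson data.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Hypothetical single observation with two inputs (a, b) = (2, 3)
sample = np.array([[2, 3]])

# Expand up to degree 2, leaving out the column of ones
expanded = PolynomialFeatures(degree=2, include_bias=False).fit_transform(sample)

# Columns are ordered a, b, a^2, a*b, b^2:
# a^2 and b^2 are the pure terms, a*b is the mixed term
print(expanded)
```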
01:05
Let’s first run that. All right. It looks like this time, I forgot an s. So it’s called PolynomialFeatures.
01:14 And so now I’m going to paste some test data. As before, you can use the data that’s provided in the notes that accompany the video lesson.
01:21
So now that extra step: we want to create an instance of this PolynomialFeatures class. We’ll call it transformer. And we’re going to be using a quadratic regression model.
01:39
So we’ll pass in the first argument, which is called degree. It defaults to 2, but we’ll set it explicitly. And then there is a keyword argument called include_bias.
01:50
Now include_bias is True by default, and we want to set it to False. Setting it to False generates the x matrix so that it doesn’t contain that column of all ones.
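Creating the transformer as described might look like this. It is a sketch rather than the exact code typed in the video; degree=2 is inferred from the lesson’s use of a quadratic model.

```python
from sklearn.preprocessing import PolynomialFeatures

# degree=2 gives a quadratic model; include_bias=False drops the column of ones
transformer = PolynomialFeatures(degree=2, include_bias=False)

# The settings are stored on the instance and can be inspected
print(transformer.degree, transformer.include_bias)
```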
02:08
And sometimes we want to have that column, depending on how we formulate the regression model. So there we’ve got this transformer. Let me just run this, and then you’ll see what it does.
02:19 So we want to generate that feature matrix. It’s that matrix with the quadratic terms, all the way up to the quadratic terms. So let me just run this so you can see what this is. All right.
02:30 So I forgot my r this time. Let’s run that again. So there you go. What this does is take our input array, right? Let’s print this out. This is two-dimensional. We’ve got our inputs, the observations. We’ve got six of them.
02:52 And then what this first column is, is exactly that x. And then these are just the quadratic terms, so this is five squared, fifteen squared, twenty-five squared, and so on. Now, we only have one input, one scalar, so the input is scalar, only has one column, one dimension.
03:11 We’ve got six different observations. We’re only going to have here x squared. We’ll take a look at how, when we have multiple inputs, then we’re going to have those mixed terms as well.
03:22 But for now that is our x matrix. So what we can do is let’s just call this x̄. And when we don’t include the bias, then we don’t get that column of ones.
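The transformation step can be sketched like this. The variable name x_ is an assumption standing in for the transformed array; the data is the one given in the notes.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

x = np.array([5, 15, 25, 35, 45, 55]).reshape((-1, 1))

transformer = PolynomialFeatures(degree=2, include_bias=False)

# x_ (assumed name) has one row per observation and two columns: x and x squared
x_ = transformer.fit_transform(x)
print(x_.shape)
print(x_)
```

The first column reproduces the original inputs and the second holds their squares, so the first row is 5 and 25.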
03:33
So if I keep this at its default value, True, and print it,
03:40
then you see that we get all the ones here. Now, depending on how you set things up, you may not want those ones there. So we’re going to set this to False, and then we’ll just keep it like we had before. So now we’ve got everything.
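The effect of include_bias can be compared side by side. This is a sketch; the names with_bias and without_bias are illustrative assumptions.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

x = np.array([5, 15, 25, 35, 45, 55]).reshape((-1, 1))

# Default include_bias=True: a leading column of ones is added
with_bias = PolynomialFeatures(degree=2).fit_transform(x)

# include_bias=False: only x and x squared remain
without_bias = PolynomialFeatures(degree=2, include_bias=False).fit_transform(x)

print(with_bias.shape, without_bias.shape)
print(with_bias[:, 0])  # the column of ones
```

LinearRegression fits its own intercept by default, which is one reason the column of ones is not needed here.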
03:56
So the idea with the quadratic term and doing a polynomial regression is that we’re preprocessing the input observations: 5, 15, 25, and so on. These were all the actual observations.
04:10 So if you’re modeling some phenomenon, maybe you’re wanting to predict the values of home prices, and your input is maybe just, say, distance to schools, right?
04:21
And if you also wanted to incorporate a quadratic term into the model, you’d have to manually create these squared values. And that’s exactly what the .fit_transform() method on this instance of transformer will do for you. So now we’ve got everything we need to build the model just like before.
04:40
So we’re going to be calling LinearRegression, and we’re going to fit the model using this transformed x that contains up to the quadratic terms.
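Fitting the model on the transformed inputs might look like the following sketch. The names x_ and model are assumptions chosen to match the attribute names mentioned in the lesson.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

x = np.array([5, 15, 25, 35, 45, 55]).reshape((-1, 1))
y = np.array([15, 11, 2, 8, 25, 32])

# Expand x into the two features x and x squared
x_ = PolynomialFeatures(degree=2, include_bias=False).fit_transform(x)

# An ordinary linear regression, fitted on the expanded features
model = LinearRegression().fit(x_, y)
print(model.n_features_in_)
```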
04:51 And then we can take a look at all of the
04:55
coefficients. So we’ve got the model.intercept_,
04:59
and we also have the model.coef_ as before.
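Inspecting the fitted parameters can be sketched as below. The model has one intercept and two coefficients, one for x and one for x squared; the variable names are assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

x = np.array([5, 15, 25, 35, 45, 55]).reshape((-1, 1))
y = np.array([15, 11, 2, 8, 25, 32])

x_ = PolynomialFeatures(degree=2, include_bias=False).fit_transform(x)
model = LinearRegression().fit(x_, y)

# intercept_ is a scalar; coef_ holds the weights for x and x squared, in that order
print("intercept:", model.intercept_)
print("coefficients:", model.coef_)
```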
05:09
And we can also take a look at the R² value. So we call model.score(), and here we have to pass in the transformed x along with y. All right, so that’s a pretty good R² value. And then we can also take a look at, say, the estimated y values for our observed inputs.
05:30 So let’s take a look at the
05:34 estimated values for x̄. Now here, we do have to pass in the transformed x values, because remember we’re using a linear model that contains two features, two components, for the input because we’ve done this using sort of this idea that we’re converting this quadratic term into an extra independent variable, and so we really have two features for our input.
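Scoring and predicting both take the transformed inputs, as this sketch shows. The names r_sq and y_est are assumptions; the last lines check by hand that the prediction is just the linear model applied to the two features.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

x = np.array([5, 15, 25, 35, 45, 55]).reshape((-1, 1))
y = np.array([15, 11, 2, 8, 25, 32])

x_ = PolynomialFeatures(degree=2, include_bias=False).fit_transform(x)
model = LinearRegression().fit(x_, y)

# Coefficient of determination, computed on the transformed inputs
r_sq = model.score(x_, y)
print("R^2:", r_sq)

# Estimated responses for the observed inputs
y_est = model.predict(x_)
print(y_est)

# Same values by hand: intercept + b1 * x + b2 * x^2
manual = model.intercept_ + x_ @ model.coef_
print(np.allclose(manual, y_est))
```

Passing the untransformed x to score() or predict() would raise an error, because the model was fitted on two features.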
05:59
And then why don’t we just take a look at the differences between y_est and the actual y values. And usually you put y first and then the estimates. And we’ll print those out. So this is computing the residuals of the responses using the observed inputs.
All right, so that wraps up the lesson on implementing polynomial regression for a single-variable regression model. In the next lesson, we’ll take a look at how we can do polynomial regression with multiple inputs.
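The residual computation described in this lesson can be sketched as follows, with y first and the estimates second. The name residuals is an assumption.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

x = np.array([5, 15, 25, 35, 45, 55]).reshape((-1, 1))
y = np.array([15, 11, 2, 8, 25, 32])

x_ = PolynomialFeatures(degree=2, include_bias=False).fit_transform(x)
model = LinearRegression().fit(x_, y)
y_est = model.predict(x_)

# Residuals: observed response minus estimated response
residuals = y - y_est
print(residuals)
```

Because the model includes an intercept, the residuals of a least-squares fit sum to zero up to floating-point error.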