Polynomial Regression: Background

Starting With Linear Regression in Python Cesar Aguilar 07:47

00:00 Linearity is mathematically the nicest case that you can have. However, sometimes you may want to use higher order terms to see whether incorporating them might give you a better model for your phenomenon.

00:13 In this lesson, we’ll take a look at polynomial regression.

00:17 In polynomial regression with only one independent variable, what we’re seeking is a regression model that contains not only the linear term, but also possibly a quadratic term, a cubic term, and so on, up to some higher-order term, say x to the power of k.
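
Written out, the single-variable model described here could look like the following. The coefficient names b₀ through b_k are an assumption made for illustration; the transcript does not show the notation used on the slide.

```latex
f(x) = b_0 + b_1 x + b_2 x^2 + b_3 x^3 + \cdots + b_k x^k
```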

01:04 So you’ve got the two linear terms, one for each component of the input, x₁ and x₂, but then you’re going to have a quadratic term for x₁, a quadratic term for x₂, and then this term that involves the product of the components x₁ and x₂.

01:20 And this type of term is called a mixed term. So in x₁x₂, we’re mixing the two components of the input.
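
As a rough sketch of what those five terms look like in practice, the snippet below builds one column per term for a two-input quadratic model. The observations and the helper name `quadratic_features` are made up for illustration and are not from the course.

```python
import numpy as np

# Hypothetical observations of the two input components (made up for illustration).
x1 = np.array([1.0, 2.0, 3.0, 4.0])
x2 = np.array([0.5, 1.5, 1.0, 2.0])


def quadratic_features(x1, x2):
    """Return one column per term: x1, x2, x1^2, the mixed term x1*x2, x2^2."""
    return np.column_stack([x1, x2, x1**2, x1 * x2, x2**2])


features = quadratic_features(x1, x2)
print(features.shape)  # (4, 5): four observations, five terms
print(features[1])     # second observation: x1=2.0, x2=1.5
```

The fourth column is the mixed term. The intercept is left out here; a fitting routine would typically add it separately.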

01:31 When you go up to higher-order regression models with many variables, the notation gets a little bit more complicated, but you can write down the model in a similar way using multivariate notation.

01:43 For the purpose of this course, though, we’re going to stick with quadratic and cubic models.

01:49 They may seem a bit more complicated, but in actuality, polynomial regression problems can be solved using the same ideas from linear regression, which is kind of cool. So for example, say you wanted to solve this cubic model.

02:04 So you want to define a regression model that involves only a single input variable. So we’ve got a scalar problem. But we don’t want just the linear term. We want quadratic and cubic terms too.

02:17 So how are we going to solve this regression model? Remember in the background, we’ve got observation data. So we’ve got observations for the input x and the corresponding observation for the response, y. We don’t have measurements for x² and x³ though.

02:33 Well, the idea is going to be the following: we want to take this cubic regression problem, so one involving x² and going up to x³, and convert it into a linear regression problem in a multiple-variable scenario.

02:49 So what we’ll do is we’ll think of x² and x³ as new independent variables. Although they do depend explicitly on the input x, we want to treat them as independent variables.

03:05 And then what we do is, through the introduction of these dummy variables, z₁, z₂, and z₃, we can view this cubic model in a single variable up here in the form of a linear model in multiple variables, the variables being z₁, z₂, and z₃.

03:24 And so now this regression problem looks just like the multiple linear regression problems that we did in the previous two lessons.
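
The substitution can be sketched in a few lines of NumPy. This is a minimal illustration, not the course’s own code: it assumes z₁ = x, z₂ = x², and z₃ = x³, uses made-up noise-free data generated from a known cubic, and solves the resulting multiple linear regression with ordinary least squares.

```python
import numpy as np

# Hypothetical observations: inputs x, and responses y generated from a known
# cubic (no noise) so the recovered coefficients are easy to check.
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
y = 1.0 + 2.0 * x - 3.0 * x**2 + 0.5 * x**3

# Treat the powers of x as three separate "independent" variables.
z1, z2, z3 = x, x**2, x**3

# Design matrix for a multiple linear regression in z1, z2, z3.
# The leading column of ones carries the intercept.
Z = np.column_stack([np.ones_like(x), z1, z2, z3])

# Ordinary least squares, exactly as for any multiple linear regression.
coeffs, *_ = np.linalg.lstsq(Z, y, rcond=None)
print(coeffs)  # intercept, then the weights on z1, z2, z3
```

Nothing in the solver knows that z₂ and z₃ were computed from x. The model is still linear in the unknown coefficients, which is why the linear regression machinery applies.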

04:47 So let’s look at some figures that show what we’re trying to achieve by using polynomial regression on some test data.

04:56 So here we’ve got four figures that represent four different degrees for a polynomial regression model on some hypothetical data that contains six observations, and these are again represented by these green dots.

05:11 So in all four figures, the green dots are at the same place because these all correspond to the same observations, and the only thing that we’re changing is the degree of the regression model.

05:23 In the top left corner, we’ve got a linear model, so this is degree one. Then degree two, so a quadratic or a parabolic model. And then at the bottom left, we’ve got a cubic model, and down here in the bottom right, we’ve got a fifth-order polynomial regression model.

05:40 There’s one very important question that arises when you’re implementing polynomial regression, and it has to do with the choice of the optimal degree of the polynomial regression model.

05:51 There’s no straightforward rule for choosing the degree of your polynomial in a polynomial regression model. There are, however, a couple of important things that you should keep in mind when implementing polynomial regression.

06:03 And those are the issues of underfitting and overfitting the data. Underfitting means that the phenomenon you’re trying to model really does have some nonlinear effects, so in other words, the relationship between the input and the output isn’t modeled well by a linear model.

06:23 And so the linear model can’t really detect the dependency between the input and the output response, and this is going to correspond to a low R² value.
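
To make the underfitting case concrete, here is a small sketch on made-up data that follows an exact parabola. The helper names `r_squared` and `fit_and_score` are hypothetical, and the data is chosen so that a straight line has essentially nothing to pick up on.

```python
import numpy as np


def r_squared(y, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y - y_pred) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot


def fit_and_score(x, y, degree):
    """Least-squares polynomial fit of the given degree, scored on the same data."""
    X = np.vander(x, degree + 1, increasing=True)  # columns: 1, x, ..., x^degree
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return r_squared(y, X @ coeffs)


# Hypothetical observations with a purely quadratic dependency.
x = np.array([-2.5, -1.5, -0.5, 0.5, 1.5, 2.5])
y = x**2

r2_linear = fit_and_score(x, y, degree=1)
r2_quadratic = fit_and_score(x, y, degree=2)
print(r2_linear, r2_quadratic)
```

Because the inputs are symmetric around zero, the best straight line here is flat, so the linear model’s R² comes out at essentially zero while the quadratic model fits the data. Real data is rarely this clean, but the pattern of a low R² for a too-simple model is the same.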

06:35 At the other extreme is when you choose a polynomial regression model of sufficiently high degree, so that the model you get exactly reproduces the responses at the input values of the observed data.
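
With six observations, a fifth-degree polynomial has six coefficients, which is enough to pass through every point. The sketch below checks this on made-up data; the values are hypothetical and are not the ones plotted in the lesson’s figures.

```python
import numpy as np

# Six hypothetical observations (made up for illustration).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([15.0, 11.0, 2.0, 8.0, 25.0, 32.0])

# A degree-5 polynomial has six coefficients, one per observation.
coeffs = np.polyfit(x, y, deg=5)

# Evaluate the fitted polynomial at the observed inputs.
fitted = np.polyval(coeffs, x)

# Largest gap between the fitted values and the observed responses.
max_error = np.max(np.abs(fitted - y))
print(max_error)  # tiny: only floating-point rounding remains
```

Matching the observed data exactly says nothing about how the curve behaves between or beyond the observations, which is where an overfitted model tends to go wrong.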
