
Polynomial Regression in Machine Learning

What is Polynomial Regression?

There are 3 types of Linear Regression Algorithms:

1. Simple Linear Regression

2. Multiple Linear Regression

3. Polynomial Regression

We have already discussed Simple Linear Regression and Multiple Linear Regression in earlier posts.

Polynomial Regression is a form of linear regression: in the end we still fit a linear regression, applying the same principles.

The only difference is that we first add polynomial terms to our data set.

Polynomial Regression is used when we have a non-linear data set.



For example:

We have X and Y columns in our dataset, where X is the input column and Y is the output column.

In polynomial regression we extract the polynomial features in the preprocessing stage. That means if we choose degree = 2, then for every row we convert X into X^0, X^1 and X^2.
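The expansion described above can be sketched by hand with NumPy. The three input values are made up for illustration only; scikit-learn's PolynomialFeatures(degree=2) produces the same three columns for a single input column.

```python
import numpy as np

# Hypothetical input column with three rows (values chosen only for illustration).
X = np.array([[2.0], [3.0], [4.0]])

# Place X^0, X^1 and X^2 side by side: one new column per power.
X_poly = np.hstack([X ** power for power in range(3)])

print(X_poly)
```

Each row now has a column of ones (X^0), the original value (X^1) and its square (X^2), and a plain linear regression can be fitted on these columns.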

These polynomial features help the model capture the non-linear relationship.

In polynomial regression the degree is a hyperparameter, that is, we have to set it manually.

The problem with this value is that if we keep it too low the model may underfit, and if we keep it too high it can overfit.

This example demonstrates the problems of underfitting and overfitting and how we can use linear regression with polynomial features to approximate nonlinear functions. The plot shows the function that we want to approximate, which is a part of the cosine function. In addition, the samples from the real function and the approximations of different models are displayed. The models have polynomial features of different degrees. We can see that a linear function (polynomial with degree 1) is not sufficient to fit the training samples. This is called underfitting. A polynomial of degree 4 approximates the true function almost perfectly. However, for higher degrees the model will overfit the training data, i.e. it learns the noise of the training data. We evaluate overfitting and underfitting quantitatively by using cross-validation: we calculate the mean squared error (MSE) on the validation set, and the higher it is, the less likely the model generalizes correctly from the training data.
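A minimal sketch of that experiment is given here. The text only says the target is part of the cosine function, so the exact function cos(1.5 * pi * x), the noise level, the 30 samples, the 10 folds and degree 15 as the "higher degree" are all assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures


def true_fun(x):
    # Assumed target: a part of the cosine function.
    return np.cos(1.5 * np.pi * x)


# Assumed setup: 30 noisy samples on [0, 1) with a fixed seed.
rng = np.random.RandomState(0)
n_samples = 30
x = np.sort(rng.rand(n_samples))
y = true_fun(x) + rng.randn(n_samples) * 0.1

cv_mse = {}
for degree in (1, 4, 15):
    # Polynomial features followed by an ordinary linear regression.
    model = make_pipeline(
        PolynomialFeatures(degree=degree, include_bias=False),
        LinearRegression(),
    )
    # scikit-learn reports negated MSE, so flip the sign back.
    scores = cross_val_score(
        model, x.reshape(-1, 1), y, scoring="neg_mean_squared_error", cv=10
    )
    cv_mse[degree] = -scores.mean()
    print(f"degree {degree:2d}: cross-validated MSE = {cv_mse[degree]:.3g}")
```

Under these assumptions degree 4 should give the lowest validation error, with degree 1 underfitting and degree 15 overfitting.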





The polynomial feature transformation is applied only to the input columns of the dataset, on both the training and the testing data. The output column is left unchanged.
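As a sketch of what that looks like in practice, the snippet below fits the transformer on the training inputs and reuses it on the test inputs. The data is made up, and the 75/25 split is an arbitrary choice.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures

# Made-up data: 20 rows, one input column, a quadratic target.
X = np.arange(20, dtype=float).reshape(-1, 1)
y = 3.0 * X.ravel() ** 2 + 1.0

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

poly = PolynomialFeatures(degree=2)
X_train_poly = poly.fit_transform(X_train)  # fit on the training inputs only
X_test_poly = poly.transform(X_test)        # reuse the fitted transformer on the test inputs

# Only the inputs gained columns; y_train and y_test are untouched.
print(X_train_poly.shape, X_test_poly.shape, y_train.shape, y_test.shape)
```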


Parameters of scikit-learn's PolynomialFeatures class:

degree : int or tuple (min_degree, max_degree), default=2

If a single int is given, it specifies the maximal degree of the polynomial features. If a tuple (min_degree, max_degree) is passed, then min_degree is the minimum and max_degree is the maximum polynomial degree of the generated features. Note that min_degree=0 and min_degree=1 are equivalent as outputting the degree zero term is determined by include_bias.

interaction_only : bool, default=False

If True, only interaction features are produced: features that are products of at most degree distinct input features, i.e. terms with power of 2 or higher of the same input feature are excluded:

  • included: x[0], x[1], x[0] * x[1], etc.

  • excluded: x[0] ** 2, x[0] ** 2 * x[1], etc.

include_bias : bool, default=True

If True (default), then include a bias column, the feature in which all polynomial powers are zero (i.e. a column of ones - acts as an intercept term in a linear model).

order : {'C', 'F'}, default='C'

Order of output array in the dense case. 'F' order is faster to compute, but may slow down subsequent estimators.

New in version 0.21.

Attributes:
powers_ : ndarray of shape (n_output_features_, n_features_in_)

Exponent for each of the inputs in the output.

n_input_features_ : int

DEPRECATED: The attribute n_input_features_ was deprecated in version 1.0 and will be removed in 1.2.

n_features_in_ : int

Number of features seen during fit.

New in version 0.24.

feature_names_in_ : ndarray of shape (n_features_in_,)

Names of features seen during fit. Defined only when X has feature names that are all strings.

New in version 1.0.

n_output_features_ : int

The total number of polynomial output features. The number of output features is computed by iterating over all suitably sized combinations of input features.
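The attributes above can be inspected on a fitted transformer. This is a small sketch on made-up data. The closed-form count C(n + d, d) in the last lines holds for the default settings (include_bias=True, interaction_only=False) and is included only as a cross-check.

```python
from math import comb

import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Made-up data with two input features.
X = np.array([[2.0, 3.0], [4.0, 5.0]])
degree = 2
poly = PolynomialFeatures(degree=degree).fit(X)

print(poly.n_features_in_)      # number of input columns seen during fit
print(poly.n_output_features_)  # number of generated polynomial columns
print(poly.powers_)             # one row per output feature, one column per input feature

# With default settings, the number of monomials of total degree <= d
# in n variables is C(n + d, d).
expected = comb(poly.n_features_in_ + degree, degree)
print(expected)
```

Each row of powers_ gives the exponents applied to the inputs, so a row [1, 1] stands for x[0] * x[1].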


Let's see the code


# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Importing the dataset
dataset = pd.read_csv('../input/position-salary-dataset/Position_Salaries.csv')
X = dataset.iloc[:, 1:-1].values
y = dataset.iloc[:, -1].values

# Displaying X
X

array([[ 1],
       [ 2],
       [ 3],
       [ 4],
       [ 5],
       [ 6],
       [ 7],
       [ 8],
       [ 9],
       [10]])

# Displaying y
y

array([  45000,   50000,   60000,   80000,  110000,  150000,  200000,
        300000,  500000, 1000000])

# Training the Linear Regression model on the whole dataset
from sklearn.linear_model import LinearRegression
lin_reg = LinearRegression()
lin_reg.fit(X, y)

# Training the Polynomial Regression model on the whole dataset
from sklearn.preprocessing import PolynomialFeatures
poly_reg = PolynomialFeatures(degree = 4)
X_poly = poly_reg.fit_transform(X)
lin_reg_2 = LinearRegression()
lin_reg_2.fit(X_poly, y)

# Visualising the Linear Regression results
plt.scatter(X, y, color = 'red')
plt.plot(X, lin_reg.predict(X), color = 'blue')
plt.title('Truth or Bluff (Linear Regression)')
plt.xlabel('Position Level')
plt.ylabel('Salary')
plt.show()


# Visualising the Polynomial Regression results
plt.scatter(X, y, color = 'red')
# Reuse the already fitted transformer (transform, not fit_transform)
plt.plot(X, lin_reg_2.predict(poly_reg.transform(X)), color = 'blue')
plt.title('Truth or Bluff (Polynomial Regression)')
plt.xlabel('Position level')
plt.ylabel('Salary')
plt.show()

# Predicting a new result with Linear Regression
lin_reg.predict([[6.5]])

# Predicting a new result with Polynomial Regression
# (the transformer is already fitted, so only transform the new input)
lin_reg_2.predict(poly_reg.transform([[6.5]]))
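As an alternative sketch, the two steps (polynomial features, then linear regression) can be chained in a scikit-learn pipeline so the transformation is applied automatically at prediction time. This is not the approach used in the code above, only an equivalent way to write it. The arrays repeat the X and y values displayed earlier, so no CSV file is needed here.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Same values as the X and y arrays displayed earlier in the post.
X = np.arange(1, 11).reshape(-1, 1)
y = np.array([45000, 50000, 60000, 80000, 110000,
              150000, 200000, 300000, 500000, 1000000])

# Pipeline version: feature expansion and regression in one object.
model = make_pipeline(PolynomialFeatures(degree=4), LinearRegression())
model.fit(X, y)
pipeline_pred = model.predict([[6.5]])

# Two-step version, as in the post, for comparison.
poly_reg = PolynomialFeatures(degree=4)
lin_reg_2 = LinearRegression().fit(poly_reg.fit_transform(X), y)
two_step_pred = lin_reg_2.predict(poly_reg.transform([[6.5]]))

print(pipeline_pred, two_step_pred)
```

Both versions should give the same prediction; the pipeline just removes the need to remember the transform call.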
