Bias-Variance Trade-off
Bias is the inability of a machine learning model to capture the true relationship in the training data set. A high-bias model simply cannot learn the pattern in the training data.
Variance is the difference in fits across different data sets. In practice it shows up as the gap between the model's performance on the training data set and on the testing data set.
Overfitting
When your model works well on the training data set but does not perform well on the testing data set, it is called overfitting.
Underfitting
When your model does not perform well even on the training data set, it is called underfitting.
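To make both problems concrete, here is a minimal sketch with made-up synthetic data (not the data set used later in this post): a straight line underfits a curved pattern, a very high-degree polynomial overfits it, and the train/test scores expose both.

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Made-up curved data: y = x^2 plus noise (for illustration only)
rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(100, 1))
y = X.ravel() ** 2 + rng.normal(0, 1, size=100)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for degree in (1, 2, 20):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    # degree 1 scores badly on both sets (underfitting);
    # degree 20 scores well on train but worse on test (overfitting)
    print(degree, model.score(X_train, y_train), model.score(X_test, y_test))

Typically degree 1 gives a low score on both sets (underfitting), degree 20 gives a near-perfect training score but a worse testing score (overfitting), and degree 2 does well on both.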
There are three methods for controlling overfitting:
1. Regularization
2. Bagging
3. Boosting
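Regularization is covered in detail below. For reference, bagging and boosting are also available out of the box in scikit-learn; here is a minimal sketch, assuming X and y are the training arrays used later in this post:

from sklearn.ensemble import BaggingRegressor, GradientBoostingRegressor

# Bagging: many models trained on random bootstrap samples, predictions averaged
bag = BaggingRegressor(n_estimators=50)
bag.fit(X, y)

# Boosting: models trained one after another, each correcting the errors of the previous ones
boost = GradientBoostingRegressor(n_estimators=50)
boost.fit(X, y)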
There are three techniques of regularization: Ridge Regression, Lasso Regression, and Elastic Net. This post covers the first one.
1. Ridge Regression
In Ridge Regression we add a regularization term to the loss function to reduce overfitting. The term is lambda times the sum of the squared coefficients, so the loss becomes: sum of squared errors + lambda * sum(coefficient^2). Lambda controls how strong the penalty is (scikit-learn calls this parameter alpha).
Let's see the code (X and y are the training data prepared earlier; matplotlib is imported as plt):

from sklearn.linear_model import LinearRegression

lr = LinearRegression()
lr.fit(X, y)
print(lr.coef_, lr.intercept_)

Output:
[27.82809103] -2.29474455867698

from sklearn.linear_model import Ridge

rr = Ridge(alpha=10)
rr.fit(X, y)
print(rr.coef_)
print(rr.intercept_)

rr1 = Ridge(alpha=100)
rr1.fit(X, y)
print(rr1.coef_)
print(rr1.intercept_)

plt.plot(X, y, 'b.')
plt.plot(X, lr.predict(X), color='red', label='alpha=0')
plt.plot(X, rr.predict(X), color='green', label='alpha=10')
plt.plot(X, rr1.predict(X), color='orange', label='alpha=100')
plt.legend()

Output: a plot of the data points with the three fitted lines; the line gets flatter as alpha increases.
Points to keep in mind while doing Ridge Regression:
1. How do the coefficients get affected?
All the coefficients shrink towards zero, but they never become exactly zero (see the sketch after this list).
2. Larger coefficients are impacted more. Because the penalty is squared, big coefficients are shrunk much more aggressively than small ones.
3. Bias-variance trade-off
The bias-variance balance depends on the value of lambda.
If you keep the value of lambda low, bias stays low but variance increases, so the model can still overfit.
If you keep the value of lambda high, bias increases but variance reduces, so the model may underfit.
4. Impact on the loss function. The lambda * sum(coefficient^2) term is added to the squared-error loss, so large coefficients now make the loss bigger and the optimizer is pushed towards smaller ones.
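These points can be checked with a quick experiment. Here is a minimal sketch, again assuming X and y from the code above, that fits Ridge for increasing values of alpha and prints the coefficients and the training score:

from sklearn.linear_model import Ridge

# Fit Ridge with increasing regularization strength
for alpha in (0.01, 1, 10, 100, 1000):
    rr = Ridge(alpha=alpha)
    rr.fit(X, y)
    # coefficients shrink towards zero as alpha grows, but never reach it exactly
    print(alpha, rr.coef_, round(rr.score(X, y), 3))

The coefficients keep getting smaller but never hit zero exactly, and the training score drops as alpha grows: low lambda means low bias and high variance, high lambda means the opposite.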