Regression in Machine Learning
Introduction to Regression in Machine Learning:
The following article provides an outline for Regression in Machine Learning. Regression predicts a continuous output value from input data, and regression models are therefore used wherever a continuous quantity must be estimated. It is mostly used to model the relationship between variables and for forecasting. Regression models differ based on the kind of relationship between the dependent and independent variables.
Types of Regression in Machine Learning:
There are different types of regression:
- Simple Linear Regression: Simple linear regression predicts a continuous target variable from a single independent variable. Linear regression is a supervised machine learning algorithm that performs the regression task.
- Polynomial Regression: Polynomial regression transforms the original features into polynomial features of a given degree and then applies linear regression to them.
- Support Vector Regression: Support vector regression identifies a hyperplane with the maximum margin such that the maximum number of data points falls within the margin.
- Decision Tree Regression: A decision tree is built by partitioning the data into subsets containing instances with similar values. It can be used for both regression and classification.
- Random Forest Regression: Random forest is an ensemble approach where we take into account the predictions of several decision regression trees.
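To make the first of these concrete, simple linear regression can be fit in closed form: the slope is the covariance of x and y divided by the variance of x, and the intercept follows from the means. Below is a minimal sketch in plain Python on hypothetical toy data; the function name is illustrative, not from any library.

```python
# Minimal sketch: closed-form simple linear regression, y = a + b*x.
# Toy data only; no ML library is assumed.
def fit_simple_linear(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope b = covariance(x, y) / variance(x)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    b = cov / var
    a = mean_y - b * mean_x   # intercept from the means
    return a, b

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]         # perfectly linear: y = 2x
a, b = fit_simple_linear(xs, ys)
print(a, b)                   # intercept 0.0, slope 2.0
```

The other types (polynomial, SVR, trees, forests) build on the same idea of minimizing prediction error but with more flexible model families.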
Implementation of Linear Regression in Machine Learning
Linear regression is employed in various ways; some of them are listed below:
- Sales forecasting
- Risk analysis
- Housing applications
- Finance applications
The process for implementing linear regression involves the following steps:
- Loading the data
- Exploring the data
- Slicing the data
- Split the data into training and test sets
- Generate the model
- Evaluate the accuracy
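The steps above can be sketched with scikit-learn (assumed available here); the data below is synthetic stand-in data rather than a real data set, so the loading and exploring steps are reduced to generating it.

```python
# Sketch of the implementation steps: load, split, fit, evaluate.
# Synthetic data stands in for a real data set.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# "Load" the data: generate y = 3*x + 5 with a little noise.
X = rng.uniform(0, 10, size=(200, 1))
y = 3 * X[:, 0] + 5 + rng.normal(0, 0.5, size=200)

# Split into training and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Generate (fit) the model on the training split.
model = LinearRegression().fit(X_train, y_train)

# Evaluate the accuracy: R^2 score on the held-out test data.
print(model.score(X_test, y_test))
```

Holding out a test split is what makes the final score an honest estimate of accuracy on unseen data.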
Advantages and Disadvantages of Linear Regression
- Linear regression performs well when the relationship in the data set is linear. We can use it to find the nature of the relationship between the variables.
- It is easier to implement, interpret and very efficient to train.
- It is prone to over-fitting, but this can be avoided using dimensionality reduction techniques, regularization techniques, and cross-validation.
- Extrapolation beyond the range of the data set is unreliable.
- Linear assumption: It assumes that the relationship between the input and the output is linear.
- Remove noise: It assumes that the input and the output variables are not noisy.
- Remove collinearity: It will over-fit the data when we have highly correlated input variables.
- Gaussian distributions: It makes more reliable predictions if the input and output variables have a Gaussian distribution.
- Rescale inputs: It usually makes more reliable predictions if the input variables are rescaled using standardization or normalization.
- Susceptible to outliers: It is very sensitive to outliers. So, the outliers need to be removed before applying the linear regression to the data set.
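The outlier sensitivity noted in the last point can be demonstrated directly: below is a small sketch (using scikit-learn, assumed available, on hypothetical toy data) where corrupting a single point visibly drags the fitted slope away from the true value.

```python
# Sketch: one outlier shifts the least-squares fit of a perfect line.
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.arange(10, dtype=float).reshape(-1, 1)
y_clean = 2 * X[:, 0]              # exact line: y = 2x

y_outlier = y_clean.copy()
y_outlier[-1] = 100.0              # corrupt a single point

slope_clean = LinearRegression().fit(X, y_clean).coef_[0]
slope_outlier = LinearRegression().fit(X, y_outlier).coef_[0]

# The clean fit recovers slope 2.0; the corrupted fit is pulled
# far above it by the one outlier.
print(slope_clean, slope_outlier)
```

This is why inspecting and handling outliers is a standard step before fitting a linear model.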