Linear regression is one of the most widely used statistical methods, applied by data analysts and students in almost every discipline. However, the standard ordinary least squares (OLS) method makes several strong assumptions about the data that often do not hold in real-world data sets. When these assumptions fail, the least squares model can suffer from a number of problems, one of the most common being overfitting. Ridge Regression and LASSO are two methods used to create a more accurate model. I will discuss how overfitting arises in least squares models and the reasoning behind Ridge Regression and LASSO. I will also analyze real-world example data and compare these methods with OLS and with each other to assess the benefits and drawbacks of each.
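As a rough illustration of how the three estimators differ, the sketch below fits a single coefficient b with no intercept, a deliberately simplified setting in which all three have closed forms. It assumes the usual penalized objectives: OLS minimizes sum (y_i - x_i b)^2, Ridge adds lam * b^2, and LASSO adds lam * |b|. The function names, the penalty parameter `lam`, and the toy data are hypothetical choices made for this example and are not taken from the analysis described above.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t; return 0.0 when |z| <= t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def fit_one_feature(x, y, lam):
    """Return (ols, ridge, lasso) slopes for one feature, no intercept.

    Assumed objectives (illustrative, not the author's exact setup):
      OLS:   sum (y_i - x_i * b)^2
      Ridge: sum (y_i - x_i * b)^2 + lam * b^2
      LASSO: sum (y_i - x_i * b)^2 + lam * |b|
    """
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    ols = sxy / sxx
    # Ridge inflates the denominator, shrinking the slope smoothly.
    ridge = sxy / (sxx + lam)
    # LASSO subtracts a fixed amount and can set the slope exactly to zero.
    lasso = soft_threshold(sxy, lam / 2.0) / sxx
    return ols, ridge, lasso


# Hypothetical toy data, roughly y = x plus noise.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [-1.9, -1.2, 0.1, 0.8, 2.2]

for lam in (0.0, 5.0, 30.0):
    ols, ridge, lasso = fit_one_feature(x, y, lam)
    print(f"lam={lam:5.1f}  ols={ols:.3f}  ridge={ridge:.3f}  lasso={lasso:.3f}")
```

In this one-feature setting, increasing `lam` shrinks the Ridge slope toward zero without ever reaching it, while the LASSO slope becomes exactly zero once `lam` is large enough. That difference is the usual reason LASSO is also described as a variable selection method.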
We are asked to predict the probability that a student will drop out of a course. We first extracted many features from the large dataset, then used ensemble learning and model stacking to produce the final result, which ranked 1st among 68 teams.