This project develops a Linear Regression model to predict student academic performance in mathematics (final grade G3), using the UCI 'Student Performance' dataset.
Key Highlights:

Data Extraction & Cleaning: The data was fetched with the ucimlrepo library. We checked for null values and excluded variables that were not statistically significant according to their p-values: Pstatus, nursery, romantic, famsize, and guardian.

Data Preprocessing: Target encoding was used to convert categorical features (such as parental jobs or school type) into numerical values suitable for regression.

Results: The final model achieved an R-squared (R²) of 0.85, meaning it explains 85% of the variance in the final grade (G3). Previous grades (G1, G2) and social factors were the strongest predictors.

Visualization: Scatter and residual plots support the model's high prediction accuracy, showing a clear linear trend and reasonably distributed errors.
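To make the preprocessing and the metric concrete, here is a minimal sketch of target encoding followed by a one-feature least-squares fit and the R-squared calculation. It is not the project's actual code. The function names (target_encode, fit_simple_ols, r_squared) are hypothetical, the six rows are made-up toy values rather than records from the UCI dataset, and only the Python standard library is used so the arithmetic stays visible.

```python
from collections import defaultdict
from statistics import mean


def target_encode(categories, targets):
    """Replace each category with the mean target value observed for it."""
    groups = defaultdict(list)
    for cat, y in zip(categories, targets):
        groups[cat].append(y)
    means = {cat: mean(ys) for cat, ys in groups.items()}
    return [means[cat] for cat in categories], means


def fit_simple_ols(x, y):
    """Closed-form least squares for y = slope * x + intercept."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    return slope, y_bar - slope * x_bar


def r_squared(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot, the share of variance explained."""
    y_bar = mean(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - y_bar) ** 2 for y in y_true)
    return 1 - ss_res / ss_tot


# Toy data (assumption): a job-like categorical column and final grades.
mjob = ["teacher", "services", "teacher", "at_home", "services", "at_home"]
g3 = [16, 11, 12, 8, 12, 10]

encoded, category_means = target_encode(mjob, g3)
slope, intercept = fit_simple_ols(encoded, g3)
predictions = [slope * x + intercept for x in encoded]
residuals = [y - p for y, p in zip(g3, predictions)]

print("category means:", category_means)
print("R^2 on toy data:", round(r_squared(g3, predictions), 3))
```

In this sketch the category means are computed on the same rows that are later scored, which leaks the target into the feature. In practice the means would be estimated on the training split only and then applied to held-out rows.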
I'm open to new opportunities. Let's discuss how I can bring this level of engineering to your team.