Student Retention Modeling
Education
Business Impacts
Increased Student Engagement
95% Model Accuracy
Reduced Number of Dropouts
Customer Key Facts
- Location: North America
- Industry: Education Management
Problem Context
With the growing adoption and popularity of virtual classrooms, many universities struggle to use their data efficiently to promote student engagement and minimize attrition. Both the customer’s campus-based and online programs had high dropout rates across different courses within the first few weeks of enrollment. The customer wanted proactive measures to identify why students drop out, determine the risk factors undermining student engagement, and intervene at an early stage to reduce attrition and improve overall performance.
Challenges
- Performing exploratory data analysis to handle missing values and convert data types (for example, from numerical to categorical)
- Selecting the final feature set by identifying the attributes most relevant to model building
- Testing and validating a variety of models to identify the one with the highest accuracy
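The data preparation steps named in the first challenge can be sketched in a few lines of Python. This is a minimal illustration and not the customer's pipeline: the records, the column names (`logins_per_week`, `credits`), the choice of median imputation, and the bin edges are all assumptions made for the example.

```python
from statistics import median

# Hypothetical student records; None marks a missing value.
records = [
    {"student_id": 1, "logins_per_week": 5, "credits": 12},
    {"student_id": 2, "logins_per_week": None, "credits": 9},
    {"student_id": 3, "logins_per_week": 1, "credits": 15},
    {"student_id": 4, "logins_per_week": 9, "credits": None},
]

def impute_median(rows, column):
    """Return copies of rows with missing values in `column` set to the column median."""
    observed = [r[column] for r in rows if r[column] is not None]
    fill = median(observed)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]

def to_category(value, edges, labels):
    """Map a number to a label; `edges` are ascending upper bounds for all but the last label."""
    for edge, label in zip(edges, labels):
        if value < edge:
            return label
    return labels[-1]

# Handle missing values, then convert a numerical column to a categorical one.
cleaned = impute_median(impute_median(records, "logins_per_week"), "credits")
for row in cleaned:
    row["engagement"] = to_category(
        row["logins_per_week"], edges=[3, 7], labels=["low", "medium", "high"]
    )
```

Median imputation is used here only because it is simple and robust to extreme values; the source does not say which strategy was applied.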
Technologies Used
Google Compute Engine
Google's BigQuery
Google Cloud Storage
Google Cloud Composer
Google Kubernetes Engine
Google Cloud Dataproc
JupyterLab
Increasing Student Engagement and Reducing Attrition with Machine Learning
Solution
Quantiphi built a multivariate rescoring model that predicts the likelihood of a student dropping out of a course and identifies the most important factors driving dropout. The on-premises student dataset, obtained in MS SQL format, was migrated to Google's BigQuery and used to train the model. General transformations, such as binning similar columns using the six sigma rule and performing outlier analysis, prepared the data for model building. The solution highlights the variables that drive a student’s dropout risk and predicts each student’s probability of success. Its output feeds into operational policies so that students receive more targeted engagement support, which also helps the customer significantly improve retention rates.
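The outlier handling and scoring described above might look roughly like the following sketch. Several things here are assumptions and not details from the case study: the "six sigma rule" is read as a band of mean ± 3 standard deviations, a logistic score stands in for the unspecified model family, and the feature names, weights, and bias are invented for illustration.

```python
import math
from statistics import mean, pstdev

def clip_three_sigma(values):
    """Clip values to mean ± 3 standard deviations (a band six sigma wide)."""
    mu, sigma = mean(values), pstdev(values)
    low, high = mu - 3 * sigma, mu + 3 * sigma
    return [min(max(v, low), high) for v in values]

def dropout_probability(features, weights, bias):
    """Logistic score: sigmoid of the bias plus the weighted feature sum."""
    z = bias + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))

def rank_drivers(features, weights):
    """Order feature names by absolute contribution to the score, largest first."""
    contributions = {name: weights[name] * value for name, value in features.items()}
    return sorted(contributions, key=lambda name: abs(contributions[name]), reverse=True)

# Hypothetical column with one extreme value: 19 zeros and a single 100.
study_hours = [0] * 19 + [100]
clipped = clip_three_sigma(study_hours)

# Hypothetical coefficients; a real model would learn these from training data.
weights = {"missed_assignments": 0.8, "logins_per_week": -0.5}
bias = -1.0

at_risk = {"missed_assignments": 3, "logins_per_week": 1}
engaged = {"missed_assignments": 0, "logins_per_week": 6}

p_at_risk = dropout_probability(at_risk, weights, bias)
p_engaged = dropout_probability(engaged, weights, bias)
drivers = rank_drivers(at_risk, weights)
```

With these made-up numbers, the first student scores above 0.5 and the second well below it, and `drivers` lists `missed_assignments` first. Ranking by contribution is one simple way to surface the factors behind an individual score.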
Result
- Robust & scalable architecture
- Identification of both high-risk and low-risk students
- More targeted engagement/support for student success