Application of Mathematical Models for Prediction of Air Quality of Select Indian Cities Through Ensemble Learning Approach

Abstract
Clean air is essential to human and environmental health. Machine learning based prediction models play a prominent role in monitoring and improving air quality systems and help in handling environmental threats. The aim of this research is to develop mathematical models that use ensemble learning to predict the Air Quality Index (AQI) of select Indian cities: Mumbai, Chennai, Bangalore and Delhi. Two models, random forest and gradient boosting, have been developed to predict the AQI. The data used for training and testing the models was obtained from Kaggle. The models were evaluated with the metrics RMSE, MAE, MSE and R2. The gradient boosting model obtained an R2 score of 0.9992 (99.92%), whereas the random forest model obtained 0.8574 (85.7%). Gradient boosting therefore outperforms random forest, with an R2-based accuracy of about 99% against about 86%. In addition, the conversion of the regression predictions into AQI categories has been evaluated through the F1 score and the confusion matrix, and the results are presented for every city.
Keywords: Air Quality Prediction, Ensemble Model, Gradient Boosting, Machine Learning, Random Forest.
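The evaluation described in the abstract (regression metrics first, then a category-level check) can be illustrated with a minimal sketch. This is not the authors' code. The sample values are made up for illustration, the AQI breakpoints are assumed to follow India's National AQI bands (the abstract does not state which bands were used), and macro-averaging of F1 is likewise an assumption.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R2 for paired true/predicted AQI values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae,
            "R2": 1.0 - ss_res / ss_tot}

# Assumption: upper bounds of India's National AQI bands; anything above 400 is "Severe".
AQI_BANDS = [(50, "Good"), (100, "Satisfactory"), (200, "Moderate"),
             (300, "Poor"), (400, "Very Poor")]
LABELS = [name for _, name in AQI_BANDS] + ["Severe"]

def aqi_category(aqi):
    """Map a numeric AQI value to its (assumed) category label."""
    for upper, name in AQI_BANDS:
        if aqi <= upper:
            return name
    return "Severe"

def confusion_matrix(true_labels, pred_labels, labels=LABELS):
    """Rows are true categories, columns are predicted categories."""
    index = {name: i for i, name in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(true_labels, pred_labels):
        matrix[index[t]][index[p]] += 1
    return matrix

def macro_f1(matrix):
    """Unweighted mean of per-class F1, skipping classes that never occur."""
    scores = []
    for k in range(len(matrix)):
        tp = matrix[k][k]
        fp = sum(row[k] for row in matrix) - tp
        fn = sum(matrix[k]) - tp
        if tp + fp + fn == 0:
            continue  # class absent from both truth and predictions
        scores.append(2 * tp / (2 * tp + fp + fn))
    return sum(scores) / len(scores)

# Illustrative values only; these are not results from the paper.
y_true = [40, 90, 150, 250, 350, 450]
y_pred = [45, 105, 160, 240, 310, 420]

metrics = regression_metrics(y_true, y_pred)
matrix = confusion_matrix([aqi_category(v) for v in y_true],
                          [aqi_category(v) for v in y_pred])
f1 = macro_f1(matrix)
print(metrics)
print(f1)
```

In this toy data the prediction 105 for a true value of 90 crosses the assumed 100 boundary, so the regression error is small while the category is wrong. This is why a category-level check with F1 and a confusion matrix can add information beyond R2 alone.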

Author(s): R Vetri Selvi*, R Sathish Babu
Volume: 7 Issue: 2 Pages: 1310-1327
DOI: https://doi.org/10.47857/irjms.2026.v07i02.08199