Forecast of NO2 concentrations based on coupled air quality model simulations and monitoring data using machine learning method
HUANG Yong-xi1, ZHU Yun1, XIE Yang-hong2, LI Hai-xian2, ZHANG Zhi-cheng1, LI Jie1, LI Jin-ying1, YUAN Ying-zhi1
1. Guangdong Provincial Key Laboratory of Atmospheric Environment and Pollution Control, College of Environment and Energy, South China University of Technology, Guangzhou 510006, China; 2. Foshan Ecology and Environment Bureau, Shunde Branch, Foshan 528300, China
Abstract: Building on WRF-CMAQ air quality model simulations, this study developed a novel machine learning method based on simulations and observations (SOML) for forecasting NO2 concentrations. SOML integrates a feedforward neural network (FNN) and a long short-term memory network (LSTM) through the Lasso method, with the LSTM built on real-time pollutant and meteorological data. To evaluate its performance, the method was applied to three-day NO2 forecasts at ten air quality monitoring stations in Shunde, Foshan. The results show the following. Compared with WRF-CMAQ and the other individual models, SOML gave higher accuracy across the three-day forecast of NO2 concentrations, with a first-day mean absolute error (MAE) of 4.99 μg/m³, a reduction of up to 66.18%. SOML was significantly more accurate than WRF-CMAQ in every season (MAE decreased by 42.18%, 42.89%, 61.04% and 50.91%, respectively), indicating that it is applicable year-round, with the largest gains in autumn and winter. Compared with WRF-CMAQ, SOML also appears to forecast more accurately both the spatial distribution of NO2 and the concentration levels at each station in Shunde.
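The abstract states that SOML blends two forecasters through the Lasso method and is scored by MAE, but gives no implementation details. The sketch below illustrates only that general idea: an L1-penalised linear combination of two forecast series, fitted by coordinate descent, with MAE as the score. The function names (`mae`, `soft_threshold`, `lasso_combine`, `combine`), the penalty value `alpha`, and all numbers are assumptions made for illustration; they are not the authors' code, configuration, or data.

```python
from typing import List, Sequence, Tuple


def mae(pred: Sequence[float], obs: Sequence[float]) -> float:
    """Mean absolute error between forecasts and observations."""
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)


def soft_threshold(z: float, gamma: float) -> float:
    """Soft-thresholding operator used in the Lasso coordinate update."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_combine(X: Sequence[Sequence[float]], y: Sequence[float],
                  alpha: float = 0.1, n_iter: int = 500) -> Tuple[float, List[float]]:
    """Coordinate descent on (1/2n) * sum((y - b0 - X.w)^2) + alpha * sum(|w|).

    Each column of X holds one base model's forecasts; the intercept b0
    is not penalised. Runs a fixed number of sweeps (no convergence check).
    """
    n, p = len(y), len(X[0])
    w = [0.0] * p
    b0 = sum(y) / n
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of column j with the partial residual
            zj = 0.0   # squared norm of column j
            for i in range(n):
                partial = b0 + sum(w[k] * X[i][k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
                zj += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (zj / n) if zj > 0 else 0.0
        # Exact update of the unpenalised intercept.
        b0 = sum(y[i] - sum(w[k] * X[i][k] for k in range(p)) for i in range(n)) / n
    return b0, w


def combine(b0: float, w: Sequence[float], row: Sequence[float]) -> float:
    """Blend one row of base-model forecasts with the fitted weights."""
    return b0 + sum(wk * xk for wk, xk in zip(w, row))


if __name__ == "__main__":
    # Made-up daily NO2 values in ug/m3; NOT data from the study.
    obs = [30.0, 42.0, 25.0, 38.0, 50.0, 33.0, 28.0, 45.0]
    fnn_like = [36.0, 47.0, 33.0, 41.0, 58.0, 40.0, 31.0, 52.0]   # stand-in for the FNN output
    lstm_like = [27.0, 40.0, 27.0, 33.0, 46.0, 35.0, 24.0, 41.0]  # stand-in for the LSTM output
    X = [[a, b] for a, b in zip(fnn_like, lstm_like)]

    b0, w = lasso_combine(X, obs, alpha=0.1)
    blended = [combine(b0, w, row) for row in X]
    print("weights:", w, "intercept:", b0)
    print("MAE stand-in FNN :", mae(fnn_like, obs))
    print("MAE stand-in LSTM:", mae(lstm_like, obs))
    print("MAE blended      :", mae(blended, obs))
```

In this toy setup the weights are fitted and scored on the same eight points, so the printed MAE values say nothing about out-of-sample skill; a real evaluation would fit the combination on a training period and score it on held-out days.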
黄泳熙, 朱云, 谢阳红, 李海贤, 张志诚, 黎杰, 李金盈, 袁颖枝. 空气质量模拟与观测机器学习NO2浓度预报[J]. 中国环境科学, 2023, 43(12): 6225-6234.
HUANG Yong-xi, ZHU Yun, XIE Yang-hong, LI Hai-xian, ZHANG Zhi-cheng, LI Jie, LI Jin-ying, YUAN Ying-zhi. Forecast of NO2 concentrations based on coupled air quality model simulations and monitoring data using machine learning method. CHINA ENVIRONMENTAL SCIENCE, 2023, 43(12): 6225-6234.