Comparison of estimation models for near-surface ozone concentration based on gradient boosting algorithm
LIANG Xiao-xia¹,², XIE Dong-hai¹, HAN Zong-fu³, SONG Shi-peng², ZHANG Xin-xin⁴, GU Jian-bin², YU Chao²
1. College of Resource Environment and Tourism, Capital Normal University, Beijing 100089, China; 2. State Key Laboratory of Remote Sensing Science, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100101, China; 3. School of Computing, Beijing University of Posts and Telecommunications, Beijing 100876, China; 4. School of Electrical Engineering, Nantong University, Nantong 226019, China
Abstract: This paper proposes an improved model for estimating the spatiotemporal distribution of near-surface ozone (O₃) concentrations based on optimized gradient boosting tree algorithms. First, in-situ near-surface O₃ concentration data, high-resolution satellite observations of atmospheric composition (TROPOMI, AIRS), ERA5 meteorological reanalysis data, and land cover and terrain data were used to investigate the correlations among near-surface O₃, tropospheric O₃, O₃ precursors, meteorological factors, and underlying surface characteristics in the Beijing-Tianjin-Hebei region. Then, estimation models for near-surface O₃ concentrations were developed with three gradient boosting tree algorithms (GBDT, XGBoost, LightGBM), and their estimation accuracies were compared. The results show that all three models estimate near-surface O₃ concentrations accurately, with XGBoost achieving the highest accuracy. The coefficients of determination (R²) of GBDT, XGBoost, and LightGBM were 0.9489, 0.9547, and 0.9495, respectively, and the root mean square errors (RMSE) were 13.85 μg/m³, 13.26 μg/m³, and 13.76 μg/m³, respectively. The XGBoost estimation model was then optimized using filter methods, correlation analysis, and recursive feature elimination, which reduced feature complexity while maintaining model accuracy. After optimization, the model reached R² = 0.9549, and estimation efficiency increased by about 17%. This paper provides a refined and efficient method for modeling and estimating the spatiotemporal distribution of ozone concentrations on a regional scale.
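The idea behind the compared models, and the two accuracy metrics reported above, can be illustrated with a minimal sketch. This is not the authors' implementation: it is a bare-bones gradient boosting regressor built from one-feature decision stumps under squared-error loss, written in plain Python. The data are synthetic and made up for illustration (a single hypothetical predictor `x` and a hypothetical target `y`), and the helper names `fit_stump`, `fit_gbdt`, `predict`, `r2_score`, and `rmse` are assumptions of this sketch. GBDT, XGBoost, and LightGBM are far more elaborate than this, for example in their regularization and split finding.

```python
import math

def fit_stump(x, residuals):
    """Find the threshold split on x that minimizes squared error on residuals."""
    best = None
    values = sorted(set(x))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2.0
        left = [r for xi, r in zip(x, residuals) if xi <= t]
        right = [r for xi, r in zip(x, residuals) if xi > t]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, t, left_mean, right_mean)
    return best[1], best[2], best[3]

def fit_gbdt(x, y, n_rounds=200, learning_rate=0.1):
    """Boosting loop: each round fits a stump to the current residuals,
    which are the negative gradient of the squared-error loss."""
    base = sum(y) / len(y)            # initial prediction: mean of the target
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        t, left_mean, right_mean = fit_stump(x, residuals)
        stumps.append((t, left_mean, right_mean))
        pred = [pi + learning_rate * (left_mean if xi <= t else right_mean)
                for xi, pi in zip(x, pred)]
    return base, learning_rate, stumps

def predict(model, x):
    """Sum the shrunken stump outputs on top of the base prediction."""
    base, learning_rate, stumps = model
    return [base + learning_rate * sum(lm if xi <= t else rm for t, lm, rm in stumps)
            for xi in x]

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean square error, in the units of the target."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))

# Synthetic, made-up data: one predictor and a smooth nonlinear target.
x = [float(i) for i in range(30)]
y = [40.0 + 3.0 * xi + 5.0 * math.sin(xi) for xi in x]

model = fit_gbdt(x, y)
y_hat = predict(model, x)
print("training R2:", round(r2_score(y, y_hat), 4))
print("training RMSE:", round(rmse(y, y_hat), 4))
```

The scores printed here are training-set values on toy data and are not comparable to the accuracies reported in the abstract, which would normally be measured on held-out samples.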