Estimation of the column concentration of carbon dioxide using spaceborne shortwave infrared spectrometer
LI Jing-bo¹, ZHANG Ying², GAI Rong-li¹
1. School of Information Engineering, Dalian University, Dalian 116622, China; 2. State Key Laboratory of Remote Sensing Science, Aerospace Information Innovation Research Institute, Chinese Academy of Sciences, Beijing 100101, China
Abstract: Using OCO-2 satellite observations, Total Carbon Column Observing Network (TCCON) observations, Normalized Difference Vegetation Index (NDVI) records, and atmospheric parameters from ERA5, we applied several machine learning models, including a decision tree and ensemble methods (XGBoost, random forest, extremely randomized trees, and gradient boosting), to predict the CO2 column concentration (XCO2). The models were built after correlation analysis, feature selection, and feature extraction, and their predictions were compared with XCO2 observations from TCCON. Among the models, the extremely randomized trees regression model (ExtraTreesRegressor) performed best, with R², root mean square error (RMSE), mean absolute error (MAE), and mean relative error (MRE) of 0.953, 0.492×10⁻⁶, 0.260×10⁻⁶, and 0.063%, respectively. The prediction performance of ExtraTreesRegressor was then analysed by varying the input parameters. Within the acceptable error range (±2×10⁻⁶), the extremely randomized trees and gradient boosting regression models have the same prediction accuracy, 98.10%. Because the CO2 column concentration varies over a small range, a narrower error range is needed to separate the two models. Within ±1×10⁻⁶, their prediction accuracies are 91.82% and 90.51%, respectively. The extremely randomized trees algorithm therefore predicts the CO2 column concentration more accurately and meets the accuracy requirements for CO2 prediction.
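The evaluation quantities named in the abstract (R², RMSE, MAE, MRE, and the share of predictions falling within a fixed error range) can be illustrated with a minimal sketch. It is not the authors' code: the function names `regression_metrics` and `fraction_within` are hypothetical, the four sample values are made up for illustration, and it assumes that concentrations are expressed in units of 10⁻⁶ (ppm), that MRE is the mean of |error| / observation, and that "prediction accuracy" means the fraction of samples whose absolute error does not exceed the tolerance.

```python
import math


def regression_metrics(observed, predicted):
    """Return R2, RMSE, MAE and MRE (%) for paired XCO2 values (units of 1e-6)."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mean_obs = sum(observed) / n
    ss_res = sum(e * e for e in errors)                   # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)   # total sum of squares
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        # Assumed definition: mean of |error| / observation, in percent.
        "mre_percent": 100.0 * sum(abs(e) / o for e, o in zip(errors, observed)) / n,
    }


def fraction_within(observed, predicted, tolerance):
    """Share of samples whose absolute error is at most `tolerance`."""
    hits = sum(1 for o, p in zip(observed, predicted) if abs(p - o) <= tolerance)
    return hits / len(observed)


# Made-up illustrative values, not data from the paper.
tccon_xco2 = [410.0, 412.0, 414.0, 416.0]   # reference observations
model_xco2 = [410.5, 410.5, 414.0, 418.5]   # hypothetical model predictions

metrics = regression_metrics(tccon_xco2, model_xco2)
within_2 = fraction_within(tccon_xco2, model_xco2, 2.0)   # the ±2×10⁻⁶ range
within_1 = fraction_within(tccon_xco2, model_xco2, 1.0)   # the ±1×10⁻⁶ range
print(metrics, within_2, within_1)
```

Tightening the tolerance from 2 to 1 lowers the within-range share in this toy example, which mirrors why the abstract uses the narrower range to separate two models that tie at the wider one.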
LI Jing-bo, ZHANG Ying, GAI Rong-li. Estimation of the column concentration of carbon dioxide using spaceborne shortwave infrared spectrometer[J]. China Environmental Science, 2023, 43(4): 1499-1509.