Prediction performance of risk management and control mode in regional sites based on decision tree
ZHU Wen-hui, WANG Xia-hui, YANG Xin-tong, HE Jun, LU Ran, ZHANG Zheng
Soil Environmental Protection Center, Research Center of Heavy Metal Pollution Prevention and Control, Chinese Academy for Environmental Planning, Beijing 100012, China
Abstract: Traditional screening of risk management and control modes for regional sites suffers from low efficiency, high cost and a lack of systematic method. To overcome these defects, three decision tree (DT) algorithms, namely Chi-squared Automatic Interaction Detector (CHAID), Exhaustive CHAID (E-CHAID), and Classification And Regression Tree (CART), were implemented to predict the risk management and control mode of regional sites. A characteristic data set was constructed from the existing risk management and control modes of regional sites and the relevant regional attributes. The results showed that DT-based models could be used to predict the risk management and control mode of regional sites. In terms of accuracy (ACC), precision (PRE), recall (REC) and F1 value, CART-DT outperformed both CHAID-DT and E-CHAID-DT. The overall optimization algorithm of CART might therefore be more suitable for predicting the risk management and control mode of regional sites. Regional protection goal (RPG), regional pollution type (RPT) and regional average production period (RAPP) were the three dominant factors in the CART-DT output. Eleven input variables, such as regional average annual wind speed (RAAWS), regional topography (RT) and regional land value-added potential (RLVP), were relatively important to the CART-DT output. Regional population density (RPD), regional dominant industry risk (RDIS) and four other input variables had a relatively low impact on the CART-DT output.
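The four evaluation measures named in the abstract (ACC, PRE, REC and F1) can be illustrated with a minimal Python sketch. The abstract does not state how the per-class scores were averaged, so macro-averaging is an assumption here; the mode labels "A", "B" and "C" and the function name `classification_metrics` are hypothetical and are not taken from the paper's data set.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall and F1.

    Macro-averaging is an assumption; the paper may average differently.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    def ratio(num, den):
        # Define the score as 0.0 when the denominator is empty.
        return num / den if den else 0.0

    pre = [ratio(tp[c], tp[c] + fp[c]) for c in labels]
    rec = [ratio(tp[c], tp[c] + fn[c]) for c in labels]
    f1 = [ratio(2 * p * r, p + r) for p, r in zip(pre, rec)]

    n = len(labels)
    return {
        "ACC": sum(tp.values()) / len(y_true),
        "PRE": sum(pre) / n,
        "REC": sum(rec) / n,
        "F1": sum(f1) / n,
    }


# Hypothetical labels for six regions (not data from the study).
y_true = ["A", "A", "B", "B", "C", "C"]
y_pred = ["A", "B", "B", "B", "C", "A"]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Comparing the three trees on such a dictionary of scores, computed on the same held-out samples, is one way the ranking reported in the abstract could be reproduced.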
ZHU Wen-hui, WANG Xia-hui, YANG Xin-tong, HE Jun, LU Ran, ZHANG Zheng. Prediction performance of risk management and control mode in regional sites based on decision tree[J]. China Environmental Science, 2021, 41(12): 5771-5778.