This study uses air quality and meteorological data for Ganzhou in 2017. The optimal feature subset was extracted with the maximal relevance minimal redundancy (MRMR) algorithm and used as the input to the prediction model. A hybrid kernel (HK) was constructed to improve the traditional support vector machine (SVM) model, yielding the MRMR-HK-SVM model. The experimental results show that the MRMR-HK-SVM model achieves a lower mean absolute error (MAE), mean absolute percentage error (MAPE) and root mean square error (RMSE) than the traditional SVM model: the MAE of the predictions decreased by 26.9%, and the model tracks sudden changes in PM2.5 concentration more accurately. The MRMR-HK-SVM model therefore generalizes better and predicts PM2.5 concentration more accurately.
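To make the hybrid kernel and the three error measures concrete, the following is a minimal sketch rather than the authors' implementation. The abstract does not say which kernels are combined, so the convex combination of a Gaussian (RBF) kernel and a polynomial kernel used here is an assumption, as are all function names, parameter values and the toy numbers in the usage lines. A convex combination of two valid kernels is itself a valid kernel, which is what makes this construction usable in an SVM.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel: a local kernel dominated by nearby samples."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def poly_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel: a global kernel that also responds to distant samples."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree


def hybrid_kernel(x, y, weight=0.7, gamma=0.5, degree=2, coef0=1.0):
    """Assumed hybrid form: weight * RBF + (1 - weight) * polynomial."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    local_part = rbf_kernel(x, y, gamma)
    global_part = poly_kernel(x, y, degree, coef0)
    return weight * local_part + (1.0 - weight) * global_part


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent; y_true must be nonzero."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def relative_reduction(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`."""
    return 100.0 * (baseline - improved) / baseline


# Toy values for illustration only; they are not data from the study.
observed = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 33.0]
print(mae(observed, predicted), mape(observed, predicted), rmse(observed, predicted))
print(hybrid_kernel([1.0, 2.0], [1.5, 1.0]))
```

A statement such as "the MAE decreased by 26.9%" corresponds to `relative_reduction(mae_svm, mae_hybrid)`, where the two arguments are the MAE of the baseline and of the improved model. If scikit-learn is available, its `SVR` accepts a callable kernel that returns the Gram matrix between two sample arrays, so a vectorized version of `hybrid_kernel` could be plugged in there.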
LI Jian-xin, LIU Xiao-sheng, LIU Jing, HUANG Jun. Prediction of PM2.5 concentration based on MRMR-HK-SVM model[J]. CHINA ENVIRONMENTAL SCIENCE, 2019, 39(6): 2304-2310.