Abstract: Aviation data are growing rapidly in both volume and dimensionality, and traditional single-machine models often lack the computing resources to process them. To address this problem, this paper proposes a parallel flight delay prediction model based on Spark that incorporates meteorological data. Spark DataFrames are used to fuse flight data with meteorological data, so that weather observations from several different hours are attached to each flight record. The random forest is then parallelized by partitioning the features and building the trees in parallel, so that flight delays can be predicted quickly. Experimental results show that both recall and accuracy improve after the meteorological data are integrated. Across the delay thresholds tested, prediction accuracy is higher for larger thresholds. The parallel model also converges faster than the single-machine model and achieves a better speedup ratio.
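The fusion step described in the abstract can be illustrated with a minimal sketch. This is not the paper's Spark implementation: it uses plain Python dictionaries in place of DataFrames, and every field name (`origin`, `sched_dep`, `wind`, `vis`), the lag set `(0, 1, 2)`, and the sample values are assumptions chosen only for illustration.

```python
from datetime import datetime, timedelta

def fuse_weather(flights, weather, lags=(0, 1, 2)):
    """Attach weather observed `lag` hours before departure to each flight.

    Hypothetical stand-in for a DataFrame join keyed on (airport, hour).
    """
    # Index weather by (airport, hour) so each lookup is a dictionary hit.
    index = {(w["airport"], w["hour"]): w for w in weather}
    fused = []
    for f in flights:
        row = dict(f)
        # Truncate the scheduled departure time to the start of its hour.
        dep_hour = f["sched_dep"].replace(minute=0, second=0, microsecond=0)
        for lag in lags:
            obs = index.get((f["origin"], dep_hour - timedelta(hours=lag)))
            # Missing observations are kept as None rather than guessed.
            row[f"wind_lag{lag}"] = obs["wind"] if obs else None
            row[f"vis_lag{lag}"] = obs["vis"] if obs else None
        fused.append(row)
    return fused

# Made-up sample records, for illustration only.
flights = [{"flight": "X1", "origin": "AAA",
            "sched_dep": datetime(2018, 1, 1, 9, 40)}]
weather = [
    {"airport": "AAA", "hour": datetime(2018, 1, 1, 9), "wind": 5.0, "vis": 10.0},
    {"airport": "AAA", "hour": datetime(2018, 1, 1, 8), "wind": 7.0, "vis": 8.0},
]
fused = fuse_weather(flights, weather)
```

Each fused record carries one group of weather columns per lag, which is the shape a downstream classifier such as a random forest would consume. In a Spark setting, the same idea would presumably be expressed as joins between the flight and weather DataFrames, one per lag.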
WU Ren-biao, LI Jia-yi, QU Jing-yi. Parallel Flight Delay Prediction Model Based on Fusion of Meteorological Data[J]. Journal of Signal Processing, 2018, 34(5): 505-512.