JI Lina, CHEN Kai, YU Yanwei, SONG Peng, WANG Shuying, WANG Chengrui
Abstract: Real-time urban traffic monitoring has become an important part of modern urban management, and traffic big data collected by video surveillance is increasingly applied to urban management and traffic control. However, the huge volume of citywide surveillance traffic data is still rarely used for urban traffic and urban computing research. Vehicle type mining and application analysis were carried out on the citywide surveillance traffic big data of a provincial capital city. Firstly, three types of vehicles with an important influence on urban traffic were defined: periodic private cars, taxi-like vehicles, and public commuter buses; the corresponding mining methods were proposed by combining the vehicle type definitions with frequent sequential pattern mining algorithms. Experiments on one week of data from Jinan, comprising 120 million vehicle records collected at 1704 video monitoring points, demonstrated the effectiveness of the proposed definitions and mining methods. Secondly, with four residential communities as examples, the residents' traffic modes and the relationships between the modes and the distribution of surrounding Points of Interest (POI) were mined and analyzed. Moreover, the potential applications of urban traffic big data combined with POI in urban planning, demand forecasting, and preference recommendation were explored.
Key words: data mining; traffic big data; vehicle type; traffic mode; Point of Interest (POI)
CLC number: TP274
Document code: A
0 Introduction
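Real-time traffic monitoring is an important task in modern urban management: it helps in understanding the real-time operating status of vehicles, people, and public transport across a city. This is of great value to urban applications such as intelligent transportation systems, public safety, traffic scheduling and control, and urban computing [1]. In recent years, video surveillance has been widely applied to urban traffic management; in particular, during China's rapid urbanization, cities large and small have largely completed the deployment of video traffic monitoring on their arterial roads. Video surveillance is typically deployed at major intersections. As shown in Fig. 1, in each direction entering an intersection, a group of high-definition cameras is mounted on a horizontal bar to monitor the vehicles traveling in each lane entering the intersection. The high-definition cameras, together with a main control unit and virtual or buried induction loops in the road surface, detect and photograph passing vehicles. With the development of artificial intelligence, existing traffic monitoring systems not only monitor and track passing vehicles but can also effectively detect vehicle speed and direction of travel and recognize the license plate number, vehicle type, vehicle color, vehicle brand, and other rich peripheral information. Based on these monitoring data, many traffic violations, such as running red lights and speeding, can be identified automatically without human intervention. Traffic congestion and accidents can also be detected in real time from video surveillance and then used to redirect pedestrians and vehicles so as to prevent traffic conditions from deteriorating further. In addition, traffic volumes on monitored roads are easy to count, and this information is essential to applied research such as congestion prediction, urban planning, traffic control, and even air pollution assessment [2].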
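To make the traffic volume statistic concrete, the following is a minimal sketch of counting vehicle passages per monitoring point and hour. The record layout (plate, monitoring point ID, ISO timestamp), the sample values, and the function name `hourly_volume` are assumptions for illustration only and do not reflect the actual schema of the Jinan data set.

```python
from collections import Counter
from datetime import datetime

# Hypothetical record layout: (plate, monitoring_point_id, ISO timestamp).
# All values below are made up for illustration.
records = [
    ("A12345", "P001", "2016-08-01T07:58:10"),
    ("B67890", "P001", "2016-08-01T07:59:42"),
    ("A12345", "P002", "2016-08-01T08:05:03"),
    ("C24680", "P001", "2016-08-01T08:10:30"),
]


def hourly_volume(records):
    """Count vehicle passages per (monitoring point, hour) bucket."""
    counts = Counter()
    for _plate, point, ts in records:
        # Truncate the timestamp to the start of its hour.
        hour = datetime.fromisoformat(ts).replace(minute=0, second=0, microsecond=0)
        counts[(point, hour)] += 1
    return counts


volume = hourly_volume(records)
```

Each passage is counted once per record, so a vehicle that passes the same point twice within an hour contributes two counts; deduplicating by plate would give the number of distinct vehicles instead.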
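A large body of work on urban traffic big data exists in China and abroad [3-5], and there are several real-world systems for collecting urban vehicle trajectory data. For example, the T-Drive project of Microsoft Research Asia [6-7] collected three months of Global Positioning System (GPS) trajectory data from more than 30,000 taxis in Beijing; Porto, Portugal, collected vehicle trajectory data from 442 taxis over the nine months from August 2011 to April 2012 [8-9]; and New York and Chicago in the United States have published, for each year, the starting location of every passenger trip of all taxis [10]. Recently, ride-hailing companies in China, such as Didi Chuxing, have also studied and analyzed urban traffic data from taxis and ride-hailing vehicles [11]. However, most urban traffic data sets and related studies are based on taxi data, which is only a small part of urban traffic data and a biased sample of citywide traffic conditions that fails to reflect citywide traffic characteristics [12], because taxis tend to avoid congested road sections and peak congestion periods [13].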
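Recently, traffic volume data covering 155 roads was collected in Guiyang. These data were collected with buried induction loops, which can only capture the number of vehicles passing along each road; compared with video surveillance traffic data, this data set is not only smaller in scale but also lacks much of the rich peripheral information. Although reference [14] used license plate recognition data produced by 1040 cameras in Beijing, it used the data only to discover vehicle accompanying patterns in the traffic flow data.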
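In China's urban video traffic monitoring systems, arterial roads and major intersections are largely covered. In Jinan, for example, about 2000 groups of high-definition cameras are deployed at 1014 intersections, covering 2010 roads, and they capture the routes of on the order of a million vehicles every day. Yet such a large monitoring system and its massive citywide vehicle data are rarely used in research on urban traffic and urban computing.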
This paper mines and analyzes one week of citywide video surveillance traffic data collected in Jinan in August 2016; the data comprise more than 100 million vehicle records and more than 4 million vehicles.
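First, the categories of vehicles across the city are studied, and three categories with an important influence on urban traffic are defined: periodic private cars, taxi-like vehicles, and public commuter buses. Based on these definitions, mining methods for the three categories are given, and the mining results are verified and analyzed. From the results, the impact of the three categories on peak-hour urban traffic is analyzed, along with the role of vehicle category mining in improving intelligent transportation systems. Second, combining Points of Interest (POI) and taking residential communities as examples, case studies on the urban traffic big data are used to mine and analyze residents' travel modes and their relationship to the distribution of surrounding POI, and the potential of combining urban traffic big data with POI for urban planning, demand forecasting, and preference recommendation is explored. Finally, the paper is summarized and future work is outlined.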
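The idea of finding routes that a vehicle repeats across days can be sketched as follows. This is a deliberately simplified stand-in that counts contiguous segments of monitoring points, not the frequent sequential pattern mining algorithms the paper builds on, and it is not the authors' method; the function name `recurring_segments`, the segment length, the `min_days` threshold, and the sample routes are all assumptions for illustration.

```python
from collections import defaultdict


def recurring_segments(daily_routes, length=2, min_days=3):
    """Return contiguous monitoring-point segments seen on >= min_days days.

    daily_routes maps a day label to the ordered list of monitoring points
    that one vehicle passed on that day (hypothetical input format).
    """
    days_seen = defaultdict(set)
    for day, route in daily_routes.items():
        # Slide a window of the given length over the day's route.
        for i in range(len(route) - length + 1):
            days_seen[tuple(route[i:i + length])].add(day)
    # Keep only segments that recur on enough distinct days.
    return {seg: len(days) for seg, days in days_seen.items() if len(days) >= min_days}


# Made-up routes of a single vehicle over four days.
routes = {
    "Mon": ["P1", "P2", "P3"],
    "Tue": ["P1", "P2", "P3"],
    "Wed": ["P4", "P1", "P2"],
    "Thu": ["P5", "P6"],
}
result = recurring_segments(routes)
```

Counting distinct days rather than raw occurrences keeps a vehicle that circles the same block many times in one day from looking periodic. A full sequential pattern miner would additionally allow gaps between monitoring points, which this sketch does not.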
4 Conclusion
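This paper has mined and analyzed citywide traffic big data from Jinan. First, three vehicle categories with an important influence on urban traffic (periodic private cars, taxi-like vehicles, and public commuter buses) were defined and then mined, analyzed, and verified on real data; the mining results confirm the effectiveness of the defined models and mining algorithms. Then, taking residential communities as examples, the travel modes of residents in several case communities and their relationship to nearby POI were analyzed. Finally, potential application directions of possible research value that arise from deeply combining video surveillance traffic big data with POI were explored.
Future work will study semantic matching for urban traffic big data in depth, for example exact matching of residential communities, destination POI matching, and matching of related activities. Research is also planned on inferring and predicting citywide traffic conditions (for example, traffic volume and speed) and on predicting the destinations of vehicle routes.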
References
[1] ZHENG Y, CAPRA L, WOLFSON O, et al. Urban computing: concepts, methodologies, and applications[J]. ACM Transactions on Intelligent Systems & Technology, 2014, 5(3):1-55.
[2] SHANG J, ZHENG Y, TONG W, et al. Inferring gas consumption and pollution emission of vehicles throughout a city[C]// Proceedings of the 2014 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2014: 1027-1036.
[3] ZHENG Y. Trajectory data mining: an overview[J]. ACM Transactions on Intelligent Systems and Technology, 2015, 6(3): Article No. 29.
[4] GAO Q, ZHANG F L, WANG R J, et al. Trajectory big data: a review of key technologies in data processing[J]. Journal of Software, 2017, 28(4):959-992. (in Chinese)
[5] MAO J L, JIN C Q, ZHANG Z G, et al. Trajectory big data anomaly detection: research progress and system framework[J]. Journal of Software, 2017, 28(1):17-34. (in Chinese)
[6] YUAN J, ZHENG Y, ZHANG C, et al. T-drive: driving directions based on taxi trajectories[C]// Proceedings of the 2010 ACM SIGSPATIAL Conference on Advances in Geographical Information Systems. New York: ACM, 2010:99-108.
[7] YUAN J, ZHENG Y, XIE X, et al. T-drive: enhancing driving directions with taxi drivers' intelligence[J]. IEEE Transactions on Knowledge & Data Engineering, 2013, 25(1):220-232.
[8] MOREIRA-MATIAS L, GAMA J, FERREIRA M, et al. Predicting taxi-passenger demand using streaming data[J]. IEEE Transactions on Intelligent Transportation Systems, 2013, 14(3):1393-1402.
[9] FERREIRA M, DAMAS L. Time-evolving OD matrix estimation using high-speed GPS data streams[J]. Expert Systems with Applications, 2016, 44(C):275-288.
[10] YAO H, TANG X, WEI H, et al. Modeling spatial-temporal dynamics for traffic prediction[J/OL]. arXiv Preprint, 2018, 2018: arXiv: 1803.01254 [2018-12-03]. https://arxiv.org/abs/1803.01254.
[11] YAO H, WU F, KE J, et al. Deep multi-view spatial-temporal network for taxi demand prediction[J/OL]. arXiv Preprint, 2018, 2018: arXiv: 1802.08714 [2018-12-03]. https://arxiv.org/abs/1802.08714.
[12] ZHAN X, ZHENG Y, YI X, et al. Citywide traffic volume estimation using trajectory data[J]. IEEE Transactions on Knowledge & Data Engineering, 2017, 29(2):272-285.
[13] MENG C, YI X, SU L, et al. Citywide traffic volume inference with loop detector data and taxi trajectories[C]// Proceedings of the 2017 ACM SIGSPATIAL Conference on Advances in Geographical Information Systems. New York: ACM, 2017: 1-10.
[14] ZHU M L, LIU C, WANG X B, et al. Vehicle accompanying pattern discovery method based on license plate recognition flow data[J]. Journal of Software, 2017, 28(6):1498-1515. (in Chinese)
[15] PEI J, HAN J, MORTAZAVI-ASL B, et al. Mining sequential patterns by pattern-growth: the PrefixSpan approach[J]. IEEE Transactions on Knowledge & Data Engineering, 2004, 16(11):1424-1440.
[16] AYRES J. Sequential pattern mining using a bitmap representation[C]// Proceedings of the 2002 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2002:429-435.
[17] ZAKI M J. SPADE: an efficient algorithm for mining frequent sequences[J]. Machine Learning, 2001, 42(1/2):31-60.
[18] WANG J, HAN J, LI C. Frequent closed sequence mining without candidate maintenance[J]. IEEE Transactions on Knowledge & Data Engineering, 2007, 19(8):1042-1056.
[19] GOMARIZ A, CAMPOS M, MARIN R, et al. ClaSP: an efficient algorithm for mining frequent closed sequences[C]// Proceedings of the 2013 Pacific-Asia Conference on Knowledge Discovery and Data Mining. Berlin: Springer, 2013:50-61.
[20] FOURNIER-VIGER P, WU C W, GOMARIZ A, et al. VMSP: efficient vertical mining of maximal sequential patterns[C]// Proceedings of the 2014 Canadian Conference on Artificial Intelligence. Berlin: Springer, 2014: 83-94.
[21] FOURNIER-VIGER P, WU C W, TSENG V S. Mining maximal sequential patterns without candidate maintenance[C]// Proceedings of the 2013 Advanced Data Mining and Applications. Berlin: Springer, 2013:169-180.
[22] Jinan Municipal Government Portal. Jinan to add 500 taxis[Z/OL]. [2018-12-03]. http://www.jinan.gov.cn/art/2014/5/24/art_1862_216217.html. (in Chinese)
[23] Jinan Times. Jinan to add 500 taxis within the year; hearing to be held soon to gather public opinion[N/OL]. [2018-12-03]. http://www.sdnews.com.cn/sd/jinan/201307/t20130725_1292174.htm. (in Chinese)
[24] QIN Z, WANG X F. Registered ride-hailing drivers reach 200,000; more than half have no Jinan household registration[Z/OL]. [2018-12-03]. http://news.e23.cn/jnnews/20161025/2016A2500027.html. (in Chinese)
[25] ZHANG J, ZHENG Y, QI D. Deep spatio-temporal residual networks for citywide crowd flows prediction[J/OL]. arXiv Preprint, 2016, 2016: arXiv: 1610.00081 [2018-12-03]. https://arxiv.org/abs/1610.00081.
[26] XIE M, YIN H, WANG H, et al. Learning graph-based POI embedding for location-based recommendation[C]// Proceedings of the 2016 ACM International on Conference on Information and Knowledge Management. New York: ACM, 2016:15-24.
[27] WANG W, YIN H, CHEN L, et al. ST-SAGE: a spatial-temporal sparse additive generative model for spatial item recommendation[J]. ACM Transactions on Intelligent Systems and Technology, 2017, 8(3): Article No. 48.