Qiowen JIANG, Yu LIU,b,*, Zirn DING, Shun SUN
aInstitute of Information Fusion, Naval Aviation University, Yantai 264001, China
bDepartment of Electronic Engineering, Tsinghua University, Yantai 100083, China
KEYWORDSBehavior pattern;Hausdorff distance;Information fusion;Spatiotemporal trajectory;Trajectory clustering
AbstractTrajectory data mining is widely used in military and civil applications, such as early warning and surveillance system, intelligent traffic system and so on.Through trajectory similarity measurement and clustering, target behavior patterns can be found from massive spatiotemporal trajectory data.In order to mine frequent behaviors of targets from complex historical trajectory data, a behavior pattern mining algorithm based on spatiotemporal trajectory multidimensional information fusion is proposed in this paper.Firstly, spatial–temporal Hausdorff distance is proposed to measure multidimensional information differences of spatiotemporal trajectories, which can distinguish the behaviors with similar location but different course and velocity.On this basis,by combining the idea of k-nearest neighbor and density peak clustering,a new trajectory clustering algorithm is proposed to mine behavior patterns from trajectory data with uneven density distribution.Finally, we implement the proposed algorithm in simulated and radar measured trajectory data respectively.The experimental results show that the proposed algorithm can mine target behavior patterns from different complex application scenarios more quickly and accurately compared to the existing methods, which has a good application prospect in intelligent monitoring tasks.
With the rapid development of target location technology such as radar,sonar,satellite,and wireless sensor network,a multilayer sensor early warning and surveillance system has been established, covering land, air, marine, and even underwater targets.1–5Many kinds of monitored targets are accurately detected, tracked and correlated, and finally recorded in the form of spatiotemporal trajectory.6–8Massive spatiotemporal trajectories are constantly generated and accumulated in the early warning and surveillance system, which contain a lot of regular information and knowledge of target behavior.In the process of movement, targets often show their characteristics and regular behaviors.For example, in air traffic control, trajectories of military and civil aircraft show different characteristics in specific spatiotemporal range due to their different movement purposes, such as longitude, latitude, height,course, velocity and so on.The target behavior pattern refers to the typical global trajectories of targets that frequently appear in a certain spatiotemporal range according to their action purpose.Through reasonable multidimensional similarity measurement and unsupervised clustering technology, we can fully mine the behavior patterns of targets during task activities from massive historical spatiotemporal trajectory data.These behavior patterns provide decision makers with knowledge of regularity, which contributes to situational awareness,command and dispatch.At present,trajectory data mining technology has become a hot topic in both military and civil fields,9–11which is of great significance for situation evaluation, threat assessment, behavior prediction, abnormal behavior detection, and intelligent command decisions.
In recent years, many researchers have conducted a large number of valuable studies on target behavior pattern analysis methods.Zheng12systematically summarized the existing trajectory data mining techniques, and constructed a behavior pattern mining framework including trajectory data preprocessing, similarity measurement and trajectory clustering.Tao et al.13compared five of the most important and commonly used similarity measures in different application scenarios.Morris and Trivedi14proposed a trajectory clustering framework for live video to analyze behaviors of surveillance subjects.Li et al.15analyzed flight data using clustering techniques to identify and mitigate risks before accidents occur.Rong et al.16proposed an Automatic Identification System(AIS)trajectory data mining method to analyze maritime traffic patterns.Reyes et al.17proposed a GPS trajectory clustering method for decision making on intelligent transportation systems.Izakian et al.18proposed an automated trajectory data clustering method using Dynamic Time Warping (DTW) and Particle Swarm Optimization (PSO).Li et al.19proposed an adaptive constrained DTW algorithm by introducing penalty function, which can adaptively adjust the corresponding relationship between spatiotemporal trajectories.With the development of Deep Learning (DL), some researchers tried to use Convolutional Neural Networks (CNN) to process trajectory information and classify target behavior patterns, which achieved some good results20–22.
Although there have been a lot of studies on the similarity measurement and clustering methods of trajectories,some limitations still need to be resolved, which are shown below.
In the research of trajectory similarity measurement, most widely used methods, such as DTW, Hausdorff Distance23(HD), Longest Common Sub-Sequence24(LCSS) and Fre′chet distance,25mainly consider the location information, ignoring the course and velocity information, which cannot distinguish the behavior patterns with similar location but different course and velocity.Lee et al.26proposed a new Hausdorff distance measurement algorithm integrating multiple horizontal, vertical and angular distance.Ansari et al.27applied similar idea to the spatiotemporal trajectories.But these methods only work well for sub-trajectory clustering, and are not suitable for some application scenarios which should cluster the whole trajectories.In recent years, some trajectory similarity measurements based on multi-feature are proposed.28,29However,this algorithm fuses multidimensional features of different dimensions through weight coefficient, which has disadvantages of difficult parameter setting and weak universality for different application scenarios.
In trajectory clustering applications, current main clustering algorithms include K-means,30K-medoids,31Spectral Clustering32(SC), Density Based Spatial Clustering of Applications with Noise33(DBSCAN) and their improvements.Kmeans, K-medoids and SC algorithms need to set the number of clustering in advance, which usually requires multiple adjustment of parameters for trajectory data without prior knowledge.In addition,they cannot avoid the influence of outlier trajectories.DBSCAN can cluster trajectories based on density and eliminate outliers, which has been widely used in behavior pattern mining methods.However, DBSCAN works better for data with uniform density distribution and less well for data with complex density distribution.Density Peaks Clustering33(DPC) proposed by Rodriguez and Laio is an advanced clustering algorithm.It only needs one parameter and can cluster data of any shape,which is applied to behavior pattern mining.34But the disadvantage is that clustering centers and outliers need human selection.
In order to solve the above problems,this paper carries out some innovative work.The main contributions of this paper can be summarized as follows:
(1) Spatio-Temporal Hausdorff Distance (STHD) is proposed to measure multidimensional information of spatiotemporal trajectory, and meanwhile reduces the complexity of Hausdorff distance.
(2) Center Trajectory Decision algorithm based on Filter Elbow method(FE-CTD)is proposed to determine the trajectory cluster centers and get rid of the outlier trajectories adaptively.
(3) Spatiotemporal Trajectory K-Nearest Neighbor Decision Clustering algorithm based on Multidimensional Information Fusion (MIF-STKNNDC) is proposed, which can cluster trajectories with complex distributions.
(4) The proposed algorithm is implemented in simulated and radar measured trajectory data respectively, which prove that it has better performance in running time,silhouette coefficient and accuracy compared with the existing methods.
The rest of this paper is organized as follows.In Section 2,spatiotemporal trajectory and STHD are introduced.Then,FE-CTD and MIF-STKNNDC are introduced in Section 3.Next,two experiments are implemented to research the performance of the proposed algorithms in Section 4.Finally, conclusions are drawn in Section 5.
The spatiotemporal trajectory data is a sequence of several target intelligence information points in chronological order.The spatiotemporal trajectory data of all targets accumulated in the early warning and surveillance system is expressed as.
where pi;krepresents the kth intelligence information data point in the ith spatiotemporal trajectory, and m is the total number of intelligence information points in the spatiotemporal trajectory tri.Due to the difference in sampling rate of various sensors and the occasional loss of information, the total number of intelligence information points is usually not the same in different spatiotemporal trajectories,even for trajectories with similar behaviour patterns.
pi;kis an intelligence information point with multidimensional information, which contains at least two important information: detection time ti;kand target location li;k.ti;k Hausdorff distance is a measure to describe the similarity between two sets, which is widely used in the measurement of trajectory similarity.Its advantage is that it does not require the same sampling rate and the number of points of the trajectory, and can measure the global features of the entire trajectory.When the trajectory is incomplete, it can still measure the similarity well, and has strong robustness.For the spatiotemporal trajectories triand trj, the meaning of Hausdorff distance is the maximum of the nearest distance between the position information point in triand the position information point in trj, and the directional Hausdorff distance from trito trjis defined as. Fig.1 Schematic diagram of Spatiotemporal trajectory. Fig.2 Schematic diagram of directional Hausdorff distance. The directional Hausdorff distance of trajectory is shown in Fig.2. In general, we use bidirectional Hausdorff distance to define the similarity between trajectories: In fact,the target location information is the result of the accumulation of course and velocity in time during its movement.Therefore, we consider using the time information in spatiotemporal trajectories to measure multidimensional feature differences through the accumulation of distance between trajectories in time.By dividing time window, we propose a multidimensional trajectory similarity measure algorithm: Spatio-Temporal Hausdorff Distance (STHD) (see Algorithm 1),which could solve the problem that Hausdorff distance only considers the location information of trajectories, ignoring the course and velocity information.STHD is constructed by the following steps: Step 1.Determine the length of time slide window τ.τ is determined by parameter ξ, which represents the number of information points that we expect in each time window.Define the length of the time slide window. Step 2.Segment the trajectory according to the length τ of the time window.The spatiotemporal trajectorytriandtrjafter segmentation are expressed as. Fig.3 Add information points at edge of time windows. It can be seen intuitively from Fig.4 that,for spatiotemporal trajectories with similar position but opposite course,STHD reaches a high degree of differentiation in the first or last few time windows.For spatiotemporal trajectories with similar position but different velocity, STHD becomes larger and more discriminative as the time window moves.In this way,the similarity measure of multidimensional feature fusion such as position, course and velocity of spatiotemporal trajectory can be realized by using a distance measurement.STHD is an implicit multidimensional information fusion, which only uses the time and position information of the target, and can still distinguish the trajectory velocity and course characteristics without reliable velocity and course measurement.Due to the introduction of time information,compared with Hausdorff distance, trajectories with similar positions but different course and velocity have good regional differentiation. It is worth mentioning that STHD algorithm complexity is O(nξ ), while Hausdorff and MFHD algorithm complexity is O(nm ),which is better than Hausdorff and MFHD.In particular,when the time window τ is large enough,STHD is equivalent to the traditional Hausdorff distance and the complexity is O(nm ).In this case, the algorithm has the maximum complexity and cannot reflect the difference of trajectory speed and heading information.When the time window τ is small enough, STHD is similar to the Euclidean distance measure and the complexity is O (n ).At this time, the algorithm has the minimum complexity and the measurement effect of speed and heading is the best. Fig.4 Schematic diagram of spatiotemporal Hausdorff distance. Algorithm 1 STHD.Input: 1.Spatiotemporal trajectory triand trj2.Time window parameter ξ Output: Trajectory similarity HSTtri;trj( )1.Δt-←Σn( )2.τ ← ξ ?Δt i=1-Σm-1k=1 ti;k+1-ti;k( )/Σni=1count pi;k■( )/τ 3.N ← Σn-1k=1 ti;k+1-ti;k■■( )/τ,M ← Σm-1k=1 ti;k+1-ti;k■4.for q ←1 to max {N;M} do 5.if q ≤min {N;M} then 6.h tri;q;trj;q( )← max&{( )}'pi;k?tri;qpj;k?trj;qdist li;k;lj;kmin 7.else 8.h tri;q;trj;q( )← max&{( )}'pi;k?tri;min{N;M}pj;k?trj;qdist li;k;lj;kmin 9.end if 10.H tri;q;trj;q( )←max h tri;q;trj;q{( );h tri;q;trj;q( )}11.end for 12.HSTtri;trj( )←max H tri;q;trj;q{( )} Aiming at the shortcomings of the existing trajectory clustering algorithm,combining the idea of k-nearest neighbour distance and decision graph in density peak clustering,we propose Spatiotemporal Trajectory K-Nearest Neighbour Decision Clustering algorithm based on Multidimensional Information Fusion (MIF-STKNNDC), which adopts STHD as the similarity measure.Here are some definitions about MIFSTKNNDC: Definition 1.(K-nearest neighbour trajectory) The k-nearest neighbour trajectory oftriis defined as the trajectory whose spatiotemporal Hausdorff distance is the kth closest totri,recorded as NN(tri;k). Definition 2.(K-nearest spatiotemporal Hausdorff distance)The spatiotemporal Hausdorff distance betweentriand its knearest neighbour trajectory is defined as the k-nearest spatiotemporal Hausdorff distance, recorded as Definition 5.(Decision value) The ratio of δito λiis defined as the decision value, recorded as which is an important parameter for center trajectory decision making. Definition 6.(Center trajectory) Arrange the decision values in descending order, and r trajectories corresponding to γiof top r are defined as center trajectories, recorded asTRc.In practical application scenarios, each center trajectory represents a behavior pattern of the targets. Definition 7.(Abnormal trajectory) Irregular trajectories could affect the mining of major behavior patterns,and these irregular trajectories, called abnormal trajectory, should be eliminated before clustering.Abnormal trajectories are divided into two types:outlier trajectory and suspicious trajectory.If an appropriate threshold Λ is set,a trajectorytri,whose λiis greater than Λ,is defined as an outlier trajectory.After clustering, a trajectory triis assigned to a trajectory cluster, and if there is a trajectory trjin Nk(tri)belonging to another trajectory cluster, trajectory triwill be defined as a suspicious trajectory. As shown in Fig.5, the framework of MIF-STKNNDC mainly consists of three steps: Step 1.(Trajectory similarity measure) Use STHD, which has been introduced in Section 2.1,to measure the multidimensional similarity of all trajectories in the spatiotemporal trajectory datasetTR. Step 2.(Center trajectory decision) Determine r center trajectories which could best reflect the behavior patterns of targets in practical application scenarios according to decision value.In order to avoid the subjectivity of human selection,we propose a Center Trajectory Decision algorithm based on Filter Elbow method (FE-CTD), which will be introduced in Section 3.2. Fig.5 MIF-STKNNDC algorithm framework. Step 3.(Trajectory clustering) Firstly, eliminate the outlier trajectories.Then center trajectories are taken as the clustering center, and the remaining trajectories are matched to the corresponding trajectory clusters.Finally,detect and eliminate the suspicious trajectories from the trajectory clusters.The details will be explained in Section 3.3. In the traditional DPC and its improved algorithm, the selection of clustering center mostly needs human intervention.33This means that the clustering results are subjective and cannot match the target behavior pattern mining technology in intelligent monitoring tasks.This section proposes a Center Trajectory Decision algorithm based on Filter Elbow method (FECTD) (Algorithm 2) to solve this problem. If the decision values of all trajectories are arranged in descending order, we can find that the decision value curve presents a state of rapid decline and then flattening, where the inflection point is the boundary point between the center trajectories and the non-center trajectories.Therefore,we consider determining the center trajectories based on elbow method36. Elbow method, as the name implies,is similar to the elbow of a person and has an inflection point, which is usually applied to the K-means clustering algorithm to obtain the optimal cluster number k through this inflection point.It is based on two curves:one calculated from the Sum of Squared Errors(SSE)and the other,an SSE line,from the start and end of the search range.Then we take k of the maximum value in the difference between the two as the final result.In the application scenario of this section,the two curves are:γ curve in descending order and γ line from the start and end of the search range.Find γ corresponding to the maximum difference between the two curves,and take this as a threshold.The trajectories whose decision value is greater than the threshold are determined as the center trajectories. The result of center trajectory determination depends on the search range.However,the selection of search range is usually based on empirical values, which is difficult to apply to trajectory big data.So,an adaptive method is needed to determine the search range.We notice that the center trajectory has an important feature: Its K-nearest temporal Hausdorff distance is not less than all its K-neighbour trajectories.Therefore, we can filter out the potential center trajectory according to this feature, and use this as the search range for elbow method to determine the center trajectory.The detailed steps of FE-CTD are shown as follows: Step 1.Calculate k-nearest spatiotemporal Hausdorff distance λiand k-neighbour trajectories Nk(tri)of all trajectories in the dataset. Step 2.According to the feature of the center trajectory,define filter factor. If the filter factor of trajectory triis less than or equal to 1,add it to the search range and calculate minimum k-nearest spatiotemporal Hausdorff distance δiand decision value γi. Step 3.Arrange the decision value γiof the filtered trajectories in descending order and define curve as l1, where num is the number of filtered trajectories and numirepresents the sequence number of trajectory triafter arranged.According to the search range, define the other line l2:γ′i=a ?numi+b,where. Step 4.For all filtered trajectories, calculate the difference between l1and l2: Find the decision value γiof the trajectory tricorresponding to the maximum value of Δγi, and set this γias a threshold Y.Finally, determine the trajectories with decision value greater than Y as the center trajectories: Algorithm 2.FE-CTD.Input: 1.Trajectory dataset TR 2.K-nearest neighbor parameter k Output: Center trajectories TRc1.for each tri;trj?TR^i≠j do 2.λ i←HSTtri;NN(tri;k))(3.Nk(tri)← trj?TR■■HSTtri;trj( )≤λi{}4.δi←min {( )}j:λj<λiHSTtri;trj5.γi←δi/λi6.ωi←λi/ min trj?Nk(tri)HSTtrj;NN (trj;k))}{(7.end for 8.num ←count tri?TR■ωi≤1}{num-1,b ←max {γi}-a 10.for each tri?TR^ωi≤1 in γidescending order do 11.Δγi←a ?numi+b-γi12.if Δγiis the maximum then 13.Y ←γi14.end if 15.end for 16.TRc← tri?TR■γi>Y{}9.a ←min {γi}-max {γi} The detailed steps of trajectory clustering are shown as follows: Step 1.Eliminate the outlier trajectories. In practical application scenarios, in addition to normal behavior patterns, trajectory dataset usually has a large number of irregular behaviors.Therefore, in order to analyze frequent behavior patterns of targets more accurately, a large number of outlier trajectories need to be removed.In traditional DPC and its improved algorithms, the removal of outliers usually requires human intervention.To solve this problem, we set a reasonable outlier threshold based on the center trajectory.Set the minimum spatiotemporal Hausdorff distance among all center trajectories as a threshold. If the k-nearest spatiotemporal Hausdorff distance of trajectory triis greater than the threshold Λ, it is judged to be an outlier trajectory and stored in the abnormal trajectory set, and does not participate in clustering. Step 2.Allocate the trajectory clusters according to the center trajectories. Taking the center trajectories as the clustering center,r trajectory clusters will be obtained as. where clusterID represents the number of the trajectory clusters, and clusterIDirepresents the number of the trajectory cluster where triis located. Step 3.Allocate the remaining trajectories. According to Definitions 4 and 5, the trajectory with the minimum k-nearest spatiotemporal Hausdorff distance in TR must be a center trajectory and occupy a trajectory cluster in Step 2.In order to ensure that all remaining trajectories can be allocated, carry out the remaining trajectories in knearest temporal Hausdorff distance ascending order.Select a trajectory triand find the closest trajectory trjfrom the trajectories whose k-nearest temporal Hausdorff distance is smaller than λi.Then, assign trito the trajectory cluster where trjis located. Step 4.Detect and eliminate the suspicious trajectories from the trajectory clusters. After clustering, a trajectory triis assigned to a trajectory cluster CclusterID,but if there is a trajectory trjin Nk(tri)belonging to other trajectory cluster, trajectory triwill be detected and eliminated as a suspicious trajectory. Compared with the traditional trajectory clustering algorithm, the advantages of MIF-STKNNDC are given as follows: (1)FE-CTD is introduced to determine the number of clusters and center trajectories adaptively from mass trajectories,which solves the problem that DPC needs to choose the clustering center manually. (2) Parameter setting is simpler than DBSCAN.MIFSTKNNDC has only one parameter, which is easy to adjust for different application scenarios.However, DBSCAN has two parameters, which is complicated in different application scenarios. (3) MIF-STKNNDC (see Algorithm 3) can cluster the trajectory data with large density distribution differences.In addition, the adaptive outlier threshold can better improve the compactness of the trajectory cluster. Algorithm 3.MIF-STKNNDC.Input: 1.Trajectory dataset TR 2.Time window parameter ξ 3.K-nearest neighbor parameter k Output: Trajectory clusters C1;C2;???;CclusterID;???;Cr{}1.for each tri;trj?TR^i≠j do 2.calculate HSTtri;trj( )by Algorithm 1 3.end for 4.Determine TRcby Algorithm 2 5.Λ ← min {( )}tri;trj?TRcHSTtri;trj6.for clusterID ←1 to r do 7.for each tri?TRcdo 8.clusterIDi←clusterID 9.end for 10.end for 11.for each tri?TR^?TRcin λiascending order do 12.if λi≤Λ then 13.for each trj?TR^λj≤λido 14.if HSTtri;trj( )is the minimum then 15.clusterIDi←clusterIDj16.end if 17.end for(continued on next page) a (continued) The experiment is carried out in a simulated spatiotemporal trajectory dataset, which simulates the behavior of unknown targets in an early warning and surveillance area.We implement the proposed algorithm on the dataset, and the experimental results are analysed and discussed in detail. 4.1.1.Dataset In this section, the experimental dataset is generated based on the trajectory generation program published by Piciarelli et al.37The program can generate cluster trajectories and random outlier trajectories, and can set the number of trajectory clusters, the number of trajectories of each cluster, trajectory distribution parameters, the number of information points of each trajectory,the number of abnormal trajectories and other parameters.On this basis,we add the time,velocity and course information in the information data points.There are 2000 spatiotemporal trajectories in this dataset, each of which contains 12 to 36 information data points.A total of 15 behavior patterns of the target are set in an early warning and surveillance area,that is,there are 15 trajectory clusters.Fig.6 shows the plot of the 2000 trajectories. Fig.6 Plot of 2000 trajectories. The specific information of the dataset is shown below.In order to prove the ability of the proposed algorithm to mine multidimensional feature of trajectories,four trajectory clusters are generated at location 1:each cluster has 150 trajectories,including one cluster of standard behavior,one cluster of twice velocity,one cluster of three times velocity and one cluster of opposite course, and the trajectory distribution parameters are 0.5.In order to prove the ability of the proposed algorithm to mine low-density behaviors from high-density behaviors,firstly,two clusters of trajectories are generated at position 2:one has 300 trajectories whose moving speed is velocity 2, and the other has 100 trajectories whose moving speed is twice of velocity 2.Then, two clusters of trajectories are generated at position 3:one has 200 trajectories and the other has 50 trajectories, and they move in opposite course.In order to prove the mining ability of the proposed algorithm for trajectory clusters with complex distribution, seven trajectory clusters with different number of trajectories and distribution parameters are generated at positions 4–10,whose specific parameters are shown in Table 1.In addition,200 random outlier trajectories are added to increase the complexity of the target behavior. 4.1.2.Experimental design We implement the proposed algorithm on this simulation dataset, and specific experimental projects and design ideas are shown as follows: (1) Set the optimal parameter combination ξ = 2 and k = 5.The process and results of mining target behavior by MIF-STKNNDC are visualized. (2) In order to prove the advantage of STHD in trajectory similarity measure, it is compared with traditional HD, and two recently proposed trajectory multi-feature similarity measurement methods: MFHD28and MFTSM.29Combine them with the classical clustering algorithms.Silhouette coefficient,accuracy and running time are used to evaluate the algorithms. (3)In order to prove the benefit of MIF-STKNNDC in trajectory behavior pattern mining, it is compared with three existing advanced behavior pattern mining algorithms: STTRACLUS,27MTCA28and TC_MFTSM.29Silhouette coefficient and running time are used to evaluate the algorithms. (4) Two important parameters of the algorithm are discussed and the performance of the algorithm under different dataset sizes is studied. 4.1.3.Result and discussion In the above experiment, we set the parameters ξ = 2 and k = 5.Fig.8 shows the distribution of the entire dataset in λ - δ coordinates, where colored points represent the corresponding trajectory clustering and black points represent outlier trajectories.Fig.9 shows the descending ordering of all trajectory decision values and the decision-making process of center trajectory.The fifteen center trajectories obtained by FE-CTD are marked with colored points. For the compared trajectory similarity measure algorithms,we cluster trajectories using four classical clustering algorithms: K-medoids, SC, DBSCAN and DPC.Experiments are repeated with different parameters, and the optimal value of the results of many tests is taken as the final results, as shown in Table 2. Table 1 Simulation dataset. Fig.7 Plot of 15 behavior patterns. Silhouette coefficient is an intrinsic evaluation method,which can reflect the compactness of the cluster.The larger the contour coefficient is, the more similar the target behavior is in the same cluster,and the less similar the target behavior is in different clusters.Accuracy is an external evaluation method, which refers to the proportion of the trajectory consistent with the trajectory cluster determined by the benchmark in the simulation dataset to all the trajectory in the clustering result cluster in the same cluster. It can be seen from Table 2 that, compared with HD and MFTSM, STHD has the best overall performance in different trajectory clustering algorithms.From the perspective of trajectory clustering algorithms, compared with K-Medoids,SC, DBSCAN and DPC, trajectory clusters obtained by MIF-STKNNDC are more compact.From the perspective of algorithm accuracy, the combined performance of STHD and MIF-STKNNDC proposed in this paper is the best.MIF-STKNNDC is more suitable for complex datasets with large trajectory density changes than DBSCAN.The accuracy of K-Medoids algorithm is not high because the initial cluster center is random and the outlier trajectories cannot be detected.Similarly, SC has low overall performance because it needs to set the number of clusters in advance and is greatly affected by outliers.DPC requires manual operation in the selection of cluster center,which introduces subjectivity.Hausdorff can only measure the location information between trajectories, but not the speed and heading information, so the accuracy of the overall mining algorithm is not high.From the perspective of algorithm running time, except that kmedoids clustering iteration consumes a lot of time,other time is mainly spent on trajectory similarity measurement.STHD is lower than MFHD and Hausdorff in algorithm complexity,avoiding unnecessary information point distance operation and greatly reducing the running time of the whole algorithm. Fig.8 Distribution of trajectories in λ - δ coordinates. Table 3 shows the performance of MIF-STKNNDC and three existing advanced behavior pattern mining algorithms:ST-TRACLUS, MTCA and TC_MFTSM on this trajectory dataset.It can be seen from the experimental results that the overall effect of MIF-STKNNDC is optimal.Among the three algorithms compared, the essence of MTCA is an improved DBSCAN algorithm.Although it can cluster the multidimensional features of trajectories,the effect of DBSCAN is slightly lower than MIF-STKNNDC because DBSCAN cannot cluster the data with uneven density very well.The essence of TC_MFTSM is an improved K-means algorithm,and its clustering results depend on the preset number of track clusters.Although its clustering result is good, it is based on several attempts of clustering parameters.ST-TRACLUS has the function of mining multidimensional features of trajectories,but it is mainly applied to the clustering of trajectory segments,so it does not perform well in the scene of this paper. Fig.9 FE-CTD determines 15 center trajectories. MIF-STKNNDC has two parameters ξ and k.We scaled up and down the dataset and studied the performance of different parameters in the dataset of different sizes. Fig.10 shows the running time of the trajectory similarity measure when STHD parameter ξ = 2–6.Compared with STHD and Hausdorff, the results show that STHD is much less than STHD and Hausdorff in computing overhead, and the calculation efficiency is the highest when ξ = 2. Fig.11 shows the accuracy of trajectory clustering when MIF-STKNNDC parameter k = 3–8.As can be seen from the figure, parameter k and dataset size have a certain impact on algorithm performance.When the size of the dataset is small, the smaller parameter k has better performance.With the increase of the data size, the performance of the larger parameter k becomes better, while the performance of the smaller parameter k decreases.The results show that when the data size is less than 4000,k=5 has a good performance,and when the data size is larger, a larger k may be required. Fig.10 Running time of different similarity measure in data of different sizes. The experiment is carried out in a radar-measured spatiotemporal trajectory dataset, which records the flight behaviors of flights in an airfield.We implemented MIF-STKNNDC on this dataset, and proved the practicality of the algorithm. 4.2.1.Dataset This dataset contains the records of all the flights in the Northern California TRACON, which is provided by the aircraft noise abatement office of San Francisco International Airport(SFO) available online at https://c3.nasa.gov/dashlink/resources/132/.Each record contains some information about the flight and a sequence of 3D position and estimated speed.We selected some flight data for a period of time in January 2006.After data cleaning, some invalid trajectories are removed, and there are 2991 usable flight trajectories in total,which are shown in Fig.12. Table 2 Comparison of similarity measure algorithms’performance in different clustering algorithms. Table 3 Comparison of performance among behavior pattern mining algorithms. Fig.11 Accuracy of different k in data of different sizes. 4.2.2.Result and discussion We implement MIF-STKNNDC on this dataset and seventeen trajectory clusters are obtained as{C1;C2;???;C17}.In order to clearly represent the behavior patterns of the flights, we select the center trajectories as the representative to display in Fig.13, where the arrows in the figure represent the direction of movement.It is very difficult for us to discover the target behavior patterns in Fig.12.However,seventeen behavior patterns are intuitively shown in Fig.13. Fig.12 Plot of 2991 trajectories in 2D and 3D coordinate system. Fig.13 Plot of 17 behavior patterns in 2D and 3D coordinate system. Fig.14 Distribution of trajectories in λ - δ coordinates. Fig.15 FE-CTD determines 17 center trajectories. In the above experiment, we set the parameters ξ = 2 and k = 10.Fig.14 shows the distribution of the entire trajectory dataset in λ - δ coordinates,where the colored points represent the trajectory clustered and black points represent outlier trajectories and suspicious trajectories.Fig.15 shows the descending ordering of all trajectory decision values and the decisionmaking process of the seventeen center trajectories which are marked with colored points. In order to evaluate the practicality of MIF-STKNNDC,we match the mined flight behaviors with the airport records that we queried.It is found that these regular behaviors are closely related to the main flight courses of SFO, Oakland International Airport (OAK), San Jose International Airport(SJC) and other airports.These relationships mean that MIF-STKNNDC has a good performance in mining behavior patterns of unknown targets in early warning and surveillance scenarios.The results of experiment indicate that MIFSTKNNDC has a good application value in civil aviation surveillance system, which is helpful to improve the intelligent level of civil aviation surveillance system. (1) A target behavior pattern mining method based on spatiotemporal trajectory multidimensional information fusion is proposed in this work. (2) STHD can measure multidimensional feature differences of spatiotemporal trajectory using only time and location information, and the algorithm complexity is better than that of MFHD and HD. (3) MIF-STKNNDC can mine target behavior patterns from the trajectory data with complex distribution, and is superior to traditional trajectory clustering algorithms in accuracy, silhouette coefficient and running time. (4)In the next work,experiments on larger scale spatiotemporal trajectory datasets such as AIS and ADS-B systems will be implemented.Based on this work,online methods of target behavior patterns mining will be studied in the future. Declaration of Competing Interest The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. Acknowledgements This study was co-supported by the National Key R&D Program of China (No.2021YFA0715202), the National Natural Science Foundation of China (Nos.62022092, 61790550 and 62171453) and the Outstanding Youth Innovation Team Program of University in Shandong Province, China (No.2021KJ005).2.2.Hausdorff distance
2.3.Spatiotemporal Hausdorff distance
3.Behavior pattern mining
3.1.Definitions and framework
3.2.Center trajectory decision
3.3.Trajectory clustering
4.Experiments
4.1.Simulation scene experiments and analysis
4.2.Actual scene experiment verification
5.Conclusions
CHINESE JOURNAL OF AERONAUTICS2023年4期