Qiang Lai(賴強(qiáng)) and Hong-Hao Zhang(張宏昊)
School of Electrical and Automation Engineering,East China Jiaotong University,Nanchang 330013,China
Keywords: transportation network,key node identification,KSD identification method,network efficiency
With the development of science and technology, there are more and more ways to share information, and the networking of social groups is becoming more and more obvious.Social networks are considered to be the most powerful tool for disseminating useful information to users.The stable operation of various complex network systems is the guarantee of a safe life. The meanings of network elements are also different in different networks. For example,in the circle of friends,the nodes of the network represent users,and the edges of the network represent the communication between users;the network nodes in the transportation network represent stations,and the edges of the network represent the reachability of vehicles. In a complex network, there are some key nodes, which have a huge impact on the structure and function of the network. If the key nodes stop running when the system is under external attack, the entire network will quickly collapse. Accurately identifying and protecting key nodes in the network is of great significance to the stability of the network.For example,it will be more effective for influential people to forward positive energy topics in social networks. Controlling susceptible people in virus transmission networks can prevent large-scale virus infections, and protecting high-traffic sites in the transportation networks can effectively avoid traffic congestion.[1–10]
In recent years,the identification of key nodes in complex networks has become a hot spot in network scientific research,and more and more scientific research results have been proposed. The key node identification methods can be divided into two categories,local identification,and global identification. The method starting from the attributes of the node itself is called the local identification method. Albertet al.[11]proposed a method to identify key nodes in the network based on node degree,which laid the foundation for future research.Chenet al.[12]proposed the cluster-rank indicator,which takes into account the degree of nodes and the clustering coefficient at the same time, the experimental results show that the recognition accuracy of the index is excellent. Zhenget al.[13]proposed a node centrality measurement index based on local topological information, which considers multiple influencing factors. Isaiahet al.[14]proposed a closeness centrality method for power grid voltage stability analysis. The simulation results show that the proposed index can be used as a tool for rapid and real-time evaluation of power system voltage stability. Chenet al.[15]proposed a semi-local centrality index,which expands the coverage of the node field. Ullahet al.[16]proposed a local centrality measurement algorithm.The experiments in different scales of real-world networks have proved its superiority in performance. Zhuet al.[17]found that it is more effective to combine the concentration information of the node itself with the concentration information of the neighboring nodes to identify key nodes. The method of identifying key nodes from the position of the node on the entire network is called global recognition. Kitsaket al.[18]proposed ak-shell decomposition algorithm. In this method,the highestk’s-value index node is called the core node of the network, and the smallestk’s-value index node is called the peripheral node. Yanget al.[19]improved thek-shell decomposition method by considering the location of the node to improve the recognition efficiency. Amritaet al.[20]consider the node degree and neighbor node degree to improve the KSD algorithm,which can significantly obtain the best propagation dynamics from different types of network connection structures. Huanget al.[21]proposed a weightedK-order propagation number algorithm, which only needs to remove a small number of key nodes to achieve complete damage to the network. The identification methods mentioned above have their strengths. For networks with different structures, choosing different key nodes identification methods may get different results, so it is meaningful to choose a suitable identification method.
Based on the purpose of obtaining key node identification methods corresponding to various traffic networks,this paper establishes the topology model of Nanjing Metro,Wuhan Metro, Chengdu Metro, Chengdu Public Transport,Chongqing Public Transport,and Shenzhen Public Transport.The network efficiency and the largest connected subgraph are used as the effect indicators. The node degree identification method, the neighbor node degree identification method, the KSD identification method, the DKS identification method,and the DKSN identification method are used to identify the key nodes of the network.The results show that the KSD identification method has the best recognition effect.
The paper is organized as follows. Section 2 introduces five identification methods. Section 3 establishes the topological structure model of the selected transportation network and analyzes the efficiency of the identification methods,and Section 4 summaries the conclusions.
The complex network degree refers to the number of edges of nodes, and in the transportation network, it refers to the reachability between two sites. This method is to arrange the nodes in descending turn of degree, and the nodes with a large degree are the key nodes. The formula for calculating the degree is as follows:[22]
whereNis the total number of nodes in the network,nodejis a neighbor of nodeiwhenai j=1.
The neighbor recognition node degree identification method is a supplement to the degree identification method.A nodeiclassified as a non-key node may increase its importance level because of its connection with many key nodes.The neighbor node degree refers to the sum of all neighbor nodes’degrees of a node. The method is to arrange the nodes in descending turn of the neighbor node degree,the nodes with a large neighbor degree are the key nodes. The formula for calculating the neighbor node degree is given as follows:[23]
whereN(i)is the neighbor set of a nodeiandkjis the neighbor node’s degree of nodei.
It may not be accurate enough to only use global indicators or local indicators to identify key nodes. Therefore, it is very meaningful to combine these two indicators to identify nodes. The degreek-shell identification method (DKS) is a combination ofk-shell and degree to change itsdksivalue,and the degreek-shell neighborhood identification method(DKSN)is a combination ofk-shell and neighbor node degree to change itsdksnwivalue. In these two methods, there is no weight between thek-shell index and the degree value. The formulas for calculatingdksianddksnwiare as follows:[20]
whereksiandksjarek-shell indices of nodeiandjrespectively.kiandkjare the degree of nodeiandjrespectively.Γiare the nearest neighbors of a nodei,w′ijis the edge’s weight between the nodeiandj.
Different from the DKS and the DKSN identification method,the weightedk-shell degree neighborhood identification method (KSD) combines degree, the neighbor node degree, andk-shell, with adjustable parametersαandμ. By changing the values ofαandμ,the weights of the three indicators can be adjusted to get the best results. The formulas for calculatingksdwiare as follows:[20]
whereksiandksjarek-shell indices of nodeiandjrespectively.kiandkiare the degrees of nodeiandjrespectively.αandμare the positive tunable parameters whose range lies between[0,1].Γiare the nearest neighbors of a nodei,wijis the edge’s weight between the nodeiandj.
In order to find the most suitable value ofαandμ, this article takes multiple samples in the interval of[0,1],figure 1 is a part of the experimental diagrams.
Fig.1. The effect of the KSD identification method under different values of α and μ. (a)α =0.2,μ =0.9;(b)α =1,μ =1;(c)α =0.9,μ =0.2.
In Fig. 1, the effect diagrams of the KSD identification method under some different values ofαandμare given. The index in the figure is the relative size of the largest connected subgraph. It can be seen from the figure that whenα=1,μ=1, and the node attack ratio is 0.055, the recognition effect of KSD is better than the other two identification methods for the first time, and the recognition effect for some time is not stable. As the ratio ofα/μdecreases,the recognition effect of KSD is getting worse. Whenα=0.2 andμ=0.9,the KSD recognition effect does not even appear to be the best,and basically,the neighbor node degree identification method has the best effect. As the ratio ofα/μincreases,the recognition effect of KSD is getting better. Whenα=0.9,μ=0.2,and attack ratio is 0.184 that the recognition effect of KSD is the best. In the following attacks, the recognition effect of KSD has always been better than the other two identification methods. Based on the above analysis,α=0.9,μ=0.2 andα=0.2,μ=0.9 are selected as the control experiments. In either case,the fastest index decline under the KSD identification method can prove that the KSD identification method has the best recognition effect.
Evaluating the efficiency of key node recognition can be obtained by observing the robustness of the network. In this paper, the largest connected subgraph and network efficiency are selected as robustness indicators.
The subgraph that connects all nodes of the network with the fewest edges is called the largest connected subgraph.If the system is attacked, the size of the largest connected subgraph will also change accordingly, which can reflect the changes in the internal structure of the network to a certain extent. The formula for calculating the largest connected subgraph is as follows:[24]
whereN′is the number of remaining nodes in the network,Nis the total number of nodes in the network.
Fig.2. Comparison chart of static attacks and dynamic attacks for(a)dynamic largest connected subgraph, (b)dynamic network efficiency, (c)static largest connected subgraph,(d)static network efficiency.
The average value of the efficiency between pairs of nodes in the network is called the network efficiency. The formula for calculating the network efficiency is as follows:[25]
whereNis the total number of nodes in the network,dijis the shortest path from nodeito nodej.
All the above recognition effect is evaluated by observing the changes of the two indicators after the nodes are deleted.The faster the decline,the better the recognition effect.
The deliberate attacks on the network are divided into two categories, one is static attacks and the other is dynamic attacks.[26]Static attacks are attacks on the network in the originally calculated turn. Dynamic attacks will recalculate the importance of nodes after each round of attacks, and attack the network based on the results of the recalculation. The following is a comparison chart of static attacks and dynamic attacks under the same conditions.
As can be seen from the figures,dynamic attacks are generally much better than static attacks. The best effect under static attack is the degree identification method, When the node failure ratio is 0.05,the network efficiency is 0.07,while in dynamic attacks, the value is 0.06. Therefore, the article chooses to attack the network in a dynamic attack mode, so that more accurate results can be obtained.
The article selects Nanjing Metro,[26–29]Wuhan Metro,Chengdu Metro,Chengdu Public Transport,Chongqing Public Transport,and Shenzhen Public Transport as the objects,and establishes its topology model with the stations as the nodes and the trajectory of the vehicles as the edges. The data used are all from the Internet, and the acquisition time is August 2020. Table 1 shows the topological data of the selected six networks, whereNis the number of nodes,Mis the number of edges,〈K〉is the average degree. Figure 3 gives the established topology models of the corresponding networks.
Table 1. Topological data of six networks.
Fig. 3. Topological structure diagram of transportation networks for (a1) Chengdu Metro, (b1) Chengdu Public Transport, (c1) Nanjing Metro, (a2)Wuhan Metro,(b2)Shenzhen Public Transport,(c2)Chongqing Public Transport.
The article uses dynamic attacks to attack the networks, and the changes in the two indicators obtained are shown in the following figures(Figs.4–9).
It can be concluded from the above figures in the following points.
(i)For Chengdu Metro,in the case ofα=0.2,μ=0.9,the network efficiency of each identification method is similar,but when the node failure ratio is 0.26,the maximum connected subgraph of the neighbor node degree identification method drops the fastest and remains until the network collapses. In the case ofα=0.9,μ=0.2,when the node failure ratio is 0.06,the two indicators of the KSD identification method have the fastest decline and remains until the network collapses. Sort the recognition effects from largest to smallest,and the result is the KSD method,the ND method,the DKSN method,the DKS method,and the degree method.
Fig. 4. Change graph of the largest connected subgraph under different values of α and μ in Chengdu Metro for (a) α =0.9, μ =0.2; (b) α =0.2,μ =0.9.
Fig.5. Change graph of the network efficiency under different values of α and μ in Chengdu Metro for(a)α =0.9,μ =0.2;(b)α =0.2,μ =0.9.
Fig. 6. Change graph of the largest connected subgraph under different values of α and μ in Nanjing Metro for (a) α =0.9, μ =0.2; (b) α =0.2,μ =0.9.
Fig.7. Change graph of the network efficiency under different values of α and μ in Nanjing Metro for(a)α =0.9,μ =0.2;(b)α =0.2,μ =0.9.
(ii)For Nanjing Metro,in the case ofα=0.2,μ=0.9,the two indicators of the neighbor node degree identification method and the KSD identification method have little difference, and when the node failure ratio is 0.08, they are better than the other methods. In the case ofα=0.9,μ=0.2,the effect of the KSD identification method is slightly better than that of the neighbor node degree identification method when the node failure ratio is 0.15, which is the best of these identification methods. Sort the recognition effects from largest to smallest,and the result is the KSD method,the ND method,the DKSN method,the DKS method,and the degree method.
Fig.8. Change graph of the largest connected subgraph under different values of α and μ in Wuhan Metro for(a)α =0.9,μ =0.2;(b)α =0.2,μ =0.9.
Fig.9. Change graph of the network efficiency under different values of α and μ in Wuhan Metro for(a)α =0.9,μ =0.2;(b)α =0.2,μ =0.9.
(iii)For Wuhan Metro,the two indicators of the neighbor node degree identification method decrease the fastest in the case ofα=0.2,μ=0.9 when the node failure ratio is 0.2,but in the case ofα=0.9,μ=0.2,the decline in the two indicators of the KSD identification method is slightly faster than that of the neighbor node degree identification method when the node failure ratio is 0.21 and remains until the network collapses. Sort the recognition effects from largest to smallest, and the result is the KSD method,the ND method,the DKSN method,the DKS method,and the degree method.
Fig.10. Change graph of the largest connected subgraph under different values of α and μ in Chengdu Public Transport for(a)α =0.9, μ =0.2;(b)α =0.2,μ =0.9.
Fig.11. Change graph of the network efficiency under different values of α and μ in Chengdu Public Transport for(a)α =0.9,μ =0.2;(b)α =0.2,μ =0.9.
Fig.12. Change graph of the largest connected subgraph under different values of α and μ in Chongqing Public Transport for(a)α =0.9,μ =0.2;(b)α =0.2,μ =0.9.
Fig.13. Change graph of the network efficiency under different values of α and μ in Chongqing Public Transport for(a)α=0.9,μ=0.2;(b)α=0.2,μ =0.9.
Fig.14. Change graph of the largest connected subgraph under different values of α and μ in Shenzhen Public Transport for(a)α =0.9,μ =0.2;(b)α =0.2,μ =0.9.
Fig.15. Change graph of the network efficiency under different values of α and μ in Shenzhen Public Transport for(a)α =0.9,μ =0.2;(b)α =0.2,μ =0.9.
It can be concluded from Figs.10–15.
(I)For Chengdu Public Transport,the network efficiency of the DKS identification method and the KSD identification method are not much different, when the node failure ratio is 0.075, the maximum connected subgraph of the two identification methods drop the fastest and remains until the network collapses. But in the case ofα=0.9,μ=0.2,the two indicators of the KSD identification method have the fastest decline when the node failure ratio is 0.08.Sort the recognition effects from largest to smallest, and the result is the KSD method,the DKS method,the ND method,the degree method,and the DKSN method.
(II)For Chongqing Public Transport,whether in the case ofα=0.2,μ=0.9 orα=0.9,μ=0.2, the two indicators of the KSD identification method have the fastest decline,and both reach the optimum when the node failure ratio is 0.065.Sort the recognition effects from largest to smallest, and the result is the KSD method, the DKS method, the ND method,the degree method,and the DKSN method.
(III) For Shenzhen Public Transport, in the case ofα=0.2,μ=0.9, the network efficiency of the node degree identification method drops the fastest when the node failure ratio is 0.07, but in the case ofα=0.9,μ=0.2, the decline in the two indicators of the KSD identification method drop the fastest when the node failure ratio is 0.11 and remains until the network collapses. Sort the recognition effects from largest to smallest, and the result is the KSD method, the ND method,the DKS method,the degree method,and the DKSN method.
In general, whether it is for the subway network or the bus network, whenα=0.9,μ=0.2, the recognition effect of the KSD identification method is always better than other methods.
It is of practical significance to study the key nodes identification methods corresponding to various types of transportation networks,the article uses the largest connected subgraph and network efficiency as evaluation indicators,and selects the node degree identification method,the neighbor node degree identification method, the KSD identification method,the DKS identification method, and the DKSN identification method to study the recognition effect of different types of transportation networks.And the paper selects Nanjing Metro,Wuhan Metro, Chengdu Metro, Chengdu Public Transport,Chongqing Public Transport, and Shenzhen Public Transport as the objects.The simulation results show that whether it is in the subway network or the bus network,in the case ofα=0.9,μ=0.2, the recognition efficiency of the KSD identification method that considers both local factors and global factors is the best among the selected recognition methods. Therefore,to improve the recognition effect of key nodes needs to be considered comprehensively. Local factors are as important as global factors. By changing the weights of the factors, better recognition results can be obtained, which is of guiding significance for the planning of the transportation network. In addition, there is a community structure with relatively close node connections.The community structure has a guiding significance for practical problems. In the future,we plan to add a community structure to the identification method to further improve the recognition accuracy.
Acknowledgements
Project supported by the National Natural Science Foundation of China (Grant No. 61961019) and the Youth Key Project of the Natural Science Foundation of Jiangxi Province of China(Grant No.20202ACBL212003).