Haining Yu,Lailai Yin,*,Hongli Zhang,Dongyang Zhan,2,Jiaxing Qu and Guangyao Zhang
1School of Cyberspace Science,Harbin Institute of Technology,Harbin,150001,China
2The Ohio State University,Columbus,43202,USA
3Heilongjiang Province Cyberspace Research Center,Harbin,150001,China
4National Internet Emergency Response Center,Harbin,150001,China
Abstract: Road networks have been used in a wide range of applications to reduces the cost of transportation and improve the quality of related services.The shortest road distance computation has been considered as one of the most fundamental operations of road networks computation.To alleviate privacy concerns about location privacy leaks during road distance computation, it is desirable to have a secure and efficient road distance computation approach.In this paper, we propose two secure road distance computation approaches,which can compute road distance over encrypted data efficiently.An approximate road distance computation approach is designed by using Partially Homomorphic Encryption and road network set embedding.An exact road distance computation is built by using Somewhat Homomorphic Encryption and road network hypercube embedding.We implement our two road distance computation approaches, and evaluate them on the real cityscale road network.Evaluation results show that our approaches are accurate and efficient.
Keywords: Road network; road distance; homomorphic encryption
Nowadays, road networks have been widely used in many application domains for sciences and engineering, such as closeness testing in spatial social networks, route planning, ride hailing or navigation in road networks, etc.For example, online ride hailing services such as Uber and DiDi employ large road networks with millions or even billions of vertices and edges in their operation.A well-maintained road network computation system plays a significant role.It not only reduces the cost of transportation, both in terms of money and time, but also improves the quality of upper services.
The shortest road distance computation has been considered as one of the most fundamental operations of road networks computation and has a wide range of applications.There are many efficient shortest distance (path) algorithms, such as Dijkstra’s algorithm and Bellman Ford’s Algorithm.In some application scenarios, the shortest road distance must be computed in encryption form to avoid privacy leaks.As an example, an online ride hailing service enables a rider to find the closest driver to offer ride service.To enjoy this service, both riders and drivers have to update their locations to the online ride hailing service provider, while the service provider computes the shortest road distances from the rider to all drivers, and select the closest driver.But the service providers are not always honest, they may track users or infer their profiles for economic advantage.To alleviate this privacy concerns, the riders and the driver submit their encrypted locations, and the service provider can compute the encrypted road distance over received ciphertexts.However, it is not a trivial problem to compute shortest road distance in ciphertext domain.Some schemes have been presented in the literature to compute shortest road distance in a secure manner, which make use of cryptographic primitives to encrypt the road network itself or the corresponding pre-generated index, e.g., Partially Homomorphic Encryption (PHE), Somewhat Homomorphic Encryption (SHE), Yao’s garbled circuits (GCs).Shen et al.[1] proposed a graph encryption scheme based on symmetric-key primitives and SHE,which enables approximate constrained shortest distance queries.Meng et al.[2] presented three schemes based on distance oracle and structured encryption for approximate shortest distance queries.Wang et al.[3] proposed a secure Graph DataBase (SecGDB) encryption scheme based on PHE and Yao’s GCs, which supports exact shortest distance/path queries.Wu et al.[4] proposed an efficient cryptographic protocol for fully-private navigation based on compressing the next-hop routing matrices, symmetric Private Information Retrieval (PIR) and Yao’s GCs.However, existing schemes are not efficient enough to compute large-scale shortest distances in real time.
To tackle the practical limitations of the state-of-the-art, we propose two secure road distance computation approaches, which can compute road distance over encrypted data efficiently.We summarize main contributions as:
? We propose an efficient approximate road distance computation approach over encrypted data, by using PHE and road network set embedding.Our approach only needs several additive homomorphic operations to compute an encrypted approximate road distance.
? We propose an efficient exact road distance computation approach over encrypted data,by using SHE and road network hypercube embedding.Our approach only needs several additive and multiplicative homomorphic operations over packed ciphertexts to compute an encrypted exact road distance.
? We implement our approximate road distance computation approach using Paillier Cryptosystem and exact road distance computation approach using FV scheme.Their performance is evaluated on the real city-scale road network.Evaluation results show that they achieve high accuracy, and keep efficient.
The remainder of this paper is structured as follows.In Section 2, we briefly introduce necessary preliminaries.In Section 3, we propose two road distance computation approaches over encrypted data.In Section 4, we evaluate their performance.Finally, we review the related literature and summarize the paper.
Partially homomorphic encryption (PHE) allows to carry out operations over ciphertexts.Paillier cryptosystem [5] is a popular PHE scheme, which relies on the decisional composite residuosity assumption.We briefly summarize it as follows for better understanding and description of our plan.
FV scheme [6] is a widely used SHE scheme which can support a finite number of both multiplications and additions on data in the cipher domain.Mathematically, FV scheme depends on a hard computation problem named as Ring Learning with Errors (RLWE) problem.SetRt=Zt[x]/(xn+1), and in the ring structureRt,xnwill be converted to –1.The plain text space in FV scheme isRt, and the cipher text is the polynomial array inRq.Givenwbeing a base,?+1=+1 represents the number of terms when the integer in the baseqis decomposed into the basew.The?+1 polynomials is obtained by decomposing the polynomials inRqinto base-wcomponents coefficient-wise.WithaSwe uniformly sampleafrom the finite setS,and [·]qrepresent reduction moduloqinto the interval(-q/2,q/2].The FV scheme is briefly introduced as follows.
In this section, we propose two different methods to compute road distance in ciphertext domain.
3.1.1 Road Network Set Embedding
By using Road Network Set Embedding (RNSE) technique [7], the planar road network can be converted into a high-dimensional space, in which we can convert the complex calculation of the shortest road distance into a simple calculation supported through existing encryption primitives.
We model the road network as a weighted planar graphG=(V,E,W).DefineVas the set of vertices inG(i.e., road junctions) andEas the set of edges inG(i.e., road sections).Let R be a set of subsets ofVand it describes a high-dimensional embedding space:
whereαandβare equal toO(log|V|).SubsetVi,jis composed of 2inodes randomly selected fromV.The shortest road distance between nodev∈Vand subsetVi,jcan be calculated by
Based on the above definition, the coordinate of a nodev, which is a vector with O(log2|V|)dimensions, is defined as the distance from the nodevto each subset:
Then we can use Ω={cv|v∈V}to represent the embedded road network ofG.
For the coordinate of a positionlon the road section(vs,vd)∈E, it can be denoted by
wheredistR(l,Vi,j)=min{distR(l,vs)+distR(vs,Vi,j),distR(l,vd)+distR(vd,Vi,j)}.
Without losing generality, let the embedded road network haveωdimensions, s.t.ω≤log2|V|By calculating the chessboard distance from csto cd, the shortest distance from locationlstoldcan be approximately represented as
wheredistC(·,·)denotes the chessboard distance amid two coordinates.
3.1.2 Encrypted Approximate Road Distance Computation Using Paillier Cryptosystem
Suppose that the road network is represented byG=(V,E,W)and the dimension of the embedded road network isω, we can use the RNE technique described in Section 3.1.1 to calculate the coordinates of each point inG, where the coordinate is a vector ofωdimension.Then, we will obtain the embedded road network denoted by ΩA={cv|v∈V}.Given a pair of points on the road networks(ls,ld), let cs=(cs[1],...,cs[ω])and cd=(cd[1],...,cd[ω])denote the coordinates oflsandldin the embedded road network ΩA.
The coordinates clsand cldcan be encrypted element-by-element using the public keypkPHEgenerated by Paillier cryptosystem, respectively.The encrypted coordinates are represented as
The encrypted approximate road distance betweenlsandldcan be computed as follows:
1) Compute the ciphertext vector [[dist(ls,ld)]] over [[cs]] and [[cd]] based upon homomorphism operations of the Paillier cryptosystem:
Note that EncPHE(2?)is used to make sure that every element in [[dist(ls,ld)]] is positive.
2) Because Paillier cryptosystem has a plaintext space much larger than the upper limit of the road distance, several ciphertexts can be packed into one ciphertext using ciphertext packing technology, which can improve the efficiency.In the Paillier cryptosystem, assuming thatp=N/(?+1)is the number of slots in a single packed ciphertext,pciphertexts can be packed into one ciphertext.The main idea of the ciphertexts packing technique is described as follows.Assuminga1,...,alare integers of?bits (1 ≤l≤p), we use [[a1]],...,[[al]] to express their ciphertexts.The packed ciphertext is constructed by [[[a1]]|···|[[al]]]=2?(l-i).
After performing a decryption operation, we can get the packed plaintext [a1|···|al] =and then recovera1,...,al.Using the above ciphertext packing technique, every encrypted element in [[dist(ls,ld)]] can be packed into a same packed ciphertext:
[[dist(ls,ld)]]=[[[dist1(ls,ld)]]|···|[[distω(ls,ld)]]]=[[disti(ls,ld)]]2?(ω-i), s.t.,p≥ω.
3) The approximate road distance betweenlsandld, i.e.distA(ls,ld), is the maximum element in dist(ls,ld), which is hidden in [[dist(ls,ld)]].Further post-process is required to extract the maximum from dist(ls,ld).In the simplest way, [[dist(ls,ld)]] can be decrypted with the secret keyskPHE, and then unpacked to recover dist(ls,ld).There are, of course, other more complex scenes,where the maximum needs to be selected in an oblivious manner.For these scenes, some wellestablished cryptographic tools can be integrated, such as Yao’s garbled circuit and secret sharing scheme.More details about secure comparison over encrypted integers can be referred to [8–11].
3.2.1 Road Network Hypercube Embedding
Them-dimensional hypercubeHm, is a graph whose node setVconsists of 2m m-dimensional boolean vectors, i.e., vectors with binary coordinates 0 or 1, where two nodes are adjacent whenever they are different in exactly one coordinate.Moreover, the size ofHmism2m-1and its older is 2m.Road Network Hypercube Embedding (RNHE) technique [7] is concerned with finding mappings between a road network and a higher-dimensional hypercube that preserve certain topological properties.Let the weighted graphG=(V,E,W)represent the road network.The vertices set isVand the edge set isE.Every edge(vi,vj)inEis related to a weightW(vi,vj), which denotes the road distance of the edge.For a vertexvi∈V, its coordinate can be expressed by a boolean vector viwithmdimensions, and we use ΩH={vi|vi∈V}represents the embedded road network.We can obtain the shortest road distance between arbitrary two vertices by computing Hamming distance between their coordinates.Note that the location in the road network and its position in corresponding planar graph can typically be converted from one to the other, and hereafter they are used interchangeably.
Related definitions are as follows.
? Theinterior face Findicates a cycle ofGsurrounding a connected domain, and theouter faceofGindicates an unbounded face.
? Aneven(odd) face means a face witheven(odd) edges.LetdiaFbe the diameter ofF, and ifd(vi,=d(vj,=diaFholds, edgese=(vi,vj)and $e′=in faceFareopposite.For each edgee∈F, it will have a unique opposite edge whenFis an even face and have two opposite edges whenFis an odd face.
? A cutLmeans a series of edges {e1,e2,e3,...,ek}satisfying the following three properties:1) Eithere1=ekorekande1are all the edge of a outer face; 2) ?F(ei,ei+1∈F); 3) in faceF, edgeseiandei+1are opposite.If graphGremoves the edges of a cutL, thenGis divided into two subgraphs {G/L}0and {G/L}1.
? Analternating cutrefers to a cut which alternates on the odd faces.In other words, if the cut turns left (resp.right) on an odd face, then it turns right (resp.left) on the following odd face.We can view an alternating cut as a line which passes through the graph and intersects only the selected edges.
The embedded road network ΩHcan be constructed byGas follows.Each alternating cutLcorresponds to two connected components {G/L}0and {G/L}1.The coordinate of every vertex in {G/L}0will be appended with 0 and the coordinate of every vertex in {G/L}1will be appended with 1.We can find all alternating cuts which containeas below.
? Starting withe, we move in both directions, take opposite edge on even face and end when we meet the first odd face in both directions.
? Next, we turn right on one odd face and left on the other (we can obtain more alternating cuts by changing the selection of odd face).
? Proceeding in both directions, we alternate at all odd faces and end up with reaching the outer face.Clearly, the coordinate is anm-dimensional boolean vector, wheremis the total number of alternating cuts.At last,Gcan be embedded into anm-dimensional hypercubeHm.
The computational complexity of hypercube embedding isIn Fig.1, the road network corresponds to the hypercubeH14, and its embedded road network is shown in Tab.1.Note that above hypercube embedding is not affected by different road network topology.
Figure 1:An example of alternating cuts of a road network (a) A simple road network.(b)Alternating cuts (e.g., L0 and L12 )
Table 1:The embedded road network of the road network in Fig.1
Let ΩHbe the embedded road network of the road networkG.The coordinates of two nodesvs,vd∈Vin ΩHare expressed as vs=(vs[0],...,vs[m-1])and vd=(vd[0],...,vd[m-1]).We can calculate the shortest road distance fromvstovdas below.
wheredistE(vs,vd)means the exact shortest road distance,distH(·,·)means the Hamming distance.In Fig.1, the shortest road distance fromvatoveis calculated bydistE(va,ve)=va,ve)=6.
Givenl=(v,Δ)denoting a location in the road networkG,vmeans the nearest node to l and Δ means the shortest road distance between v and l.Let two locations bels=(vs,Δs)andld=(vd,Δd)respectively, and the shortest road distance fromlstoldis computed by
3.2.2 Encrypted Exact Road Distance Computation Using FV Scheme
For locationsls=(vs,Δs)andld=(vd,Δd), the road distance between them can be computed in ciphertext domain by using FV Scheme.In a basic way, we can encrypt the respective coordinate vs=〈vs[0],...,vs[m-1]〉and vd=〈vd[0],...,vd[m-1]〉ofvsandvdbit-by-bit with the public keypkSHEto obtain two ciphertext sequences, denoted as:
Using the homomorphic property of FV Scheme, the encrypteddistE(vs,vd)can be calculated over [[vs]] and [[vd]] by:
Then, the encrypteddistE(ls,ld)can be computed by [[distE(ls,ld)]]=[[distE(vs,vd)]][[Δd]].
When the length ofmis long, the computation overhead of above basic distance computation method is heavy, since it is inefficient to encrypt/decrypt coordinate bit-by-bit.To reduce computation overhead, we propose an optimized approach with ciphertext packing to compute the shortest road distance efficiently.We now describe the details of two packed ciphertext constructions for the coordinates vsand vdas follows.
For the coordinate vs=〈vs[0],...,vs[m-1]〉, letfs(vs)represent the packed plaintext:
where vscan be converted into the polynomial coefficients offs(vs)by packing.The packed ciphertext of vsis calculated by encryptingfs(vs)as follows:
Given the coordinate vd=〈vd[0],...,vd[m-1]〉, let vdrepresent the packed plaintext:
where vdis converted into the polynomial coefficients offd(vd)by packing.The packed ciphertext of vdis calculated by encryptingfd(vd)as follows:
EncrypteddistE(vs,vd)can be computed by two packed ciphertexts denoted byandUsing multiplicative homomorphism, the ciphertext of Hamming weight of vsis calculated by the plaintext polynomial and the packed ciphertext, denoted as:
then we have
wherexn=-1(modxn+1).For plaintext modulustthat is large enough, the constant term in Eq.(19) equals the Hamming weight of vs.Likewise, the ciphertext of Hamming weight of vdis calculated by the plaintext polynomial and the packed ciphertext, denoted as:
then we have
The constant term in Eq.(21) equals the Hamming weight of vd.Using multiplicative homomorphism, the ciphertext of the inner product of coordinates vsand vdis calculated by two packed ciphertexts as follows:
then we have
The inner product of two coordinates vsand vdis equal to the constant term in Eq.(23).
Based on Eq.(9), we can use three packed ciphertexts (18), (20) and (22) to calculate the encrypteddistE(ls,ld)as follows:
Based on Eq.(10) and (24), the ciphertext of the distance between locationls=(vs,Δs)andld=(vd,Δd)as follows:
It needs only three multiplicative homomorphisms operations and four subtractive/additive homomorphisms operations for the calculation of the shortest road distance over two locations.
Our experiments are performed on the real road network of California, which consists of 21048 vertices and 21693 edges (www.cs.utah.edu/~lifeifei/SpatialDataset.htm).Following the assumptions made in [7], we need to delete some trivial edges and insert virtual vertices on edges with fixing the unit distance.For PHE, we use the Paillier cryptosystem library (acsc.cs.utexas.edu/libpaillier).For the modified Paillier cryptosystem, we setNandgto 1024 bits and 160 bits, respectively.For SHE, we use FV scheme built on FV-NFLlib(github.com/CryptoExperts/FV-NFLlib), and the degree of polynomials in FV schemenis set to 2048.All our experiments are conducted and executed on a PC running Ubuntu 18.04 LTS, with an Intel i7 processor at 3.4GHZ and 16GB RAM.
We evaluate the accuracy and the efficiency of our proposed road distance approaches in a k-Nearest Neighbors (kNN) query application [12,13].We first generate some locations on the edges of the road networks in a random fashion.Then, one location is randomly picked as the starting location, from which all road distances to other locations are computed by using our approaches.Finally, the closest location is selected as the nearest neighbor.Fig.2 depicts the accuracy of kNN query by using Euclidean distance, approximate road distance (dimensionω= 8,16,24,32) and exact road distance under different location scales.Euclidean distance is considered as the lower bound of accuracy, which always stays from 85% to 90%.We can see the accuracy of approximate road distance raises steadily as the dimension of the embedded road network increases.It is roughly 95% when the dimension is higher than 24.That is because higher dimension indicates higher accurate approximation.When we vary the location scale from 1000 to 4000, the accuracy of Euclidean distance gradually increases as the location scale increases,because larger location scale means there may exist closer destination locations located around the starting location.But the accuracy of Euclidean distance is still less than 90%.The accuracy of approximate road distance is always high under any driver scale, which is roughly 95% if the dimension is higher than 24.As expected, the accuracy of exact road distance keeps almost 100% under any location scale.Above experimental results demonstrate that both approximate road distance computation approach and exact road distance computation approach can reach a higher accuracy due to the choice of road distance.We use average online computation cost for per kNN query to evaluate the efficiency.As shown in Fig.3, the computation cost of the approximate road distance computation approach raises with the dimension increases.The reason is that higher dimension requires more encryption operations over a location coordinate.Meanwhile, the computation cost of both the approximate road distance computation approach and the exact road distance computation approach increase almost linearly as the location scale increases.That is because more distance computation is required with larger location scale.From above evaluation, we can see that the two approaches achieve high accuracy and efficiency.
Figure 2:The accuracy of Euclidean distance, approximate road distance and exact road distance
Figure 3:The accuracy of Euclidean distance, approximate road distance and exact road distance
Numerous protocols have been proposed for private shortest road distance computation in different applications, such as kNN query and navigation.For (yet related) privacy issues of distance computation, some approaches utilize structural anonymization [14], differential privacy [15–17]or Private Information Retrieval (PIR) [18] to guarantee privacy for the client or the server.However, these approaches suffer from limited privacy, performance or scalability.There is also a vast literature on privacy-preserving shortest distance computation in structured encryption [19]or graph encryption, which focuses on protecting graph data when outsourced to third-party servers or on the cloud [20–22].The most famous class of structured encryption schemes are searchable symmetric encryption (SSE) schemes [23].Generally speaking, SSE schemes usually encrypt indexes or search trees for the purpose of efficiently searching on encrypted data.Another line of work executing graph algorithms over encrypted graphs is to develop data-oblivious algorithms [24] or data structures [25].In these solutions, the graph data is stored in an Oblivious RAM (ORAM) [26] or an oblivious data structure on the server.The client can compute the shortest distances on the server without leaking its access patterns.Also relevant are the works based on SMC, such as Yao’s GCs and ORAM.The generic solution is to construct a GC that contains the entire graph structure for a shortest-path algorithm and apply Yao’s protocol.However, above approaches are often prohibitively expensive and impractical for city-scale road networks [27,28].For instance, the GC-based approach by Carter et al.[29,30] requires several minutes to compute a single shortest path in a road network with just 100 vertices.Another generic approach combining GCs and ORAM requires communication overhead on the order of GB and run-times ranging from tens of minutes to several hours for a single computation on a network with 1024 vertices.Recently, some schemes are proposed to support computation over large-scale encrypted graphs.Shen et al.[1] proposed a graph encryption scheme based on symmetric-key primitives and SHE, which enables approximate constrained shortest distance queries.Meng et al.[2] presented three schemes based on distance oracle and structured encryption for approximate shortest distance queries.Above two schemes provide an estimate on the shortest distance, along with sacrificing accuracy.Wang et al.[3] proposed a secure Graph Data Base (SecGDB) encryption scheme based on PHE and Yao’s GCs, which supports exact shortest distance/path queries.Wu et al.[4] proposed an efficient cryptographic protocol for fully-private navigation based on compressing the next-hop routing matrices, symmetric Private Information Retrieval (PIR) and Yao’s GCs, which requires about 1.5 s and less than 100 KB of bandwidth for each hop in city-scale road network.Compared with existing schemes, our two road distance computation approaches are more efficient to compute large-scale shortest distances in real time.
In this paper, we proposed two secure road distance computation approaches, which can compute road distance over encrypted data efficiently.An approximate road distance computation approach is designed by using Partially Homomorphic Encryption and road network set embedding.An exact road distance computation is built by using Somewhat Homomorphic Encryption and road network hypercube embedding.According to the evaluation over a real city-scale road network, we have verified that our approaches are accurate and efficient.
Funding Statement:This work was partially supported by National Natural Science Foundation of China (Grant Nos.61601146, 61732022), National Key R&D Program of China (Grant No.2016QY05X1000).
Conflicts of Interest:The authors declare that they have no conflicts of interest to report regarding the present study.
Computers Materials&Continua2021年12期