Li Lingfang (李靈芳), Huang Wenpei (黃文培), Hu Weijian (胡偉健)
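Abstract: Differential privacy, proposed by Dwork, is a new privacy-preservation model based on data distortion. It overcomes two shortcomings of traditional privacy protection: the need to assume what background knowledge an attacker holds, and the inability to quantify the level of protection. For these reasons it has quickly become a research hotspot in the privacy field in recent years. PINQ is the earliest interactive prototype system to implement differential privacy. This paper introduces the theoretical foundations of differential privacy and analyzes the implementation mechanism of the PINQ framework. Taking the differentially private K-means clustering implementation in PINQ as an example, it studies how differential privacy applies to clustering. Simulation experiments show that different privacy budgets yield different levels of privacy protection.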
Keywords: K-means; data distortion; differential privacy; PINQ
DOI: 10.11907/rjdk.161175
CLC number: TP309; Document code: A; Article ID: 1672-7800(2016)006-0204-05
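To make the abstract's closing claim concrete, the following Python sketch shows how the privacy budget ε controls the noise added by the Laplace mechanism, and how that noise enters one K-means update. It is only an illustration under stated assumptions: it does not use PINQ's own API, the function names (`laplace_noise`, `noisy_count`, `private_centroid_update`) are hypothetical, the data are assumed to be one-dimensional points in [0, 1], and the even split of ε between the count and the sum is one possible choice rather than the paper's.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count, epsilon, rng, sensitivity=1.0):
    """Laplace mechanism: perturb a count with noise of scale sensitivity / epsilon."""
    return true_count + laplace_noise(sensitivity / epsilon, rng)


def mean_abs_error(true_count, epsilon, trials=2000, seed=0):
    """Average absolute error of the noisy count over repeated draws."""
    rng = random.Random(seed)
    errors = [abs(noisy_count(true_count, epsilon, rng) - true_count)
              for _ in range(trials)]
    return sum(errors) / trials


def private_centroid_update(points, centroids, epsilon, rng):
    """One K-means iteration on 1-D points in [0, 1] using noisy sums and counts.

    Assumption: epsilon is split evenly between the count query and the sum
    query; both have sensitivity 1 because each point lies in [0, 1] and
    belongs to exactly one cluster.
    """
    half = epsilon / 2.0
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
        clusters[nearest].append(p)
    updated = []
    for old, members in zip(centroids, clusters):
        count = len(members) + laplace_noise(1.0 / half, rng)
        total = sum(members) + laplace_noise(1.0 / half, rng)
        # Keep the old centroid when the noisy count is too small to divide by.
        new = total / count if count >= 1.0 else old
        # Clamp to the assumed data range.
        updated.append(min(1.0, max(0.0, new)))
    return updated


if __name__ == "__main__":
    for eps in (0.1, 1.0, 10.0):
        print("epsilon =", eps, "mean |error| =", round(mean_abs_error(100, eps), 3))
    rng = random.Random(1)
    print(private_centroid_update([0.1, 0.2, 0.8, 0.9], [0.0, 1.0], 1.0, rng))
```

Since the expected absolute value of Laplace noise equals its scale, the average error of the noisy count should be close to 1/ε. A smaller budget therefore gives stronger protection at the cost of less accurate centroids.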