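Lightweight threshold implementation of SM4 resistant to differential power analysis
PU Jinwei, GAO Qingjian, ZHENG Xin*, XU Yinghui
(School of Automation, Guangdong University of Technology, Guangzhou 510006) (* Corresponding author, e-mail: xinzheng9209@gmail.com)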
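To address the large area and heavy random-number consumption of existing threshold implementations (TI) of SM4, an improved SM4 threshold implementation is proposed. While satisfying threshold implementation theory, the nonlinear inversion in the S-box is shared without fresh randomness, and a domain-oriented multiplication masking scheme is introduced, reducing the random-number consumption of the S-box to 12 bit. Based on pipelining, a new serial SM4 architecture with an 8 bit data width is designed; it reuses a single threshold S-box and optimizes the SM4 linear function, making the SM4 threshold implementation more compact at only 6 513 GE. Compared with SM4 threshold implementations that use a 128 bit data width, the area of the proposed scheme is reduced by more than 63.7%, with a better trade-off between speed and area. Side-channel evaluation shows that the proposed scheme resists first-order differential power analysis (DPA).
SM4; differential power analysis; threshold implementation; S-box; nonlinear inversion; sharing without fresh randomness; domain-oriented multiplication masking scheme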
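The SM4 algorithm [1], published by China's State Cryptography Administration in March 2012, is the first commercial symmetric encryption algorithm designed entirely in China. It is a block cipher intended mainly for wireless local area networks and trusted computing systems, protects data by encryption in a variety of scenarios, and is now widely used in financial transactions, the Internet of Things, communications, and other fields. Research on the security of SM4 is therefore of great significance to China's cryptography, socio-economic development, and information security.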
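In the late 1990s, the American cryptographer Kocher [2] first proposed side-channel analysis (SCA) and successfully attacked several mainstream encryption algorithms; SCA has since become one of the greatest security threats to implementations of encryption algorithms. Among the many countermeasures developed in cryptography to defeat SCA, the threshold-implementation-based countermeasure proposed by Nikova et al. [3] and Bilgin et al. [4] is arguably the most popular today. Based on secret sharing, it applies multiple random variables to each intermediate value, remains secure even under probing attacks [3], and resists differential power analysis (DPA). Since 2006, threshold implementations have been widely applied to international ciphers such as the Advanced Encryption Standard (AES), PRINCE, and the Data Encryption Standard (DES), effectively reducing their vulnerability to side-channel attacks, but few threshold implementations target SM4. In 2007, Liu et al. [5] gave the algebraic expression of the SM4 S-box, laying the foundation for research on SM4. In 2014, Liang et al. [6] used a normal basis to perform inversion in a composite field and proposed a more compact S-box. In 2018, Li et al. [7-8] constructed secret-sharing functions to replace the affine transformation and proposed two threshold-implementation-based SM4 S-box structures that resist first-order and second-order DPA, but did not carry out a side-channel security analysis in a practical setting. In 2022, Wu et al. [9] designed a more compact threshold SM4 scheme based on a polynomial basis that resists first-order DPA. To date, the literature has focused mainly on threshold implementation designs for the SM4 S-box, yet these designs still suffer from high overall hardware cost and heavy use of random numbers. Moreover, existing threshold SM4 designs all use a 128 bit data width, so the enlarged threshold S-box markedly increases the overall SM4 area; threshold AES implementations overcome this drawback by using a serial structure with an 8 bit data width.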
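Building on the normal-basis S-box structure, this paper optimizes the uniform sharing of composite-field inversion and introduces domain-oriented masking (DOM) to design an SM4 threshold S-box that is more compact and uses fewer random numbers. Based on a serial SM4 with an 8 bit data width and the threshold S-box structure, an optimized implementation of the linear function in the SM4 round function is proposed, and a new pipelined threshold SM4 structure is designed that correctly performs SM4 encryption. Compared with the threshold SM4 designs of [7-9], which use a 128 bit data width, the proposed structure contains only one threshold S-box, which avoids the large increase in overall SM4 area caused by a large threshold S-box and yields a better area-time product (ATP). Simulation results show that the proposed SM4 hardware structure is more compact, requires fewer random numbers, and is secure against first-order DPA.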
Property 1 Correctness. The XOR of all shares equals the value of the unshared variable.
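Property 1 can be illustrated with a minimal Python sketch of Boolean sharing. The function names `share` and `unshare` are hypothetical and chosen for this illustration only; the sketch assumes 8 bit values and uses the standard `secrets` module as a stand-in source of masks.

```python
import secrets


def share(x: int, n_shares: int = 2, bits: int = 8) -> list[int]:
    """Split x into n_shares Boolean shares whose XOR equals x."""
    mask = (1 << bits) - 1
    # The first n_shares - 1 shares are drawn uniformly at random.
    shares = [secrets.randbits(bits) for _ in range(n_shares - 1)]
    # The last share is chosen so that the XOR of all shares equals x.
    last = x & mask
    for s in shares:
        last ^= s
    return shares + [last]


def unshare(shares: list[int]) -> int:
    """Recombine shares by XOR (the correctness property)."""
    acc = 0
    for s in shares:
        acc ^= s
    return acc


# Correctness holds for every 8 bit value, whatever masks were drawn.
all_correct = all(unshare(share(x)) == x for x in range(256))
```

With two shares, each share taken alone is uniformly distributed and therefore carries no information about `x`.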
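In this expression, the matrix term is the affine matrix, the indexed term denotes the corresponding bit of the input, and the vector term is a row vector.
When the S-box is computed in the composite field, the pre-affine transformation and the isomorphic mapping must be applied first. Merging the two operations in the design improves computational efficiency; the resulting linear operation LM is defined as follows:
At the end of the composite-field S-box computation, the inverse isomorphic mapping and the affine transformation are applied; they are merged in the same way, and the resulting linear operation ILM is defined as follows: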
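Figure 1 shows the composite-field decomposition of the S-box used in this paper. The S-box is divided into five parts according to whether each function is linear or nonlinear, and a 2-share input scheme is adopted, which by Property 2 resists first-order DPA. The threshold S-box design is introduced below stage by stage, with the stages separated by pipeline registers. The complete threshold S-box is shown in Figure 2.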
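As a rough illustration of a 2-share nonlinear stage, the following Python sketch models a first-order DOM multiplier in its simplest form, a bitwise AND over GF(2), following the general DOM construction of [14]. It is not the composite-field multiplier used in the threshold S-box; the names `dom_and` and `check_dom_and` are hypothetical, and the pipeline register that would hold the remasked cross-domain terms in hardware is only noted in a comment.

```python
from itertools import product


def dom_and(a: tuple[int, int], b: tuple[int, int], z: int) -> tuple[int, int]:
    """First-order DOM AND on 2-share operands (bitwise, so words work too).

    a and b are (share0, share1); z is one fresh random mask.
    """
    a0, a1 = a
    b0, b1 = b
    # Inner-domain terms (a0 & b0, a1 & b1) use shares of one domain only.
    # Cross-domain terms are remasked with z before being integrated; in
    # hardware the remasked terms would be registered first so that
    # glitches cannot combine shares from both domains.
    q0 = (a0 & b0) ^ ((a0 & b1) ^ z)
    q1 = (a1 & b1) ^ ((a1 & b0) ^ z)
    return q0, q1


def check_dom_and() -> bool:
    """Exhaustively confirm q0 ^ q1 == (a0 ^ a1) & (b0 ^ b1) for single bits."""
    for a0, a1, b0, b1, z in product((0, 1), repeat=5):
        q0, q1 = dom_and((a0, a1), (b0, b1), z)
        if q0 ^ q1 != (a0 ^ a1) & (b0 ^ b1):
            return False
    return True
```

The mask `z` cancels when the two output shares are XORed, so the result is correct for any value of `z`, while each output share taken alone is randomized.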
Figure 2 Threshold implementation structure of the SM4 S-box
where:
Algorithm 1 Checking masking uniformity.
Output: True or False.
end For
end For
Return False
end if
end For
end For
Return True
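One common way to test the uniformity of a 2-share function by exhaustive enumeration can be sketched in Python as follows. This is an illustrative procedure under stated assumptions, not necessarily the exact steps of Algorithm 1: the function `is_uniform` and the two example gates are hypothetical, and inputs are packed as small integers.

```python
from collections import Counter
from itertools import product


def is_uniform(shared_fn, in_bits: int, out_bits: int, rand_bits: int = 0) -> bool:
    """Return True if every valid 2-share output sharing is equally likely.

    shared_fn(x0, x1, r) -> (y0, y1), where x0 ^ x1 is the unshared input
    and r is the fresh randomness (0 when rand_bits == 0).
    """
    for x in range(1 << in_bits):
        counts = Counter()
        for x0, r in product(range(1 << in_bits), range(1 << rand_bits)):
            counts[shared_fn(x0, x ^ x0, r)] += 1
        # With 2 shares, each output value has 2**out_bits valid sharings;
        # all of them must occur, and equally often.
        if len(counts) != (1 << out_bits):
            return False
        if len(set(counts.values())) != 1:
            return False
    return True


def and_no_fresh(x0: int, x1: int, r: int) -> tuple[int, int]:
    """2-share AND of bits a (bit 0) and b (bit 1), with no fresh randomness."""
    a0, b0 = x0 & 1, x0 >> 1
    a1, b1 = x1 & 1, x1 >> 1
    return ((a0 & b0) ^ (a0 & b1), (a1 & b1) ^ (a1 & b0))


def and_remasked(x0: int, x1: int, r: int) -> tuple[int, int]:
    """The same AND with one fresh random bit r added to both output shares."""
    q0, q1 = and_no_fresh(x0, x1, 0)
    return (q0 ^ r, q1 ^ r)


uniform_without_mask = is_uniform(and_no_fresh, in_bits=2, out_bits=1)
uniform_with_mask = is_uniform(and_remasked, in_bits=2, out_bits=1, rand_bits=1)
```

In this toy case the unmasked AND fails the check (when b = 0 the first output share is always 0), while the remasked version passes, which is the kind of distinction a uniformity check is meant to expose.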
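The threshold AES circuits designed in [3-4] all use a serial AES structure with an 8 bit data width: a single threshold S-box module is instantiated, and the plaintext and key are loaded byte by byte in row order, which prevents an oversized threshold S-box from greatly increasing the overall AES area. Combining the methods of [3,15] and taking into account the overall hardware structure of SM4 and of the threshold S-box, this paper designs an SM4 encryption circuit built on the threshold S-box, as shown in Figure 5. Round operations and key scheduling are implemented serially. The overall structure comprises two state registers, one key register, one threshold S-box, and one pseudo-random number generation module. The data path has both 8 bit and 128 bit widths; the 128 bit width is used only to read and write the plaintext shares, the key, and the ciphertext shares before and after encryption.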
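To make the role of the linear function in such a serial design concrete, the following Python sketch models the standard SM4 round linear transform L(B) = B ⊕ (B <<< 2) ⊕ (B <<< 10) ⊕ (B <<< 18) ⊕ (B <<< 24) and a byte-oriented regrouping of it. The regrouping is only an illustration of why L suits a byte-serial data path and is not claimed to be the optimization used in this paper; the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def rotl32(x: int, n: int) -> int:
    """Rotate a 32 bit word left by n bits."""
    n %= 32
    return ((x << n) | (x >> (32 - n))) & MASK32


def sm4_l(b: int) -> int:
    """SM4 round linear transform L as given in the SM4 specification."""
    return b ^ rotl32(b, 2) ^ rotl32(b, 10) ^ rotl32(b, 18) ^ rotl32(b, 24)


def sm4_l_bytewise(b: int) -> int:
    """Same transform, regrouped so only one bit-level rotation is needed.

    (B <<< 10) and (B <<< 18) are byte rotations of R = (B <<< 2),
    and (B <<< 24) is a byte rotation of B itself.
    """
    r = rotl32(b, 2)
    return b ^ r ^ rotl32(r, 8) ^ rotl32(r, 16) ^ rotl32(b, 24)


def sm4_l_shared(shares: tuple[int, int]) -> tuple[int, int]:
    """L is linear over GF(2), so it can be applied to each share separately."""
    return sm4_l(shares[0]), sm4_l(shares[1])
```

Because L is linear, `sm4_l_shared` never combines the two shares, so this part of the round function needs no fresh randomness.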
Figure 5 Serial architecture of the SM4 threshold implementation
Figure 6 SM4 state registers
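To evaluate the security of the SM4 threshold implementation, the design was implemented on an FPGA development board with a SPARTAN-6 XC6SLX9 chip, and a PICO 3206D digital oscilloscope was used to capture the actual power traces during encryption at a sampling frequency of 250 MHz. A pseudo-random number generator (PRNG) supplies the random numbers for the threshold S-box, so 10 000 power traces were captured with the PRNG off and 10 000 with the PRNG on for a security comparison. Figure 8 shows the power traces of the first four rounds of threshold SM4 encryption; because of the PRNG, the power consumption in Figure 8(b) is slightly higher than in Figure 8(a).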
Figure 8 Power traces with the PRNG off and on
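First, correlation power analysis (CPA) [17] was applied to the 10 000 first-round SM4 encryption power traces of the unprotected case and of the protected case. The first key byte of SM4 was targeted, and the attack results are shown in Figure 9. In Figure 9(a), one key guess has a correlation coefficient far larger than all other guesses, indicating that the first key byte of the unprotected SM4 was successfully recovered; with the PRNG off, the stages of the threshold S-box do not satisfy uniformity, which leaks information. In Figure 9(b), the correlation coefficients of all key guesses show no clear difference, indicating that the key of the protected threshold SM4 cannot be obtained by CPA.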
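The principle of the CPA attack can be sketched on simulated data in Python. Everything here is an assumption made for illustration: the leakage is a noisy Hamming-weight model rather than measured traces, `TOY_SBOX` is a seeded random permutation standing in for the SM4 S-box, and the masked case models two ideal shares with no glitches.

```python
import random

rng = random.Random(2023)
TOY_SBOX = list(range(256))
rng.shuffle(TOY_SBOX)  # stand-in permutation, NOT the SM4 S-box


def hw(x: int) -> int:
    """Hamming weight of an integer."""
    return bin(x).count("1")


def pearson(xs, ys) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return 0.0 if sxx == 0 or syy == 0 else sxy / (sxx * syy) ** 0.5


def simulate(key: int, n: int, masked: bool, noise: float = 1.0):
    """Generate n (plaintext byte, leakage sample) pairs for one key byte."""
    pts, leaks = [], []
    for _ in range(n):
        p = rng.randrange(256)
        s = TOY_SBOX[p ^ key]
        if masked:
            m = rng.randrange(256)        # fresh mask for every trace
            leak = hw(s ^ m) + hw(m)      # both shares leak, never s itself
        else:
            leak = hw(s)
        pts.append(p)
        leaks.append(leak + rng.gauss(0.0, noise))
    return pts, leaks


def cpa(pts, leaks):
    """Return (best key guess, |correlation| for each of the 256 guesses)."""
    scores = [abs(pearson([hw(TOY_SBOX[p ^ g]) for p in pts], leaks))
              for g in range(256)]
    return max(range(256), key=scores.__getitem__), scores


TRUE_KEY = 0x3C
guess_plain, scores_plain = cpa(*simulate(TRUE_KEY, 400, masked=False))
guess_masked, scores_masked = cpa(*simulate(TRUE_KEY, 400, masked=True))
```

In the unmasked simulation the correct guess stands out clearly, whereas in the masked simulation the correlation for the correct key is statistically indistinguishable from the others, which mirrors the qualitative difference described for Figure 9(a) and Figure 9(b).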
Figure 9 CPA results with the PRNG off and on
Figure 10 First-order test results with the PRNG off and on
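The proposed threshold-implementation-based SM4 scheme was implemented in Verilog, and its correctness was verified by simulation in Xilinx ISE Design Suite. Logic synthesis was performed with Synopsys Design Compiler and the SMIC 55 nm process library. Table 1 compares the result with several existing masking schemes.
Table 1 Comparison of SM4 threshold implementation designs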
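As Table 1 shows, under the same conditions, the area of the proposed scheme is about 63.7% to 75.7% smaller than that of the threshold SM4 designs of [7-9], which use a 128 bit data width, and the threshold S-box consumes only 12 bit of random masks, far fewer than the S-boxes of [7-8]; however, the number of clock cycles needed to complete a full encryption increases by 52.1% to 82.5%. The ATP metric is also used to compare the threshold SM4 schemes. Assuming a clock frequency of 50 MHz for all designs, the ATP values of the schemes in [7-9] are 89.60, 84.40, and 59.86 respectively, while that of the proposed scheme is 38.00, a better trade-off between area and performance. The areas of the individual modules of the threshold SM4 are given in Table 2. Overall, the proposed threshold SM4 with an 8 bit data width achieves a better area-time trade-off and a more compact area, and its threshold S-box requires fewer random numbers.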
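The ATP comparison can be restated as relative reductions with a few lines of Python. The sketch uses only the four ATP values quoted above; the helper name `relative_reduction` is hypothetical, and the unit scaling behind the ATP values is not stated in the text and is left unspecified here.

```python
def relative_reduction(new: float, old: float) -> float:
    """Fraction by which `new` is smaller than `old`."""
    return 1.0 - new / old


ATP_PROPOSED = 38.00
ATP_REPORTED = {"[7]": 89.60, "[8]": 84.40, "[9]": 59.86}

# Relative ATP reduction of the proposed scheme against each reference.
reductions = {ref: relative_reduction(ATP_PROPOSED, atp)
              for ref, atp in ATP_REPORTED.items()}

for ref, r in reductions.items():
    print(f"ATP reduction versus {ref}: {r:.1%}")
```

Since ATP multiplies area by encryption time, these reductions show that the area saving outweighs the increase in clock cycles for all three reference designs.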
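Table 2 Module areas of the SM4 threshold implementation
SCA techniques continue to evolve, for example higher-order probing attacks and higher-order DPA. Implementations of cryptographic algorithms that withstand higher-order SCA therefore remain a research focus. Because this paper uses a 2-share scheme, it resists only first-order DPA; an SM4 implementation that resists higher-order DPA is the main goal of the next stage of research.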
[1] Office of State Commercial Cipher Administration. Block cipher for WLAN products — SMS4 [EB/OL]. [2022-03-21]. http://www.oscca.gov.cn/UpFile/2006021016423197990.pdf.
[2] KOCHER P C. Timing attacks on implementations of Diffie-Hellman, RSA, DSS, and other systems[C]// Proceedings of the 1996 Annual International Cryptology Conference, LNCS 1109. Cham: Springer, 1996: 104-113.
[3] NIKOVA S, RECHBERGER C, RIJMEN V. Threshold implementations against side-channel attacks and glitches[C]// Proceedings of the 2006 International Conference on Information and Communications Security, LNCS 4307. Cham: Springer, 2006: 529-545.
[4] BILGIN B, GIERLICHS B, NIKOVA S, et al. Trade-offs for threshold implementations illustrated on AES[J]. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2015, 34(7): 1188-1200.
[5] LIU F, JI W, HU L, et al. Analysis of the SMS4 block cipher[C]// Proceedings of the 2007 Australasian Conference on Information Security and Privacy, LNCS 4586. Berlin: Springer, 2007: 158-170.
[6] LIANG H, WU L J, ZHANG X M, et al. Design of a masked S-box for SM4 based on composite field[C]// Proceedings of the 10th International Conference on Computational Intelligence and Security. Piscataway: IEEE, 2014: 387-391.
[7] LI X C, ZHONG W D, ZHANG S W, et al. New S-box of SM4 based on threshold implementation[J]. Computer Engineering and Applications, 2018, 54(17): 83-88. (in Chinese)
[8] LI X C, ZHONG W D, ZHANG S W, et al. A new threshold implementation of the S-box in SM4[J]. Journal of Cryptologic Research, 2018, 5(6): 641-650. (in Chinese)
[9] WU X N, LI J L, PAN S, et al. Threshold masking schema design and implementation on SM4 algorithm[J]. Application Research of Computers, 2022, 39(2): 572-576. (in Chinese)
[10] NIKOVA S, RIJMEN V, SCHLAFFER M. Secure hardware implementation of nonlinear functions in the presence of glitches[J]. Journal of Cryptology, 2011, 24(2): 292-321.
[11] BILGIN B, GIERLICHS B, NIKOVA S, et al. A more efficient AES threshold implementation[C]// Proceedings of the 2014 International Conference on Cryptology in Africa, LNCS 8469. Cham: Springer, 2014: 267-284.
[12] DE CNUDDE T, REPARAZ O, BILGIN B, et al. Masking AES with d+1 shares in hardware[C]// Proceedings of the 2016 International Conference on Cryptographic Hardware and Embedded Systems, LNCS 9813. Berlin: Springer, 2016: 194-212.
[13] WEI M, SUN S W, WEI Z H, et al. A small first-order DPA resistant AES implementation with no fresh randomness [J]. Science China Information Sciences, 2022, 65(6): No.169102.
[14] GROSS H, MANGARD S, KORAK T. Domain-oriented masking: compact masked hardware implementations with arbitrary protection order [C]// Proceedings of the 2016 ACM Workshop on Theory of Implementation Security. New York: ACM, 2016: 3.
[15] SHANG M, ZHANG Q L, LIU Z B, et al. An ultra-compact hardware implementation of SMS4[C]// Proceedings of the 2014 IIAI 3rd International Conference on Advanced Applied Informatics. Piscataway: IEEE, 2014:86-90.
[16] ZHENG Z X, ZI Y C, WU X F, et al. Serialized design of SMS4 and lightweight implement[J]. Journal of Huazhong University of Science and Technology (Nature Science Edition), 2016, 44(2): 61-64. (in Chinese)
[17] BRIER E, CLAVIER C, OLIVIER F. Correlation power analysis with a leakage model [C]// Proceedings of the 6th International Workshop on Cryptographic Hardware and Embedded Systems, LNCS 3156. Berlin: Springer, 2004: 16-29.
[18] ROY D B, BHASIN S, PATRANABIS S, et al. Testing of side-channel leakage of cryptographic intellectual properties: metrics and evaluations[M]// Hardware IP Security and Trust. Cham: Springer, 2017: 99-131.
Lightweight threshold implementation of SM4 resistant to differential power analysis
PU Jinwei, GAO Qingjian, ZHENG Xin*, XU Yinghui
(School of Automation, Guangdong University of Technology, Guangzhou 510006, China)
To address the large area and heavy consumption of fresh randomness in Threshold Implementation (TI) of SM4, an improved threshold implementation scheme of SM4 was proposed. While satisfying threshold implementation theory, the nonlinear inversion of the S-box was shared with no fresh randomness, and a domain-oriented multiplication mask scheme was introduced to reduce the fresh randomness consumption of the S-box to 12 bits. Based on the idea of pipelining, a new SM4 serial architecture with 8-bit data width was designed. The threshold implementation of the S-box was reused, and the linear function of SM4 was optimized to make the threshold implementation of SM4 more compact, at only 6 513 GE. Compared with the TI schemes of SM4 with 128-bit data width, the area of the proposed scheme is reduced by more than 63.7%, with a better trade-off between speed and area. The side-channel experimental results show that the proposed scheme resists first-order Differential Power Analysis (DPA).
SM4; Differential Power Analysis (DPA); Threshold Implementation (TI); S-box; nonlinear inversion; shared with no fresh randomness; domain-oriented multiplication mask scheme
1001-9081(2023)11-3490-07
10.11772/j.issn.1001-9081.2022101579
2022-10-24;
2022-12-29;
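Supported by the Guangdong Basic and Applied Basic Research Foundation (2021A1515110777).
PU Jinwei (born 1998), male, from Chongqing, M. S. candidate; research interest: side-channel protection of cryptographic algorithms. GAO Qingjian (born 1997), male, from Puning, Guangdong, M. S. candidate; research interest: side-channel protection of cryptographic algorithms. ZHENG Xin (born 1993), female, from Xianning, Hubei, Ph. D.; research interests: SoC design, software and hardware co-design, graph neural networks. XU Yinghui (born 1977), male, from Changsha, Hunan, associate professor, Ph. D.; research interests: information security, embedded systems, multimedia signal processing.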
TP309.2
A
2023-01-03.
This work is partially supported by Guangdong Basic and Applied Basic Research Foundation (2021A1515110777).
PU Jinwei, born in 1998, M. S. candidate. His research interests include cryptographic algorithm side-channel protection.
GAO Qingjian, born in 1997, M. S. candidate. His research interests include cryptographic algorithm side-channel protection.
ZHENG Xin, born in 1993, Ph. D. Her research interests include SoC (System on Chip) design, software and hardware co-design, graph neural network.
XU Yinghui, born in 1977, Ph. D., associate professor. His research interests include information security, embedded systems, multimedia signal processing.