Abstract
Machine learning models trained on large volumes of proprietary data with intensive computational resources are valuable assets of their owners, who merchandise these models to third-party users through prediction service APIs. However, existing literature shows that model parameters are vulnerable to extraction attacks, which accumulate a large number of prediction queries and their responses to train a replica model. As countermeasures, researchers have proposed reducing the rich API output, for example by hiding the precise confidence level of the prediction response. Nonetheless, even when the response is only one bit, an adversary can still exploit fine-tuned queries with a differential property to infer the decision boundary of the underlying model. In this paper, we propose boundary differential privacy (\(\epsilon \)-BDP) as a solution that protects against such attacks by obfuscating the prediction responses near the decision boundary. \(\epsilon \)-BDP guarantees that an adversary cannot learn the decision boundary to a predefined precision, no matter how many queries are issued to the prediction API. We design and prove a perturbation algorithm, called boundary randomized response, that achieves \(\epsilon \)-BDP. The effectiveness and high utility of our solution against model extraction attacks are verified by extensive experiments on both linear and non-linear models.
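The idea of perturbing only near-boundary responses can be illustrated with a minimal sketch. The flip probability below uses the classic randomized-response rule (keep the true label with probability \(e^{\epsilon}/(1+e^{\epsilon})\)), which is an illustrative assumption, not necessarily the paper's exact mechanism; the function and parameter names are likewise hypothetical.

```python
import math
import random

def boundary_randomized_response(label, dist_to_boundary, delta, epsilon, rng=random):
    """Return a possibly perturbed binary label.

    Queries farther than `delta` from the decision boundary are answered
    truthfully; queries inside the boundary zone keep their true label
    only with probability e^eps / (1 + e^eps), the standard
    randomized-response rule for a single bit.
    """
    if dist_to_boundary > delta:
        return label  # far from the boundary: no perturbation needed
    keep_prob = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return label if rng.random() < keep_prob else 1 - label
```

Because only queries within `delta` of the boundary are ever flipped, utility on clearly classified inputs is preserved while repeated probing of the boundary region yields noisy answers.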
Notes
- 1. In general, this notation can be any distance metric (e.g., Manhattan distance, Euclidean distance). The implications of the distance metric for the detailed algorithms are discussed in Sect. 4.1.
- 2. The white-box assumption is based on the fact that state-of-the-art models in specific application domains, such as image classification, are usually public knowledge. Nonetheless, our solution also works against black-box attacks where such knowledge is proprietary.
- 3. The case of tangency is rarely reached in practice, given that the feature space is usually continuous. For simplicity, we mainly consider intersection.
- 4. If \(\varDelta \) is small, the decision boundary near the ball can be treated as a hyperplane.
- 5. To do this, we start with 1 random flip out of all responses and measure the overall extraction rate. We then repeatedly increment this number by 1 until the overall extraction rate is very close to that of BDPL.
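The calibration procedure in Note 5 can be sketched as a simple search loop. The `extraction_rate` callback is hypothetical: it stands for retraining the attacker's replica on the flipped responses and measuring its agreement with the true model.

```python
import random

def calibrate_flip_baseline(responses, extraction_rate, target_rate, tol=0.01, rng=random):
    """Find the smallest number k of uniformly random label flips whose
    resulting extraction rate comes within `tol` of a target rate
    (e.g., the rate measured under BDPL).

    `extraction_rate` is a caller-supplied function that evaluates the
    attacker's replica model on the perturbed responses.
    """
    for k in range(1, len(responses) + 1):
        flipped = list(responses)
        # Flip k distinct, uniformly chosen binary responses.
        for i in rng.sample(range(len(flipped)), k):
            flipped[i] = 1 - flipped[i]
        if abs(extraction_rate(flipped) - target_rate) <= tol:
            return k
    return None  # no flip count matches the target within tolerance
```

This mirrors the note's procedure of incrementing the flip count by one until the baseline's extraction rate matches that of BDPL.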
Acknowledgement
This work was supported by the National Natural Science Foundation of China (Grant Nos. 61572413, U1636205, 91646203, 61532010, 91846204, and 61532016), the Research Grants Council, Hong Kong SAR, China (Grant Nos. 15238116, 15222118, and C1008-16G), and a research grant from Huawei Technologies.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Zheng, H., Ye, Q., Hu, H., Fang, C., Shi, J. (2019). BDPL: A Boundary Differentially Private Layer Against Machine Learning Model Extraction Attacks. In: Sako, K., Schneider, S., Ryan, P. (eds) Computer Security – ESORICS 2019. ESORICS 2019. Lecture Notes in Computer Science(), vol 11735. Springer, Cham. https://doi.org/10.1007/978-3-030-29959-0_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-29958-3
Online ISBN: 978-3-030-29959-0