Privacy-preserving k-means clustering across multiple parties

Speaker:

Florian Kerschbaum, University of Waterloo

Date and Time:

Friday, April 25, 2025 - 4:00pm to 5:00pm

Location:

Fields Institute, Room 309, Stewart Library

Abstract:

K-means clustering is a widely used unsupervised machine learning technique for discovering patterns in data. However, in many practical scenarios, the data is distributed across multiple parties and contains sensitive information, raising significant privacy concerns. In this talk, I will present two recent advances in privacy-preserving k-means clustering: one tailored for horizontally partitioned data and another for vertically partitioned data. We focus on two complementary notions of privacy: Input secrecy, ensuring that nothing beyond the final cluster centroids is revealed about the private inputs. Output privacy, ensuring that even the released centroids leak only a limited amount of information about the input data. Our protocols achieve the strongest known privacy guarantees while also being highly efficient. Remarkably, we improve performance by up to six orders of magnitude compared to the current state-of-the-art. This efficiency boost is achieved by combining secure multi-party computation (for input secrecy) with differential privacy (for output privacy), demonstrating that differentially private k-means can be securely computed more efficiently than exact k-means.

The Fields Institute for
Research in Mathematical Sciences

Privacy-preserving k-means clustering across multiple parties

Scheduled as part of

People and Contacts

Calendar and Events