Providing accurate models across private partitioned data: secure maximum likelihood estimation
From MaRDI portal
Publication:1624813
Abstract: This paper focuses on the privacy paradigm of giving researchers remote access to carry out analyses on sensitive data stored behind firewalls. We address the situation where the analysis demands data from multiple physically separate databases which cannot be combined. Motivating this problem are analyses using multiple data sources that currently are only possible through extensive work creating a trusted user network. We develop and demonstrate a method for accurate calculation of the multivariate normal likelihood equation for a set of parameters given the partitioned data, which can then be maximized to obtain estimates. These estimates are achieved without sharing any data or any true intermediate statistics of the data across firewalls. We show that, under a certain set of assumptions, our method for estimation across these partitions achieves results identical to estimation with the full data. Privacy is maintained by adding noise at each partition. This ensures that each party receives only noisy statistics and that the noise cannot be removed until the last step, which yields a single value: the true total log-likelihood. Potential applications include all methods that estimate parameters by maximizing the multivariate normal likelihood equation. We give detailed algorithms, along with available software, and both a real data example and simulations estimating structural equation models (SEMs) with partitioned data.
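The idea that noise added at each partition cancels only in the final total can be illustrated with a small sketch. This is not the paper's algorithm: it assumes a bivariate normal model, horizontally partitioned data (each party holds complete records for different individuals), and a generic zero-sum additive masking scheme. The function names `partition_loglik`, `zero_sum_masks`, and `secure_total_loglik` are hypothetical, and a real protocol would also need a secure way to agree on the masks.

```python
import math
import random


def partition_loglik(rows, mu, sigma):
    """Bivariate normal log-likelihood of one partition, computed locally."""
    (a, b), (c, d) = sigma
    det = a * d - b * c
    # Explicit inverse of the 2x2 covariance matrix.
    inv = ((d / det, -b / det), (-c / det, a / det))
    total = 0.0
    for x, y in rows:
        dx, dy = x - mu[0], y - mu[1]
        quad = (dx * (inv[0][0] * dx + inv[0][1] * dy)
                + dy * (inv[1][0] * dx + inv[1][1] * dy))
        # Dimension k = 2, so the constant term is 2 * log(2 * pi).
        total += -0.5 * (2 * math.log(2 * math.pi) + math.log(det) + quad)
    return total


def zero_sum_masks(n_parties, rng, scale=1e3):
    """Random masks that sum to zero (assumed masking scheme, not the paper's)."""
    masks = [rng.uniform(-scale, scale) for _ in range(n_parties - 1)]
    masks.append(-sum(masks))
    return masks


def secure_total_loglik(partitions, mu, sigma, rng):
    """Each party reveals only a masked value; the masks cancel in the sum."""
    masks = zero_sum_masks(len(partitions), rng)
    noisy = [partition_loglik(rows, mu, sigma) + m
             for rows, m in zip(partitions, masks)]
    return sum(noisy), noisy


# Illustrative data only: three parties holding disjoint sets of records.
rng = random.Random(0)
partitions = [
    [(0.1, 0.2), (1.0, -0.5)],
    [(0.3, 0.8), (-1.2, 0.4), (0.5, 0.5)],
    [(2.0, 1.0)],
]
mu = (0.0, 0.0)
sigma = ((1.0, 0.3), (0.3, 1.0))

total, noisy = secure_total_loglik(partitions, mu, sigma, rng)
pooled = partition_loglik([r for p in partitions for r in p], mu, sigma)
print(total, pooled)
```

Under these assumptions the masked total agrees with the pooled log-likelihood up to floating-point rounding, while each individual masked value reveals little about its partition. An optimizer could then call `secure_total_loglik` repeatedly for candidate values of `mu` and `sigma` to maximize the likelihood.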
Cites work
- scientific article; zbMATH DE number 5009196
- scientific article; zbMATH DE number 1294360
- scientific article; zbMATH DE number 1304673
- scientific article; zbMATH DE number 1759770
- Differential Privacy: A Survey of Results
- Elements of statistical disclosure control
- OpenMx 2.0: extended structural equation and statistical modeling
- Privacy-Preserving Set Operations
- Secure computation with horizontally partitioned data using adaptive regression splines
Cited in (4)