Assessing the protection provided by misclassification-based disclosure limitation methods for survey microdata
Abstract: Government statistical agencies often apply statistical disclosure limitation techniques to survey microdata to protect the confidentiality of respondents. There is a need for valid and practical ways to assess the protection provided. This paper develops some simple methods for disclosure limitation techniques which perturb the values of categorical identifying variables. The methods are applied in numerical experiments based upon census data from the United Kingdom which are subject to two perturbation techniques: data swapping (random and targeted) and the post randomization method. Some simplifying approximations to the measure of risk are found to work well in capturing the impacts of these techniques. These approximations provide simple extensions of existing risk assessment methods based upon Poisson log-linear models. A numerical experiment is also undertaken to assess the impact of multivariate misclassification with an increasing number of identifying variables. It is found that the misclassification dominates the usual monotone increasing relationship between this number and risk so that the risk eventually declines, implying less sensitivity of risk to choice of identifying variables. The methods developed in this paper may also be used to obtain more realistic assessments of risk which take account of the kinds of measurement and other nonsampling errors commonly arising in surveys.
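The post randomization method named in the abstract perturbs a categorical variable by redrawing each record's value according to a matrix of misclassification probabilities. The following is a minimal sketch of that general idea, not the paper's implementation: the function name `pram`, the three categories, and the transition matrix values are all illustrative assumptions, since the abstract does not specify them.

```python
import random


def pram(values, categories, transition, seed=0):
    """Redraw each value from the row of `transition` matching its true category.

    transition[i][j] is the assumed probability that a record whose true
    category is categories[i] is released as categories[j].
    """
    rng = random.Random(seed)
    index = {c: i for i, c in enumerate(categories)}
    released = []
    for v in values:
        row = transition[index[v]]
        released.append(rng.choices(categories, weights=row, k=1)[0])
    return released


# Hypothetical categories and misclassification matrix (each row sums to 1).
categories = ["A", "B", "C"]
transition = [
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
]

original = ["A", "B", "C", "A", "B"] * 200
perturbed = pram(original, categories, transition)

# Share of records whose released value differs from the true value.
share_changed = sum(o != p for o, p in zip(original, perturbed)) / len(original)
```

With this assumed matrix each record keeps its true category with probability 0.8, so roughly a fifth of released values differ from the originals. An intruder matching on the released values therefore cannot be sure that an apparent match is a true one, which is the kind of protection the paper sets out to measure.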
Recommendations
- Disclosure risk assessment in statistical data protection.
- A Measure of Disclosure Risk for Microdata
- Estimating Risks of Identification Disclosure in Microdata
- Data confidentiality: a review of methods for statistical disclosure limitation and methods for assessing privacy
- Statistical methods for some simple disclosure limitation rules
Cites work
- scientific article; zbMATH DE number 2087735
- scientific article; zbMATH DE number 3297798
- Admissibility of the natural estimator of the mean of a Gaussian process
- Assessing identification risk in survey microdata using log-linear models
- Data-swapping: A technique for disclosure control
- Elements of statistical disclosure control
- Estimating Risks of Identification Disclosure in Microdata
- On the Barcilon formula for the string equation with a piecewise continuous density function
- Privacy and confidentiality in an e-commerce world: data mining, data warehousing, matching and disclosure limitation
- The post randomisation method for protecting microdata
Cited in (12)
- Post-randomization for controlling identification risk in releasing microdata from general surveys
- An information theoretic approach to post randomization methods under differential privacy
- Estimating identification disclosure risk using mixed membership models
- Small area estimation with covariates perturbed for disclosure limitation
- scientific article; zbMATH DE number 2156343
- scientific article; zbMATH DE number 2087735
- Data privacy using an evolutionary algorithm for invariant PRAM matrices
- Measuring Identification Risk in Microdata Release and Its Control by Post‐randomisation
- On Invariant Post‐randomization for Statistical Disclosure Control
- Data confidentiality: a review of methods for statistical disclosure limitation and methods for assessing privacy
- A post-randomization method for rigorous identification risk control in releasing microdata
- scientific article; zbMATH DE number 7387533
MaRDI item Q614142