Protecting privacy in data release (Q890108)
From MaRDI portal
Revision as of 16:02, 30 January 2024
scientific article
Language | Label | Description | Also known as |
---|---|---|---|
English | Protecting privacy in data release | scientific article | |
Statements
Protecting privacy in data release (English)
9 November 2015
Databases have become a common and nearly indispensable presence in today's computerized world, and database access is therefore a routine practice. The problems are how safe this (read/write) access is and how the information involved in a release can be protected. A data collection may contain sensitive information that must be protected (e.g. patients' diseases), or non-sensitive information from which a malicious user can extract confidential data (a very nice example concerns troops located in an area: by comparing their age data, the position of the headquarters can be detected). The book gives a formal presentation of these problems and offers interesting solutions.

There are four main chapters in the book. The general problem of data and user security is treated in Chapter 2. Using some new definitions for describing the theoretical concept of ``syntactic data protection'' (e.g. differential privacy -- basic and extended forms), the author presents the latest known results on protection techniques based on syntactic approaches, semantic approaches and data fragmentation. Some approaches, such as selective encryption, policy updates, and attribute-based encryption, are briefly presented as alternative solutions for access control enforcement.

Chapter 3, the most extensive of the book (it is based on texts from the papers [\textit{V. Ciriani} et al., ``An OBDD approach to enforce confidentiality and visibility constraints in data publishing'', J. Comput. Secur. 20, No. 5, 463--508 (2012); \textit{S. De Capitani di Vimercati} et al., ``Loose associations to increase utility in data publishing'', ibid. 23, No. 1, 59--88 (2015)]), proposes a theoretical model for protecting sensitive information while still meeting the visibility requirements of data access. The model is constructed with sets of Boolean formulas, represented through reduced ordered binary decision diagrams (OBDDs), which are modeled as directed graphs.
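To make the interplay between confidentiality and visibility more concrete, the following minimal Python sketch checks a candidate fragmentation against both kinds of requirements. It is only an illustration under stated assumptions, not the book's OBDD-based construction: confidentiality constraints are assumed to be sets of attributes that must not appear together in one fragment, visibility requirements are assumed to be Boolean formulas over attributes (written here as plain Python predicates), and fragments are assumed to be pairwise disjoint so that they cannot be joined. The attribute names and the function `is_valid_fragmentation` are hypothetical.

```python
from itertools import combinations

# Hypothetical confidentiality constraints: attribute sets that must never
# be visible together inside a single fragment.
CONFIDENTIALITY = [
    frozenset({"Name", "Disease"}),
    frozenset({"ZIP", "Job", "Disease"}),
]

# Hypothetical visibility requirements: Boolean formulas over attributes,
# encoded as predicates on the attribute set of one fragment.
VISIBILITY = [
    lambda f: "Name" in f or "ZIP" in f,
    lambda f: "Disease" in f and "Job" in f,
]


def is_valid_fragmentation(fragments, confidentiality, visibility):
    """Return True if the fragments respect all constraints (assumed semantics)."""
    # No fragment may contain a whole confidentiality constraint.
    no_leak = all(not c <= f for f in fragments for c in confidentiality)
    # Fragments must not share attributes, otherwise they could be joined.
    unlinkable = all(f.isdisjoint(g) for f, g in combinations(fragments, 2))
    # Every visibility formula must hold in at least one fragment.
    visible = all(any(v(f) for f in fragments) for v in visibility)
    return no_leak and unlinkable and visible


GOOD = [frozenset({"Name", "ZIP"}), frozenset({"Disease", "Job"})]
BAD = [frozenset({"Name", "Disease"}), frozenset({"ZIP", "Job"})]

print(is_valid_fragmentation(GOOD, CONFIDENTIALITY, VISIBILITY))
print(is_valid_fragmentation(BAD, CONFIDENTIALITY, VISIBILITY))
```

A brute-force check like this only validates a given fragmentation; finding a good one is the hard part that the OBDD representation and the heuristics of the chapter address.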
In this context, fragmentation is defined as the decomposition of information into components over sets of attributes. It is formally described by computing a maximum weighted clique over a graph that models the fragments, an operation that preserves both confidentiality requirements and data visibility. Since computing a minimum set of truth assignments is an NP-hard problem, the author reduces the complexity of the construction with a heuristic algorithm that computes a locally minimal set of truth assignments. Because publishing information as separate fragments of data (local associations) may expose sensitive information, the notion of ``loose association'' is introduced, and heuristics for determining such associations are developed. The theoretical results are evaluated experimentally on a model written in Python, and the efficiency of the heuristics is analyzed in terms of the utility offered by the loose associations.

Chapter 4, built on the paper [\textit{M. Bezzi} et al., ``Modeling and preventing inferences from sensitive value distributions in data release'', J. Comput. Secur. 20, No. 4, 393--436 (2012)], discusses the case in which data released from databases may, through inference, reveal sensitive information even if it is not explicitly disclosed. The mathematical model is based on statistical evaluations (somewhat unusual given the theoretical background used in the rest of the book): the distributions of the released data, measured by certain metrics, may allow confidential information to be inferred. The author defines a new model for inference and describes two possible strategies to deduce sensitive data. The strategies are implemented using four statistical tests and tested with a MATLAB prototype. The results are compared using different metrics, and for each metric a model of safe release against certain types of attacks is defined.
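The kind of inference discussed in Chapter 4 can be illustrated with a small Python sketch that compares the distribution of released values with a publicly known baseline. The review does not name the four statistical tests used in the book, so the choice of the Kullback-Leibler divergence here is an assumption made purely for illustration, as are the age bands, the baseline values, the threshold, and the function names.

```python
import math
from collections import Counter

AGE_BANDS = ["young", "middle", "senior"]

# Hypothetical baseline: the publicly known age distribution of all troops.
BASELINE = {"young": 0.70, "middle": 0.25, "senior": 0.05}


def distribution(values, bins):
    """Relative frequency of each bin in the released values."""
    counts = Counter(values)
    total = len(values)
    return {b: counts.get(b, 0) / total for b in bins}


def kl_divergence(p, q):
    """KL divergence D(p || q) in bits; assumes q[b] > 0 wherever p[b] > 0."""
    return sum(p[b] * math.log2(p[b] / q[b]) for b in p if p[b] > 0)


def is_safe_release(released, baseline, bins, threshold):
    """A release is considered safe if it stays close to the baseline."""
    return kl_divergence(distribution(released, bins), baseline) <= threshold


# Ages released for an ordinary location and for a location whose age
# profile is unusually senior (as a headquarters might be).
ordinary = ["young"] * 7 + ["middle"] * 2 + ["senior"] * 1
unusual = ["young"] * 2 + ["middle"] * 3 + ["senior"] * 5

THRESHOLD = 0.5  # hypothetical tolerance, in bits

print(is_safe_release(ordinary, BASELINE, AGE_BANDS, THRESHOLD))
print(is_safe_release(unusual, BASELINE, AGE_BANDS, THRESHOLD))
```

The design point is that no single released value is sensitive; only the deviation of the aggregate distribution from the expected one leaks information, which is why the check operates on whole releases rather than on individual tuples.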
The case study used in this chapter is an example from the military field (previous chapters used examples from the medical or educational areas).

In Chapter 5, the subject is almost completely different: how to treat the read/write rights of users uniformly across different scenarios, such that the role of external resources (server, manager, outsourcing) is minimally invasive. The solution proposed by the author (in fact, by the authors of the papers [\textit{S. De Capitani di Vimercati} et al., ``Enforcing dynamic write privileges in data outsourcing'', Comput. Secur. 39, Part A, 47--63 (2013; \url{doi:10.1016/j.cose.2013.01.008}); ``Enforcing subscription-based authorisation policies in cloud scenarios'', Lect. Notes Comput. Sci. 7371, 314--329 (2012; Zbl 1329.68097)]) is based on symmetric encryption (defining a data structure called a set-based key derivation graph) that enforces control over users' read/write rights. The approach is completed by a subscription-based policy that lets users retain read access to certain data, namely the data existing during their subscription period, after their rights have expired. All the proposed constructions work dynamically, allowing the access structures to be updated when new users arrive or new information is stored in the database. The study here is mostly theoretical, much closer in style to Chapters 2 and 3.

Chapter 6 summarizes the results and proposes several directions for future work. The book is an edited version of the author's PhD thesis. This may explain why there is only one author although all five papers included (with the editors' permission) are written by several researchers, including the author.
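The key derivation idea behind Chapter 5 can be sketched in a few lines of Python. The review does not describe how keys are derived along the edges of the set-based key derivation graph, so the token-based mechanism below (a public token that lets the holder of one key compute another) is an assumption borrowed from common practice in the key-derivation literature, not a statement of the book's exact construction. All names (`make_token`, `derive_key`, the label, the users) are hypothetical.

```python
import hashlib
import hmac
import secrets


def _xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def _prf(key: bytes, label: bytes) -> bytes:
    """Keyed hash of a public label (HMAC-SHA256, 32-byte output)."""
    return hmac.new(key, label, hashlib.sha256).digest()


def make_token(k_from: bytes, k_to: bytes, label_to: bytes) -> bytes:
    """Public token for one graph edge: k_to XOR PRF(k_from, label_to)."""
    return _xor(k_to, _prf(k_from, label_to))


def derive_key(k_from: bytes, token: bytes, label_to: bytes) -> bytes:
    """Recover k_to from k_from and the public token of the edge."""
    return _xor(token, _prf(k_from, label_to))


# Hypothetical scenario: Alice holds her own key; resources readable by the
# user set {alice, bob} are encrypted with a separate 32-byte group key.
GROUP_LABEL = b"set:alice,bob"
k_alice = secrets.token_bytes(32)
k_other = secrets.token_bytes(32)   # a user outside the set
k_group = secrets.token_bytes(32)

# The data owner publishes this token; it is useless without k_alice.
token_alice_to_group = make_token(k_alice, k_group, GROUP_LABEL)

print(derive_key(k_alice, token_alice_to_group, GROUP_LABEL) == k_group)
```

With such tokens each user needs to store a single key, and the access structure can be updated by publishing or withdrawing tokens, which matches the dynamic behaviour described for the constructions of the chapter.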
data release
data policy
data protection
graph modelling