Model-checking based data retrieval. An application to semistructured and temporal data. (Q1880661)
From MaRDI portal
scientific article
Language | Label | Description | Also known as |
---|---|---|---|
English | Model-checking based data retrieval. An application to semistructured and temporal data. |
scientific article |
Statements
Model-checking based data retrieval. An application to semistructured and temporal data. (English)
0 references
30 September 2004
0 references
The volume contains the results of the research covered by the Ph.D. dissertation of Elisa Quintarelli at the Politecnico di Milano. Two specific problems in the area of representation of semistructured data and queries are addressed -- the study and definition of a semantics for a graph-based query language, and the usage of model-checking based techniques to efficiently solve graphical queries on static and dynamic semistructured documents. In particular, methods for searching on the Web are considered. An effective graph-based approach to the formalization of query languages for semistructured and temporal information is introduced supporting both data structure variability and different topological similarities between queries and document structures. This offers a powerful and flexible paradigm for query execution based on matching the query graph with the database instance graph. Static and temporal queries are naturally expressed by labeled graphs. One of the main contributions is the design of a semantics for the \(G\)-log language based on the notion of bisimulation. It allows different similarity levels to be supported, but in general it is computationally so complex that it is not applicable to real-life-size problems. In order to overcome this limiting but crucial issue, the author observes that the graphical queries can be easily translated into logic formulae. It is possible to associate a modal logic formula \(F\) to a query and to interpret database instance graphs as Kripke Transition Systems (KTS). In this way the problem of solving a graphical query with respect to a semistructured database is reduced to an instance of the model-checking problem. A modal logic with the same syntax as the temporal logic CTL is used; finding subgraphs of the database instance graph that match the query can be performed by finding nodes of the KTS derived from the database instance graph that satisfy the formula \(F\). Then the problem is decidable and algorithms running in linear time on both the sizes of the KTS and the formula can be employed. In this work a family of graph-based queries that represent CTL formulae is identified and the expressive limits of this propositional temporal logic used as a query language are discussed. An immediate consequence of the translation of queries to modal formulae is that an effective procedure for efficiently querying semistructured databases can be directly implemented on a model checker. The model checking technique for data retrieval could also be applied to nongraphical query languages. The approach was effectively tested by using a model checker. Then the technique introduced was adapted for accessing static information stored in graphical databases to retrieve temporal aspects of semistructured data as well. The main concepts of classical temporal databases can be applied and adapted to semistructured databases. Several dynamic aspects of semistructured data presentation based on different uses of the concept of ``time interval'' were studied. In this work also another time dimension -- interaction time (valid time relative to user browsing) -- is studied. For customising the presentation of information it is very important to capture the history of interactions between users and the applications they browse. Interaction time is a somehow novel field of discussion. The corresponding factors are deeply studied in a research area called Web Mining. It is shown that a general temporal model can be adapted to keep trace of usage patterns while navigating web sites. The analysis of how users are accessing a site is critical for optimizing the structure of the web site and can be placed among the pre-processing activities required for mining before the mining algorithms can be run. Historical analysis can be successively used for improving and customizing content presentation. The thesis presents research containing novel contributions and an in-depth expertise on several topics. It represents a valuable source of information.
0 references
semistructured data
0 references
temporal data
0 references
model-checking based data retrieval
0 references