LegoDB Project
The LegoDB XML-to-Relational Mapping Engine
LegoDB is a cost-based XML storage mapping engine
that explores a space of possible XML-to-relational mappings and
selects the best mapping for a given application. LegoDB leverages
current XML and relational technologies: 1) it models the target
application with an XML Schema, XML data statistics, and an XQuery
workload; 2) it generates the space of configurations through
XML Schema rewritings; and 3) it selects the best of the derived
configurations using cost estimates obtained from a
standard relational optimizer. Experiments show that
the LegoDB mapping engine is very effective in practice and can lead
to reductions of over 50% in the running times of queries compared to
previous mapping techniques.
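The three steps above can be illustrated with a minimal sketch. It is not the LegoDB implementation: the configuration encoding (a set of child elements inlined into the parent table), the single "toggle inline/outline" rewriting, the greedy search strategy, and the names `neighbors`, `greedy_search`, `toy_cost`, `WORKLOAD`, and `WIDTH` are all assumptions made for illustration. In particular, `toy_cost` is a hand-written stand-in for the estimates that a real relational optimizer would return.

```python
from typing import Callable, FrozenSet, List, Sequence, Tuple

# A configuration is the set of child elements stored inline in the parent table.
# Elements not in the set are "outlined" into their own tables (hypothetical encoding).
Config = FrozenSet[str]


def neighbors(config: Config, inlinable: Sequence[str]) -> List[Config]:
    """Configurations reachable by one rewriting: toggle one element inline/outline."""
    return [config ^ frozenset([name]) for name in inlinable]


def greedy_search(
    start: Config, inlinable: Sequence[str], cost: Callable[[Config], float]
) -> Tuple[Config, float]:
    """Hill-climb: move to the cheapest neighbor until no neighbor is cheaper."""
    current, current_cost = start, cost(start)
    while True:
        candidates = neighbors(current, inlinable)
        if not candidates:
            return current, current_cost
        best = min(candidates, key=cost)
        best_cost = cost(best)
        if best_cost >= current_cost:
            return current, current_cost
        current, current_cost = best, best_cost


# Toy workload: (query weight, elements the query touches). Assumed values.
WORKLOAD = [(10.0, {"title"}), (1.0, {"review"})]
# Toy statistics: average width each element adds to a parent row. Assumed values.
WIDTH = {"title": 1.0, "review": 20.0}


def toy_cost(config: Config) -> float:
    """Stand-in for optimizer estimates: wider rows cost more to scan, joins cost extra."""
    row_width = 1.0 + sum(WIDTH[name] for name in config)
    total = 0.0
    for weight, touched in WORKLOAD:
        joins = len(touched - config)  # outlined elements must be joined back in
        total += weight * (row_width + 5.0 * joins)
    return total


best, best_cost = greedy_search(frozenset(), ["title", "review"], toy_cost)
print(sorted(best), best_cost)
```

With these assumed numbers the search inlines `title`, which the frequent query reads, and keeps the wide, rarely read `review` in a separate table. Changing the workload weights or the statistics changes the chosen configuration, which is the point of selecting a mapping per application.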
StatiX: Making XML Count
The availability of summary data for XML documents has many
applications, from providing users with quick feedback about their
queries, to cost-based storage design and query optimization. StatiX
is a novel XML Schema-aware statistics framework that exploits the
structure derived by regular expressions (which define elements in an
XML Schema) to pinpoint places in the schema that are likely sources
of structural skew, and builds concise, yet accurate, statistical
summaries for XML data. StatiX leverages standard XML technology for
gathering statistics, notably XML Schema validators, and it uses
histograms to summarize both the structure and values in an XML
document.
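As a rough illustration of summarizing structure with a histogram, the sketch below counts how many `review` children each `show` element has and tallies those counts. It is only an assumption-laden toy: the sample document, the element names, and the function `fanout_histogram` are hypothetical, and it parses the document directly with Python's `xml.etree.ElementTree` instead of gathering statistics through an XML Schema validator as StatiX does.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample document with skewed structure: most shows have few reviews.
DOC = """
<imdb>
  <show><title>A</title><review/><review/><review/></show>
  <show><title>B</title></show>
  <show><title>C</title><review/></show>
  <show><title>D</title></show>
</imdb>
"""


def fanout_histogram(xml_text: str, parent_tag: str, child_tag: str) -> Counter:
    """Map 'number of child_tag children' to 'number of parent_tag elements with that count'."""
    root = ET.fromstring(xml_text.strip())
    return Counter(
        len(parent.findall(child_tag)) for parent in root.iter(parent_tag)
    )


hist = fanout_histogram(DOC, "show", "review")
parents = sum(hist.values())
children = sum(count * freq for count, freq in hist.items())
print(dict(hist), children / parents)
```

Here the average is one review per show, but the histogram shows that two shows have none and one has three. That kind of structural skew is what a single average hides and what a histogram keeps.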
External Publications and Presentations
LegoDB: Customizing Relational Storage for XML Documents
(by Philip Bohannon, Juliana Freire, Jayant Haritsa, Maya Ramanath, Prasan Roy and Jerome Simeon)
Demo to appear in VLDB 2002.
StatiX: Making XML Count
(by Juliana Freire, Jayant Haritsa, Maya Ramanath, Prasan Roy and Jerome Simeon)
Proceedings of SIGMOD 2002.
From XML Schema to Relations: A Cost-Based Approach to XML Storage
(by Philip Bohannon, Juliana Freire, Prasan Roy and Jerome Simeon)
Proceedings of ICDE 2002.
Presentation at the Dagstuhl Workshop on Foundations of Semi-Structured Data, September 2001
(by Juliana Freire)
People
Created by juliana@research.bell-labs.com
Last modified: Fri Jun 28 13:50:47 EDT 2002