Integrating Unnormalised Semi-Structured Data Sources

Sasivimol Kittivoravitkul, Peter McBrien

Conference or Workshop Paper
The 17th Conference on Advanced Information Systems Engineering (CAiSE'05), Porto, Portugal
Lecture Notes in Computer Science
June, 2005
ISBN 978-3-540-26095-0

Semi-structured data sources, such as XML, HTML or CSV files, present special problems when performing data integration. In addition to handling the hierarchical structure of semi-structured data, data integration must deal with redundancy in such data, where the same fact may be repeated within a data source but should map to a single fact in a global integrated schema. We term a semi-structured data source containing such redundancy an unnormalised data source, and we define a normal form for semi-structured data that may be used when defining global schemas. We introduce special functions to relate object identifiers used in the global data model to object identifiers in unnormalised data sources, and demonstrate how to use these functions in query processing, update processing and integration of these data sources.
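
To make the notion of an unnormalised source concrete, the following is a minimal illustrative sketch in Python and not the functions defined in the paper. It assumes a hypothetical XML source in which a department fact is repeated under each employee, uses the `code` attribute as an assumed global object identifier, and uses positional paths as assumed source object identifiers. All names (`SOURCE`, `integrate`, `source_to_global`, `global_to_source`) are invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical unnormalised source: the fact "d1 is named Computing"
# appears once per employee instead of once overall.
SOURCE = """
<staff>
  <employee id="e1"><name>Ann</name><dept code="d1">Computing</dept></employee>
  <employee id="e2"><name>Bob</name><dept code="d1">Computing</dept></employee>
  <employee id="e3"><name>Cy</name><dept code="d2">Maths</dept></employee>
</staff>
"""


def integrate(xml_text):
    """Map repeated source facts onto single global facts.

    Returns three dictionaries:
      global_depts     : global oid -> department name (one entry per fact)
      source_to_global : source oid -> global oid
      global_to_source : global oid -> list of source oids holding that fact
    """
    root = ET.fromstring(xml_text)
    global_depts = {}
    source_to_global = {}
    global_to_source = {}
    for position, employee in enumerate(root.findall("employee"), start=1):
        dept = employee.find("dept")
        # Assumed source oid: the positional path of the dept node.
        source_oid = f"/staff/employee[{position}]/dept"
        # Assumed global oid: the department code attribute.
        global_oid = dept.get("code")
        global_depts[global_oid] = dept.text
        source_to_global[source_oid] = global_oid
        global_to_source.setdefault(global_oid, []).append(source_oid)
    return global_depts, source_to_global, global_to_source


global_depts, source_to_global, global_to_source = integrate(SOURCE)
```

In this sketch, `source_to_global` collapses repeated source nodes into one global fact, as a query over the global schema would need, while `global_to_source` lists every source node that an update to a single global fact would have to touch.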
