UMP Institutional Repository

Evaluation of XML documents queries based on native XML database

Lazim, Raghad Yaseen (2016) Evaluation of XML documents queries based on native XML database. Faculty of Computer Systems & Software Engineering, Universiti Malaysia Pahang.

PDF (Evaluation of XML documents queries based on native XML database-Table of contents)
Evaluation of XML documents queries based on native XML database-Table of contents.pdf - Accepted Version

PDF (Evaluation of XML documents queries based on native XML database-Abstract)
Evaluation of XML documents queries based on native XML database-Abstract.pdf - Accepted Version

PDF (Evaluation of XML documents queries based on native XML database-Chapter 1)
Evaluation of XML documents queries based on native XML database-Chapter 1.pdf - Accepted Version

PDF (Evaluation of XML documents queries based on native XML database-References)
Evaluation of XML documents queries based on native XML database-References.pdf - Accepted Version


Abstract

As the amount of data available on the Internet grows rapidly, more and more of it is semi-structured. The Extensible Markup Language (XML), a format for semi-structured data, has become a standard for representing and exchanging data over the Internet. Early in XML's history, there was debate about whether XML differs enough from other data formats to require a database of its own. The popularity and widespread use of XML among a diverse set of organizations has prompted a rethinking of data storage and retrieval practices. Most early XML storage practices relied on mappings and transformations between XML data trees and tables in a relational database. Although relational databases can represent nested data structures by using tables with foreign keys, it is still difficult to search these structures for objects at an unknown depth of nesting; by contrast, such searches are a potential advantage of XML. Also, the nested and repeating elements in XML documents can quite easily result in an unmanageable number of tables. Furthermore, it is usually very difficult to change the relational schema after insertion in response to XML schema changes. These limitations of relational approaches are now well known. Moreover, a local update to a document should not cause drastic changes to the whole storage system. Therefore, the design of the storage system should trade off query performance against update costs. This study aims to evaluate the performance of a Native XML Database (NXD) in comparison with an XML-Enabled Database (XED); to enhance the Entity Relationship (ER) algorithm of the relational schema in order to improve Insert, Delete, Update and Search operations on XML documents (XML files with a large number of elements); and, finally, to validate the algorithm in NXD and compare the performance of XED and NXD by implementing the same command and control data model.
Five datasets of different sizes were used (65.8, 101, 117, 127 and 183 MB). Benchmarking techniques were used to measure performance. XMark and XMark-1 are two of the main benchmark tools in the research field, and they were used for the datasets. The performance of a system can be measured by using datasets of varying sizes and different documents with different features. The size of the XML documents and the number of elements were determined by the factor that drives dataset generation. The results of this study showed that XED performs better for datasets <= 117 MB. The performance of XED begins to decline as the size of the XML data increases (> 127 MB), while NXD showed better performance for data >= 127 MB. NXD produced better results in the reporting section, which implies that NXD XQuery gains performance from query optimization. Most of the figures show that XED starts out better but becomes worse as the data size grows. The difference becomes more obvious as the queries become more complicated.
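
The abstract's point about searching for objects at an unknown depth of nesting can be illustrated with a minimal sketch. It is not taken from the thesis: the sample document, its element names (`site`, `regions`, `item`, `bundle`, `name`) and the helper `names_at_any_depth` are hypothetical, and Python's standard `xml.etree.ElementTree` module stands in for a native XML query engine.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: <name> elements sit at different nesting depths.
DOC = """
<site>
  <regions>
    <asia>
      <item id="i1"><name>lamp</name></item>
    </asia>
    <europe>
      <item id="i2">
        <bundle><item id="i3"><name>chair</name></item></bundle>
        <name>table</name>
      </item>
    </europe>
  </regions>
</site>
"""


def names_at_any_depth(xml_text):
    """Return the text of every <name> element, however deeply it is nested."""
    root = ET.fromstring(xml_text.strip())
    # The descendant path ".//name" matches <name> at any depth below the root,
    # so the query does not need to know how deeply the items are nested.
    return [element.text for element in root.findall(".//name")]


if __name__ == "__main__":
    print(names_at_any_depth(DOC))
```

A single descendant path covers all nesting levels here, whereas a relational mapping with one table per nesting level would typically need a separate join for each level or a recursive query.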

Item Type: Undergraduates Project Papers
Additional Information: Thesis (Master of Computer Science) -- Universiti Malaysia Pahang – 2016, SV: DR ADZHAR BIN KAMALUDIN, NO CD: 10760
Uncontrolled Keywords: XML documents; XML database
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Faculty/Division: Faculty of Computer System And Software Engineering
Depositing User: Ms. Nurezzatul Akmal Salleh
Date Deposited: 07 Jul 2017 02:26
Last Modified: 07 Jul 2017 02:26
URI: http://umpir.ump.edu.my/id/eprint/18104