Research article: "Storage of XML Data through Fine Granularity Relation"

Field: Computer and Information Sciences

License: CC BY
Keywords
XML DATA STORAGE / FINE GRANULARITY RELATION / JAVA

Abstract of the research article in computer and information sciences. Author: Mo Jia

The paper shows that storing XML in a relational database, with the XML loaded through Java, makes it possible to control and manage XML data. In addition, XML data storage can be improved by using a finer level of data granularity.


Irkutsk State Transport University

Mo Jia

UDC 004.65

STORAGE OF XML DATA THROUGH FINE GRANULARITY RELATION

Introduction

With the rapid development of Internet technology, XML is increasingly used in e-commerce, and many enterprises and organizations now store and transmit large volumes of data in XML. Soon after XML appeared, many database vendors began to offer XML data publishing on top of relational databases. However, relational databases were not designed to store such semi-structured data directly, so they give only limited support for XML storage and management. As a result, XML data management has become an active research topic.

There are three approaches to XML databases: NDX (native XML DBMS), which manages XML storage and querying directly; XED (XML-enabled DBMS), which implements XML database functions on top of a relational database; and mixed systems, which combine NDX with XED.

This paper focuses on XML data storage based on a relational database, for two reasons. First, relational database technology is mature and provides efficient access, concurrency control and recovery for large volumes of data. Second, relational storage allows structured and semi-structured data to be managed in an integrated way.

In terms of relational granularity, the fine-granularity relational model is selected because it assigns a unique identifier to each structure in the document. Moreover, each element, attribute and piece of character data can be accessed, modified and deleted individually. The fine-granularity relation therefore offers the greatest flexibility and convenience in data management.

Following the principle of fine granularity, this relational storage of XML data is implemented in Java. For simplicity and clarity, the XML demonstration file to be converted is first given in figure 1.

1. The principle of fine-granularity relational storage of XML

The principle of the fine-granularity methodology is that each kind of structure in the document (element, attribute and character data) has its own table. In addition, there is a document table and a parent/child table that links each element to its components, namely its child elements and child character data. The logic model of the fine-granularity relational data model is illustrated in figure 2.

<example>
  <person id="Big.Boss" contr="false">
    <email>[email protected]</email>
  </person>
  <book id="123456"> <name>XML

Fig. 1. XML demonstration file

docTable(doc_id int, name String, root int)
eleTable(doc_id int, ele_id int, parent_id int, tag String)
attTable(doc_id int, ele_id int, name String, value String)
textTable(doc_id int, text_id int, ele_id int, value String)
childTable(doc_id int, ele_id int, child_id int, child_type int)

Fig. 2. Fine granularity relational logic model
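The logic model of Fig. 2 could be turned into table definitions roughly as follows. This is only a sketch: the class name FineGrainedSchema, the method name, and the mapping of the "String" columns to VARCHAR(255) are assumptions, since the paper does not give the DDL it used.

```java
import java.util.List;

/** Hypothetical DDL for the five tables of Fig. 2 (SQL types are assumed). */
final class FineGrainedSchema {

    // "String" columns in Fig. 2 are mapped to VARCHAR(255) here; this is an
    // assumption, the paper does not state the SQL types or any keys.
    static List<String> createTableStatements() {
        return List.of(
            "CREATE TABLE docTable (doc_id INT, name VARCHAR(255), root INT)",
            "CREATE TABLE eleTable (doc_id INT, ele_id INT, parent_id INT, tag VARCHAR(255))",
            "CREATE TABLE attTable (doc_id INT, ele_id INT, name VARCHAR(255), value VARCHAR(255))",
            "CREATE TABLE textTable (doc_id INT, text_id INT, ele_id INT, value VARCHAR(255))",
            "CREATE TABLE childTable (doc_id INT, ele_id INT, child_id INT, child_type INT)");
    }

    public static void main(String[] args) {
        // Print the statements; executing them would require a JDBC connection.
        createTableStatements().forEach(System.out::println);
    }
}
```

Some database systems may require column names such as name and value to be quoted; the names are kept here exactly as they appear in Fig. 2.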

In the logic model, docTable consists of three columns: the document ID, the document name and the root element; the root element links the document to its elements. Each row corresponds to one XML document. EleTable, the most important part of the design, consists of the document ID, the element ID, the parent element ID and the element tag, and links to the other document structures. AttTable holds the name and value of each attribute together with a reference to the identifier of the owning element. ChildTable links each element identifier to its child identifiers, so all child elements and character data of an element can be reached and the tree can be traversed from the top. The child type in childTable indicates whether a child is an element or character data.

2. The implementation of the design

2.1. The Implementation flow

The flow consists of three stages, as shown in figure 3: reading the XML file; parsing it; and generating SQL statements and writing the XML data into the database. The first stage is handled by the XMLParse class, which loads the XML file into memory, while the third stage runs the statements against the database through JDBC. This paper focuses on the second stage, the implementation of DOMTree.

2.2. The Major steps of implementation

The XML file is first read by the XMLParse class and then parsed by a DOM parser. The resulting tree is traversed depth-first. The PrintTree function obtains the type and value of the current node and then calls itself recursively to visit the remaining nodes in the file. The core of the algorithm is as follows:

for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling())
    PrintTree(child);
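A self-contained sketch of this recursive traversal, using the standard javax.xml.parsers and org.w3c.dom APIs, is given below. The class PreorderDemo, its method names and the sample XML string are hypothetical and serve only to show the order in which elements are visited.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

/** Hypothetical demo: depth-first traversal with getFirstChild/getNextSibling. */
final class PreorderDemo {

    // Records an element name when the node is first reached, then recurses.
    static void visit(Node node, List<String> out) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            out.add(node.getNodeName());
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            visit(child, out);
        }
    }

    // Parses an XML string and returns the element names in visiting order.
    static List<String> elementOrder(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            List<String> out = new ArrayList<>();
            visit(doc, out);
            return out;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(elementOrder("<a><b><c/></b><d/></a>")); // prints [a, b, c, d]
    }
}
```

Each element is recorded before its children, so an element always appears in the list before any of its descendants.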

Because a node may be a document node, an element node, a text node or an attribute node, PrintTree() uses switch (NodeType) to determine the node type. The code for the relevant cases is:

case Node.ELEMENT_NODE:
    // IDs are 1-based positions in eleList; the root element gets par_id 0.
    ele_id = eleList.size() + 1;
    par_id = eleList.lastIndexOf(node.getParentNode()) + 1;
    String elesql = "insert into eleTable (ele_id,par_id,tag) values ('" + ele_id + "',"
        + "'" + par_id + "'," + "'" + node.getNodeName() + "')";
    eleList.add(node);
    NamedNodeMap atts = node.getAttributes();
    for (int ii = 0; ii < atts.getLength(); ii++) {
        Node att = atts.item(ii);
        String attsql = "insert into attTable (ele_id,att_name,att_value) values ('" + ele_id + "',"
            + "'" + att.getNodeName() + "'," + "'" + att.getNodeValue() + "')";
    }
    break;
case Node.TEXT_NODE:
    String textsql = "insert into textTable (text_id,ele_id,value) values ('" + (++text_id) + "',"
        + "'" + ele_id + "'," + "'" + node.getNodeValue() + "')";
    break;

During the recursive invocation of PrintTree(), a StringBuffer is used to save the generated SQL statements. Taking fig. 1 as an example, the SQL statements produced by DOMtree() are presented in figure 4.
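The generation step can be sketched end to end as a small runnable class. This is an illustration under stated assumptions, not the author's code: the class SqlGeneratorSketch, the helper q(), and the sample input with the address user@example.com are hypothetical. It also departs from the fragment above in three ways: it collects statements in a list instead of a StringBuffer, it skips whitespace-only text nodes, and it looks up the owning element of a text node through its parent.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;

/** Hypothetical sketch of the SQL generation stage. */
final class SqlGeneratorSketch {
    private final List<Node> eleList = new ArrayList<>(); // elements in visiting order
    private final List<String> sql = new ArrayList<>();   // generated statements
    private int textId = 0;

    // Wraps a value in single quotes and doubles embedded quotes.
    private static String q(Object value) {
        return "'" + String.valueOf(value).replace("'", "''") + "'";
    }

    private void printTree(Node node) {
        switch (node.getNodeType()) {
            case Node.ELEMENT_NODE: {
                int eleId = eleList.size() + 1;
                int parId = eleList.lastIndexOf(node.getParentNode()) + 1; // 0 for the root
                sql.add("insert into eleTable(ele_id,par_id,tag) values("
                        + q(eleId) + "," + q(parId) + "," + q(node.getNodeName()) + ")");
                eleList.add(node);
                NamedNodeMap atts = node.getAttributes();
                for (int i = 0; i < atts.getLength(); i++) {
                    Node att = atts.item(i);
                    sql.add("insert into attTable(ele_id,att_name,att_value) values("
                            + q(eleId) + "," + q(att.getNodeName()) + "," + q(att.getNodeValue()) + ")");
                }
                break;
            }
            case Node.TEXT_NODE: {
                String value = node.getNodeValue().trim();
                if (!value.isEmpty()) { // assumption: ignore whitespace-only text
                    int owner = eleList.lastIndexOf(node.getParentNode()) + 1;
                    sql.add("insert into textTable(text_id,ele_id,value) values("
                            + q(++textId) + "," + q(owner) + "," + q(value) + ")");
                }
                break;
            }
            default:
                break; // document and other node types produce no statement
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            printTree(child);
        }
    }

    // Parses the XML string and returns the generated insert statements.
    static List<String> generate(String xml) {
        try {
            Node doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            SqlGeneratorSketch generator = new SqlGeneratorSketch();
            generator.printTree(doc);
            return generator.sql;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        generate("<example><person id=\"Big.Boss\"><email>user@example.com</email></person></example>")
            .forEach(System.out::println);
    }
}
```

When the statements are actually executed through JDBC, a PreparedStatement with bound parameters would usually be a safer choice than concatenated literals, because it avoids quoting problems in element and attribute values.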

Fig. 3. Store XML to relational database

After the SQL statements have been generated, the flow moves to the third stage. Following the flow in figure 3, the XML document in figure 1 is saved in the relational database as presented in fig. 5-7. This completes the conversion of the XML data to the relational database.

3. Conclusion

For the relational storage of XML data, the mapping from XML data to relational data has been implemented in Java using the fine-granularity storage strategy. The project has shown that this mapping offers flexible access to any node in the XML document and that modifying or deleting a node has little effect on other document structures. The relational storage of XML data containing loops and the querying of the stored XML will be the focus of our later study.


insert into eleTable(ele_id,par_id,tag) values('1','0','example')
insert into eleTable(ele_id,par_id,tag) values('2','1','person')
insert into attTable(ele_id,att_name,att_value) values('2','contr','false')
insert into attTable(ele_id,att_name,att_value) values('2','id','Big.Boss')
insert into eleTable(ele_id,par_id,tag) values('3','2','email')
insert into textTable(text_id,ele_id,value) values('1','3','[email protected]')
insert into eleTable(ele_id,par_id,tag) values('4','1','book')
insert into attTable(ele_id,att_name,att_value) values('4','id','123456')
insert into eleTable(ele_id,par_id,tag) values

Fig. 4. The generated SQL statements

ele_id   par_id   tag
1        0        example
2        1        person
3        2        email
4        1        book
5        4        name

Fig. 5. EleTable

ele_id   att_name   att_value
2        contr      false
2        id         Big.Boss
4        id         123456

Fig. 6. AttTable

text_id   ele_id   value
1         3        [email protected]
2         5        XML book

Fig. 7. Character Database


Fan Yong, Xiong Yue

UDC 008.510

THE COUNTERMEASURES RESEARCH ON CONSTRUCTING HARMONIOUS SOCIETY OF CHINA

Constructing a harmonious society in China is another major theoretical innovation to promote economic and social development in the new century and the new stage. The key to build a harmonious society is to further integrate social forces, adjust interest relationship among members of the community, to minimize social conflict and achieve social equity. Harmonious social relation is a necessity of building harmonious socialist society; moreover, labor relation is the most common, most basic and most important so-
