
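HANOI UNIVERSITY OF SCIENCE AND TECHNOLOGY

SCHOOL OF INFORMATION AND COMMUNICATION TECHNOLOGY
--------------

FINAL COURSE EXAMINATION


DATA INTEGRATION AND XML IT4843
(Class code: 62365)
Time: 90 minutes.
Paper materials are allowed.
Computers and signal-transmitting or signal-receiving devices are not allowed.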

Question 1: (3 points)
Write a DTD document and an XSD document such that the following XML document, books.xml, is valid:
<?xml version="1.0"?>
<catalog>
  <book id="bk101">
    <author>Gambardella, Matthew</author>
    <title>XML Developer's Guide</title>
    <genre>Computer</genre>
    <price>44.95</price>
    <publish_date>2000-10-01</publish_date>
    <description>An in-depth look at creating applications with XML.</description>
  </book>
  <book id="bk102">
    <author>Ralls, Kim</author>
    <title>Midnight Rain</title>
    <genre>Fantasy</genre>
    <price>5.95</price>
    <publish_date>2000-12-16</publish_date>
    <description>A former architect battles corporate zombies.</description>
  </book>
</catalog>

Answer:
XSD:
<?xml version="1.0"?>
<!-- No targetNamespace: the elements of books.xml are in no namespace. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="catalog">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="book" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="author" type="xs:string"/>
              <xs:element name="title" type="xs:string"/>
              <xs:element name="genre" type="xs:string" minOccurs="0"/>
              <xs:element name="price" type="xs:decimal"/>
              <xs:element name="publish_date" type="xs:date"/>
              <xs:element name="description" type="xs:string" minOccurs="0"/>
            </xs:sequence>
            <xs:attribute name="id" type="xs:string" use="required"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
DTD:
<!ELEMENT catalog (book+)>
<!ELEMENT book (author,title,genre?,price,publish_date,description?)>
<!ATTLIST book id CDATA #REQUIRED>
<!ELEMENT author (#PCDATA)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT genre (#PCDATA)>
<!ELEMENT price (#PCDATA)>
<!ELEMENT publish_date (#PCDATA)>
<!ELEMENT description (#PCDATA)>
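The content model of book in this DTD is in effect a regular expression over child element names. The Python sketch below illustrates that idea only; it is not a DTD validator (the Python standard library does not ship one), and the helper name book_matches_dtd and the element with id "bk103" are hypothetical, introduced just for the example.

```python
import re
import xml.etree.ElementTree as ET

# Content model of <book> from the DTD, rewritten as a regex over
# comma-terminated child names:
#   (author,title,genre?,price,publish_date,description?)
BOOK_MODEL = re.compile(r"author,title,(genre,)?price,publish_date,(description,)?")


def book_matches_dtd(book):
    """Return True if a <book> element has an id attribute and the expected child order."""
    child_names = "".join(child.tag + "," for child in book)
    return book.get("id") is not None and BOOK_MODEL.fullmatch(child_names) is not None


# A book without the optional genre and description elements.
good = ET.fromstring(
    '<book id="bk102"><author>Ralls, Kim</author><title>Midnight Rain</title>'
    "<price>5.95</price><publish_date>2000-12-16</publish_date></book>"
)
# Hypothetical invalid book: author and publish_date are missing.
bad = ET.fromstring('<book id="bk103"><title>No author</title><price>1.00</price></book>')

print(book_matches_dtd(good))  # True: genre and description are optional
print(book_matches_dtd(bad))   # False: required children are missing
```

The check covers only the element order and the required id attribute; it says nothing about data types, which is the part the XSD adds on top of the DTD.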
Question 2: (2 points)
Given the file books.xml from Question 1, write XPath path expressions that:
a) Select the title element that is a child of the last book element.
b) Retrieve the book elements whose price element has a value greater than 25.00.
Answer:
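One pair of expressions that fits the two requests is /catalog/book[last()]/title for (a) and /catalog/book[price > 25.00] for (b). The Python sketch below is an illustration, not part of the original answer key. It assumes the standard-library xml.etree.ElementTree module, which supports the last() predicate but not numeric comparisons, so expression (b) is emulated with an ordinary Python filter.

```python
import xml.etree.ElementTree as ET

# books.xml from Question 1, inlined so the sketch is self-contained.
BOOKS_XML = """<catalog>
  <book id="bk101">
    <author>Gambardella, Matthew</author>
    <title>XML Developer's Guide</title>
    <genre>Computer</genre>
    <price>44.95</price>
    <publish_date>2000-10-01</publish_date>
    <description>An in-depth look at creating applications with XML.</description>
  </book>
  <book id="bk102">
    <author>Ralls, Kim</author>
    <title>Midnight Rain</title>
    <genre>Fantasy</genre>
    <price>5.95</price>
    <publish_date>2000-12-16</publish_date>
    <description>A former architect battles corporate zombies.</description>
  </book>
</catalog>"""

catalog = ET.fromstring(BOOKS_XML)  # the root element is <catalog>

# a) XPath: /catalog/book[last()]/title
#    ElementTree paths are relative to the element they are called on.
last_title = catalog.find("book[last()]/title")

# b) XPath: /catalog/book[price > 25.00]
#    Emulated in Python because ElementTree has no numeric predicates.
expensive_books = [
    book for book in catalog.findall("book")
    if float(book.findtext("price")) > 25.00
]

print(last_title.text)                          # Midnight Rain
print([b.get("id") for b in expensive_books])   # ['bk101']
```

A full XPath 1.0 engine would evaluate both expressions directly as written in the comments.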
Question 3: (3 points)
Given a set of strings S = {s1, s2, s3} with s1 = "Nguyễn Anh Tuấn", s2 = "Nguyễn Đức Tuấn" and
s3 = "Nguyễn Tuấn Trí".
Show how to compute the TF/IDF similarity score between the two strings s1 and s2.
Answer:
Computing TF:
For each term t and document d, tf(t,d) = the number of times t occurs in d.
tf(Nguyễn,s1)=1
tf(Anh,s1)=1
tf(Tuấn,s1)=1
tf(Đức,s1)=0
tf(Trí,s1)=0
tf(Nguyễn,s2)=1
tf(Anh,s2)=0
tf(Tuấn,s2)=1
tf(Đức,s2)=1
tf(Trí,s2)=0
tf(Nguyễn,s3)=1
tf(Anh,s3)=0
tf(Tuấn,s3)=1
tf(Đức,s3)=0
tf(Trí,s3)=1
For each term t, compute idf(t) = the total number of documents in the collection divided by the number of documents containing t.
idf(Nguyễn)=3/3=1
idf(Anh)=3/1=3
idf(Tuấn)=3/3=1
idf(Đức)=3/1=3
idf(Trí)=3/1=3
This gives the following table of document vectors:

        Nguyễn  Anh  Tuấn  Đức  Trí
VS1     1       3    1     0    0
VS2     1       0    1     3    0
VS3     1       0    1     0    3

V_d(t) = tf(t,d) * idf(t)
For example, some of the values are computed as follows:
VS1(Nguyễn) = tf(Nguyễn,s1) * idf(Nguyễn) = 1 * 1 = 1
VS1(Anh) = tf(Anh,s1) * idf(Anh) = 1 * 3 = 3
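Formula for the TF/IDF similarity score (the cosine of the two document vectors):

S(p,q) = sum_t V_p(t) * V_q(t) / (sqrt(sum_t V_p(t)^2) * sqrt(sum_t V_q(t)^2))

The TF/IDF similarity score between the two strings s1 and s2 is:

S(s1,s2) = (1*1 + 1*1) / (sqrt(1+9+1) * sqrt(1+1+9)) = 2/11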
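The calculation above can be checked with a short Python sketch. It follows the definitions used in this answer (raw term counts for tf, and idf as the plain ratio of documents, without the logarithm that many TF/IDF variants apply). The function names are illustrative, and the spelling of the three names is an assumption; the score depends only on which tokens the strings share.

```python
import math

# The three strings from the question (spelling assumed; only token
# overlap matters for the score).
docs = {
    "s1": "Nguyễn Anh Tuấn",
    "s2": "Nguyễn Đức Tuấn",
    "s3": "Nguyễn Tuấn Trí",
}

# Split each string into whitespace-separated terms.
tokens = {name: text.split() for name, text in docs.items()}
vocabulary = sorted({term for words in tokens.values() for term in words})


def tf(term, name):
    """Number of times the term occurs in the given document."""
    return tokens[name].count(term)


def idf(term):
    """Total number of documents divided by the number containing the term."""
    containing = sum(1 for words in tokens.values() if term in words)
    return len(tokens) / containing


def vector(name):
    """TF/IDF vector of a document over the whole vocabulary."""
    return [tf(term, name) * idf(term) for term in vocabulary]


def similarity(a, b):
    """Cosine similarity of the two TF/IDF vectors."""
    va, vb = vector(a), vector(b)
    dot = sum(x * y for x, y in zip(va, vb))
    norms = math.sqrt(sum(x * x for x in va)) * math.sqrt(sum(y * y for y in vb))
    return dot / norms


score = similarity("s1", "s2")
print(round(score, 4))  # 0.1818, i.e. 2/11
```

Only the shared terms contribute to the numerator, and both are down-weighted to 1 because they occur in every string, which is why the score is low even though two of three tokens match.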
Question 4: (2 points)
Explain the reasons why data integration is a hard problem.
---------------END---------------
Exam prepared by

Exam approved by
