Semantic Web
(Project of Artificial Intelligence)
Submitted by: Ayush Gupta (7503854), Sajal Gupta (7503878), Yatin Wadhawan (7503879)
Table of Contents
I. Introduction
II. Database
III. Database : Index
IV. Algorithm Design
V. Screenshots
VI. Bibliography
I. Introduction
The Semantic Web
"The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation."
-- Tim Berners-Lee, James Hendler and Ora Lassila, The Semantic Web, Scientific American, May 2001

The Semantic Web is a mesh of information linked up in such a way as to be easily processable by machines, on a global scale. It can be thought of as an efficient way of representing data on the World Wide Web, or as a globally linked database.

The Semantic Web is generally built on syntaxes which use URIs to represent data, usually in triple-based structures. Each triple consists of a subject, a predicate and an object. Many such triples of URI data can be held in databases, or interchanged on the World Wide Web using a set of syntaxes developed especially for the task. These are called "Resource Description Framework" (RDF) syntaxes.
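To make the idea of triples concrete, here is a minimal sketch in PHP that holds a few triples in an array and matches them against a pattern. It is an illustration only: the example.org URIs, the `$triples` variable and the `matchTriples()` helper are hypothetical names, not part of this project or of any RDF library.

```php
<?php
// Each triple is [subject, predicate, object].
// All example.org URIs below are placeholders chosen for illustration.
$triples = [
    ['http://example.org/car/1', 'http://example.org/terms/make', 'http://example.org/make/acme'],
    ['http://example.org/car/1', 'http://example.org/terms/seatingCapacity', '5'],
    ['http://example.org/car/2', 'http://example.org/terms/make', 'http://example.org/make/acme'],
];

// Return every triple that matches the pattern; null acts as a wildcard.
function matchTriples(array $triples, ?string $s, ?string $p, ?string $o): array
{
    return array_values(array_filter($triples, function (array $t) use ($s, $p, $o) {
        return ($s === null || $t[0] === $s)
            && ($p === null || $t[1] === $p)
            && ($o === null || $t[2] === $o);
    }));
}

// Which subjects have the (hypothetical) make "acme"?
$acmeCars = matchTriples(
    $triples,
    null,
    'http://example.org/terms/make',
    'http://example.org/make/acme'
);
foreach ($acmeCars as $triple) {
    echo $triple[0], "\n";
}
```

Because every part of a statement is a URI or a literal, the same three-column shape can be stored in a database table or exchanged between sites without agreeing on a custom schema first.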
II. Database
Table name : Category

$table2="create table category (
Cat_id varchar(20) PRIMARY KEY,
Category varchar(20)
)";

Columns of the car specification table, with model and make together forming the primary key:

(
model varchar(20),
make varchar(20),
air_conditioner varchar(20),
power_windows varchar(2),
power_steering varchar(6),
antilock_braking_system varchar(6),
airbags varchar(6),
leather_seats varchar(6),
cd_player varchar(20),
overall_length varchar(20),
overall_width varchar(20),
overall_height varchar(20),
kerb_height varchar(20),
mileage varchar(20),
seating_capacity varchar(20),
no_of_doors varchar(20),
transmission_type varchar(20),
gears varchar(20),
minimum_turning_radius varchar(20),
tyres varchar(20),
PRIMARY KEY (model, make)
)";
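A table can carry only one PRIMARY KEY clause, so a key on both model and make has to be written as a single composite key. The sketch below shows one way such a statement could be assembled in PHP. The `buildCreateTable()` helper and the table name `car_specification` are assumptions made for illustration, and only three of the columns are shown.

```php
<?php
// Build a CREATE TABLE statement from a column map and a list of key columns.
// This only assembles the SQL string; it does not talk to a database.
function buildCreateTable(string $table, array $columns, array $primaryKey): string
{
    $lines = [];
    foreach ($columns as $name => $type) {
        $lines[] = "    $name $type";
    }
    // One composite key clause instead of PRIMARY KEY on each column.
    $lines[] = '    PRIMARY KEY (' . implode(', ', $primaryKey) . ')';

    return "create table $table (\n" . implode(",\n", $lines) . "\n)";
}

// car_specification is a hypothetical table name used for this example.
$sql = buildCreateTable(
    'car_specification',
    [
        'model'   => 'varchar(20)',
        'make'    => 'varchar(20)',
        'mileage' => 'varchar(20)',
    ],
    ['model', 'make']
);

echo $sql, "\n";
```

Column names are written with underscores here because unquoted SQL identifiers cannot contain spaces.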
}
?>

Step 4.3: Fetching Site Markup: The constructor takes the URL as its argument and fetches the markup, i.e. the source code, of the given URL.

<?php
// Both methods belong to the crawler class.

// Fetch and store the markup as soon as the object is created.
public function __construct($uri)
{
    $this->markup = $this->getMarkup($uri);
}

// Return the raw source of the page at $uri.
public function getMarkup($uri)
{
    return file_get_contents($uri);
}
?>

Step 4.4: Crawling the Markup for Data: The pages fetched by the get() function are scanned by a protected data-collection method, which uses string operations and the PCRE (Perl Compatible Regular Expressions) function preg_match_all() to return all acceptable tags within the markup, e.g. /<img([^>]+)\/>/i for images and /<a([^>]+)\>(.*?)\<\/a\>/i for links.

Step 4.5: After filtering and removing the undesired parts of the fetched URLs, all the links are collected into an array, links.

Step 4.6: Repeat the process for every linked website by returning to Step 4.2.
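Steps 4.4 and 4.5 can be sketched as a single function that works on markup already held in a string. This is a minimal illustration, not the project's own method: the function name `extractLinks()`, the href sub-pattern and the rule "keep only absolute http(s) links" are assumptions standing in for the unspecified filtering step.

```php
<?php
// Collect the link targets found in a piece of markup.
function extractLinks(string $markup): array
{
    $links = [];

    // Anchor pattern quoted in Step 4.4: group 1 = attributes, group 2 = link text.
    if (preg_match_all('/<a([^>]+)\>(.*?)\<\/a\>/i', $markup, $matches)) {
        foreach ($matches[1] as $attributes) {
            // Pull the quoted href value out of the attribute string.
            if (preg_match('/href\s*=\s*["\']([^"\']+)["\']/i', $attributes, $href)) {
                $url = $href[1];
                // Assumed filter: drop relative links, keep absolute http(s) ones.
                if (preg_match('/^https?:\/\//i', $url)) {
                    $links[] = $url;
                }
            }
        }
    }

    // Remove duplicates and reindex from 0.
    return array_values(array_unique($links));
}

$html = '<p><a href="http://example.org/a">A</a> '
      . '<a class="x" href="/local">L</a> '
      . '<a href="http://example.org/a">again</a> '
      . '<a href="https://example.org/b">B</a></p>';

$links = extractLinks($html);
print_r($links);
```

Regular expressions are enough for a small crawler like this one, but they can mis-handle unusual markup; a DOM parser is the more robust choice when pages are not well controlled.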
Step 5: Populating the database

Step 5.1: The links are passed to the index function as an argument, one by one, until the end of the array is reached.

Step 5.2: A temp.txt file is maintained which holds the markup of the sites. Whitespace and HTML tags are then removed from it.

Step 5.3: Each word is entered in the Word table and given a unique id (word_id). Similarly, each page is stored in the database as page_url and given a unique id (page_id).

Step 5.4: Fetch the word_id corresponding to the search keyword from the Word table, then count the total occurrences of that word in every page and update them in the Occurrence table.

Step 5.5: Sort the links in descending order of their occurrence count in the Occurrence table and display them in that order.

Step 6: After the results are displayed, go to Step 1 to let the user enter the next keyword.
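The indexing and ranking in Step 5 can be sketched with plain arrays standing in for the Word, Page and Occurrence tables. The function names `extractWords()`, `buildOccurrences()` and `rankPages()` and the example.org pages are hypothetical; a real run would read and write the database instead.

```php
<?php
// Step 5.2: remove HTML tags and whitespace, returning lowercase words.
function extractWords(string $markup): array
{
    // Put a space before each tag so adjacent elements do not merge words.
    $text = strtolower(strip_tags(str_replace('<', ' <', $markup)));
    // Split on anything that is not a letter or digit; drop empty pieces.
    return preg_split('/[^a-z0-9]+/', $text, -1, PREG_SPLIT_NO_EMPTY);
}

// Steps 5.3 and 5.4: build occurrence counts as [word][page_url] => count.
function buildOccurrences(array $pages): array
{
    $occurrence = [];
    foreach ($pages as $url => $markup) {
        foreach (array_count_values(extractWords($markup)) as $word => $count) {
            $occurrence[$word][$url] = $count;
        }
    }
    return $occurrence;
}

// Step 5.5: pages containing the keyword, highest occurrence count first.
function rankPages(array $occurrence, string $keyword): array
{
    $hits = $occurrence[strtolower($keyword)] ?? [];
    arsort($hits); // sort by count, descending, keeping the URL keys
    return $hits;
}

$pages = [
    'http://example.org/a' => '<h1>Car</h1><p>car engine car</p>',
    'http://example.org/b' => '<p>Car dealer</p>',
    'http://example.org/c' => '<p>bicycle</p>',
];

$occurrence = buildOccurrences($pages);
$ranked = rankPages($occurrence, 'Car');
print_r($ranked);
```

Counting once at indexing time and only sorting at query time keeps each search cheap, which matches the split between Steps 5.3 to 5.4 and Step 5.5.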
Step 1: Save the source code of the base site on the basis of the make and model supplied by the user.

Step 2: Search the source code saved in Step 1 for the specifications of the model requested by the user.

Step 3: Fetch the desired information from that page into an array.

Step 4: Dump the array into the database and display it to the user.

Step 5: If the user searches for the same make and model again, the details are displayed from the database without updating it, which saves processing time.
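The five steps above amount to a fetch-once, reuse-afterwards lookup. The sketch below illustrates that flow under several assumptions: an array stands in for the database, the page is supplied by a caller-provided function rather than fetched from a real site, and the specifications are assumed to sit in two-cell table rows. The names `getSpecifications()` and `parseSpecifications()` are hypothetical.

```php
<?php
// Step 3: read label/value pairs out of the page into an array.
// Assumed layout: <tr><td>label</td><td>value</td></tr>
function parseSpecifications(string $markup): array
{
    $specs = [];
    $pattern = '/<tr>\s*<td>(.*?)<\/td>\s*<td>(.*?)<\/td>\s*<\/tr>/is';
    if (preg_match_all($pattern, $markup, $rows, PREG_SET_ORDER)) {
        foreach ($rows as $row) {
            $specs[trim(strip_tags($row[1]))] = trim(strip_tags($row[2]));
        }
    }
    return $specs;
}

// Steps 1 to 5: return stored details if present, otherwise fetch and store.
function getSpecifications(string $make, string $model, array &$store, callable $fetchPage): array
{
    $key = strtolower($make . '|' . $model);
    if (isset($store[$key])) {
        return $store[$key];              // Step 5: already stored, no re-fetch
    }
    $markup = $fetchPage($make, $model);  // Steps 1 and 2: obtain the page source
    $specs = parseSpecifications($markup);
    $store[$key] = $specs;                // Step 4: save for later searches
    return $specs;
}

// Stand-in for the real page download; counts how often it is called.
$calls = 0;
$fetchPage = function (string $make, string $model) use (&$calls): string {
    $calls++;
    return '<table><tr><td>Mileage</td><td>18 kmpl</td></tr>'
         . '<tr><td>Gears</td><td>5</td></tr></table>';
};

$store = [];
$first  = getSpecifications('ExampleMake', 'ExampleModel', $store, $fetchPage);
$second = getSpecifications('ExampleMake', 'ExampleModel', $store, $fetchPage);
print_r($second);
```

The second call returns the stored array without invoking the download again, which is the processing-time saving described in Step 5.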
V. Screenshots
HOMEPAGE
VI. Bibliography
1) http://semanticweb.org/wiki/Main_Page
2) www.php-manual.net
3) www.cars.com
4) http://www.altova.com/semantic_web.html