# Introduction to algorithms, Assignment 3: Search Engine

Johan Sivertsen, March 15, 2012


Algs 2012, ITU

The report should motivate the data organization you used, and give some complexity bounds on (1) time for searching, (2) time for building the data structure, and (3) space usage of your data structure.

## 1 Data structure and resource consumption

I chose a data structure consisting of a hashtable where the keys are strings and the values are hashsets. This allows me to put, get and remove data in constant time. All of these operations are essential for the assignment, and choosing a data structure that performs them in constant time makes the search very fast. It also makes the code very easy to write because I can make full use of the Collection interface.

I will use n as the number of lines and m as the average number of words per line. The following code builds the data structure:

    1  String s = r.readLine();
    2  for (String word : s.split("\\W+")) {
    3      HashSet<Integer> set = table.get(word.toLowerCase());
    4      if (set == null) set = new HashSet<Integer>();
    5      set.add(linecount);
    6      table.put(word.toLowerCase(), set);
    7  }
    8  linecount++;

This is executed for every line in the file. For every line, for every word, I have a hashtable lookup (line 3), a comparison (line 4), a hashset add (line 5), and a hashtable put (line 6). All of these run in constant time. The time for constructing the data structure should therefore be O(n*m).

### Search time

The time for search is decided by:

    for (String word : queryString.split("\\W+")) {
        if (table.get(word) == null) { list.clear(); break; }
        else {
            if (list.isEmpty()) {
                list.addAll(table.get(word));
            }
            else {
                for (Integer i : list) {
                    if (!table.get(word).contains(i)) {
                        toDelete.add(i);
                    }
                }
            }
        }
    }

This performs a lookup for every word in the query, containing an iteration over the lines found so far with another lookup for each. Each lookup takes constant time with regard to n; the iteration itself is proportional to the number of lines found so far. The search is linear with regard to the number of search words.
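The build and search steps described above can be put together in one small, self-contained sketch. This is not the program handed in with the report: the class name `InvertedIndex` and the methods `addLine` and `search` are hypothetical, the query is lowercased on the assumption that it should match the lowercased keys, and the intersection uses `retainAll` instead of a separate `toDelete` list.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical class; a sketch of the hashtable-of-hashsets idea, not the report's program.
final class InvertedIndex {
    // word (lowercased) -> numbers of the lines that contain it
    private final Map<String, Set<Integer>> table = new HashMap<>();
    private int lineCount = 0;

    /** Index one line: one lookup and one add per word. */
    void addLine(String line) {
        for (String word : line.split("\\W+")) {
            if (word.isEmpty()) continue; // split yields "" for a leading non-word character
            table.computeIfAbsent(word.toLowerCase(), k -> new HashSet<>()).add(lineCount);
        }
        lineCount++;
    }

    /** Return the numbers of the lines that contain every word of the query. */
    Set<Integer> search(String query) {
        Set<Integer> result = null;
        for (String word : query.split("\\W+")) {
            if (word.isEmpty()) continue;
            Set<Integer> lines = table.get(word.toLowerCase());
            if (lines == null) return new TreeSet<>();          // unknown word: no line can match
            if (result == null) result = new TreeSet<>(lines);  // first word: copy its lines
            else result.retainAll(lines);                       // later words: keep common lines only
        }
        return result == null ? new TreeSet<>() : result;
    }

    public static void main(String[] args) {
        InvertedIndex index = new InvertedIndex();
        index.addLine("The quick brown fox");
        index.addLine("the lazy dog");
        index.addLine("A quick dog");
        System.out.println(index.search("quick dog")); // prints [2]
        System.out.println(index.search("the"));       // prints [0, 1]
        System.out.println(index.search("cat"));       // prints []
    }
}
```

Copying the first word's lines and then calling `retainAll` walks the current result once per additional query word, which is the same amount of work as the iteration described in the search-time analysis.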

### Space usage

The hashtable needs to store a String and a hashset of integers for each unique word. This should not be very space consuming.
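To make the space claim a little more concrete, the following sketch counts what the hashtable would hold for a few sample lines: one key per unique word, and one integer per distinct (word, line) pair. The class name `SpaceEstimate`, the method `count` and the sample lines are assumptions made for illustration only. With n lines and m words per line on average, the number of stored integers can never exceed n*m, since each word occurrence adds at most one line number.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper; it rebuilds the table and counts its contents.
final class SpaceEstimate {
    /** Returns { number of unique words, number of stored line numbers }. */
    static long[] count(String[] lines) {
        Map<String, Set<Integer>> table = new HashMap<>();
        for (int i = 0; i < lines.length; i++) {
            for (String word : lines[i].split("\\W+")) {
                if (word.isEmpty()) continue;
                table.computeIfAbsent(word.toLowerCase(), k -> new HashSet<>()).add(i);
            }
        }
        long stored = 0;
        for (Set<Integer> set : table.values()) {
            stored += set.size(); // a word repeated within one line is stored only once
        }
        return new long[] { table.size(), stored };
    }

    public static void main(String[] args) {
        String[] lines = { "the quick brown fox", "the lazy dog", "the dog and the fox" };
        long[] c = count(lines);
        // 12 word occurrences in total, but "the" appears twice in the last line
        System.out.println(c[0] + " unique words, " + c[1] + " stored line numbers");
        // prints: 7 unique words, 11 stored line numbers
    }
}
```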