
IR 2023

Section 5
Eng. Nesma Mahmoud
In this section
• Discuss sheets 0 and 1
• Discuss Task 1
Task 1
• Implement
1. Incidence matrix
2. AND, OR, NOT query on the incidence matrix
1 How to implement the Incidence Matrix?
1. Prepare the dictionary
2. Build the matrix
• Consider these documents: (guide example)
– Doc 1 breakthrough drug for schizophrenia
– Doc 2 new schizophrenia drug
– Doc 3 new approach for treatment of schizophrenia
– Doc 4 new hopes for schizophrenia patients
• Note: each document refers to a file stored on your device
1. Prepare the dictionary
1. Create an ArrayList “AllTokens” to store the dictionary terms
2. Read all documents (N)
– Loop over each document
• Tokenize its content and add the tokens to AllTokens
• Note: use the Java StringTokenizer class
3. Lowercase and sort all terms in “AllTokens”
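The dictionary steps above can be sketched roughly as follows. This is only a minimal sketch under some assumptions: the documents are passed as in-memory strings rather than read from files, the class and method names (DictionarySketch, buildDictionary) are made up for illustration, and each term is stored only once so that the dictionary has no repeated entries.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper class, not part of the task statement.
class DictionarySketch {

    // Collects the lowercase terms of all documents and sorts them alphabetically.
    static List<String> buildDictionary(String[] docs) {
        List<String> allTokens = new ArrayList<>();
        for (String doc : docs) {
            // StringTokenizer splits on whitespace by default.
            StringTokenizer st = new StringTokenizer(doc);
            while (st.hasMoreTokens()) {
                String term = st.nextToken().toLowerCase();
                // Assumption: keep each term once, so every term gets one matrix row.
                if (!allTokens.contains(term)) {
                    allTokens.add(term);
                }
            }
        }
        Collections.sort(allTokens);
        return allTokens;
    }

    public static void main(String[] args) {
        // The guide example documents, inlined as strings for the sketch.
        String[] docs = {
            "breakthrough drug for schizophrenia",
            "new schizophrenia drug",
            "new approach for treatment of schizophrenia",
            "new hopes for schizophrenia patients"
        };
        System.out.println(buildDictionary(docs));
    }
}
```

In the actual task each string would come from reading one file; only the source of the text changes, not the loop.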
2. Build the matrix
1. Create a matrix IncidMatrix[Z][N]
– Where Z = # of terms in AllTokens (the length of AllTokens) and N = # of documents
2. Loop over each document
– Loop over each term in AllTokens
– If the term exists in the current document, store 1
– Else store 0
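One possible way to fill the matrix is sketched below. It assumes the dictionary is already built and sorted, and that documents are given as strings; the names IncidenceMatrixSketch, tokensOf and buildMatrix are illustrative only. Membership is tested against the document's token set rather than with a substring search, so that a short term such as “of” does not match inside a longer word.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.StringTokenizer;

// Hypothetical helper class, not part of the task statement.
class IncidenceMatrixSketch {

    // Returns the set of lowercase tokens that occur in one document.
    static Set<String> tokensOf(String doc) {
        Set<String> terms = new HashSet<>();
        StringTokenizer st = new StringTokenizer(doc);
        while (st.hasMoreTokens()) {
            terms.add(st.nextToken().toLowerCase());
        }
        return terms;
    }

    // Builds a Z x N matrix: rows are dictionary terms, columns are documents.
    static int[][] buildMatrix(List<String> allTokens, String[] docs) {
        int z = allTokens.size();
        int n = docs.length;
        int[][] incidMatrix = new int[z][n];
        for (int d = 0; d < n; d++) {
            Set<String> docTerms = tokensOf(docs[d]);
            for (int t = 0; t < z; t++) {
                // 1 if the term occurs in document d, otherwise 0.
                incidMatrix[t][d] = docTerms.contains(allTokens.get(t)) ? 1 : 0;
            }
        }
        return incidMatrix;
    }

    public static void main(String[] args) {
        // Sorted dictionary of the guide example, written out by hand for the sketch.
        List<String> allTokens = Arrays.asList(
            "approach", "breakthrough", "drug", "for", "hopes",
            "new", "of", "patients", "schizophrenia", "treatment");
        String[] docs = {
            "breakthrough drug for schizophrenia",
            "new schizophrenia drug",
            "new approach for treatment of schizophrenia",
            "new hopes for schizophrenia patients"
        };
        int[][] m = buildMatrix(allTokens, docs);
        for (int t = 0; t < m.length; t++) {
            System.out.println(allTokens.get(t) + " " + Arrays.toString(m[t]));
        }
    }
}
```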
2 AND, OR, NOT query
• Guide example:
– Answer the query “new AND drug”
1. Retrieve the incidence vector for new and store it in an array
2. Retrieve the incidence vector for drug and store it in an array
3. Compare the two incidence vectors by looping over them
4. Display the documents that match the query
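The comparison loop for “new AND drug” might look like the sketch below. The two vectors are written out by hand from the guide documents (new occurs in Docs 2, 3 and 4; drug occurs in Docs 1 and 2); in the task they would be rows of the incidence matrix. The class and method names are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the task statement.
class AndQuerySketch {

    // Returns the 1-based numbers of the documents whose entry is 1 in both vectors.
    static List<Integer> and(int[] a, int[] b) {
        List<Integer> hits = new ArrayList<>();
        for (int d = 0; d < a.length; d++) {
            if (a[d] == 1 && b[d] == 1) {
                hits.add(d + 1);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        int[] newVec = {0, 1, 1, 1};  // incidence vector of "new"
        int[] drugVec = {1, 1, 0, 0}; // incidence vector of "drug"
        for (int doc : and(newVec, drugVec)) {
            System.out.println("Doc " + doc); // prints "Doc 2"
        }
    }
}
```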
2 AND, OR, NOT query
• Apply the same steps to OR and NOT, but take care of the differences
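As a rough illustration of those differences, the sketch below assumes that OR keeps a document when either vector has a 1, and that NOT takes a single vector and flips each entry. The vectors are again the hand-written ones for new and drug from the guide example, and the names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical helper class, not part of the task statement.
class OrNotQuerySketch {

    // OR: entry is 1 if the document contains at least one of the two terms.
    static int[] or(int[] a, int[] b) {
        int[] result = new int[a.length];
        for (int d = 0; d < a.length; d++) {
            result[d] = (a[d] == 1 || b[d] == 1) ? 1 : 0;
        }
        return result;
    }

    // NOT: works on one vector only; entry is 1 if the document lacks the term.
    static int[] not(int[] a) {
        int[] result = new int[a.length];
        for (int d = 0; d < a.length; d++) {
            result[d] = (a[d] == 1) ? 0 : 1;
        }
        return result;
    }

    public static void main(String[] args) {
        int[] newVec = {0, 1, 1, 1};  // incidence vector of "new"
        int[] drugVec = {1, 1, 0, 0}; // incidence vector of "drug"
        System.out.println("new OR drug: " + Arrays.toString(or(newVec, drugVec)));
        System.out.println("NOT drug:    " + Arrays.toString(not(drugVec)));
    }
}
```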
Task 1 deadline
• Next section
