
Vector Space Retrieval Model: Basic Idea

ChengXiang “Cheng” Zhai


Department of Computer Science
University of Illinois at Urbana-Champaign

Course Schedule

1. Natural Language Content Analysis
2. Text Access
3. Text Retrieval Problem
4. Text Retrieval Methods
5. Vector Space Model
6. System Implementation
7. Evaluation
8. Probabilistic Model
9. Feedback
10. Web Search
11. Recommendation

[Diagram labels: Big Text Data, Search Engine, Recommender System, Small Relevant Data, User]
Many Different Retrieval Models

• Similarity-based models: f(q,d) = similarity(q,d)


– Vector space model

Vector Space Model (VSM): Illustration
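[Figure: a term space with axes “Programming”, “Library”, and “Presidential”, showing the query vector q and document vectors d1, d2, d3, d4, d5, …, dM; question marks appear next to d1, d2, and d3.]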
VSM Is a Framework

• Represent a doc/query by a term vector


– Term: basic concept, e.g., word or phrase
– Each term defines one dimension
– N terms define an N-dimensional space
– Query vector: q = (x1, …, xN), where xi is the weight of term i in the query
– Doc vector: d = (y1, …, yN), where yj is the weight of term j in the doc

• relevance(q,d) ∝ similarity(q,d) = f(q,d)
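
As a minimal sketch of the framework above, the following Python places a query and two documents in a shared term space and scores each document against the query. The vocabulary, the 0/1 term weights, and the dot product as the similarity function are all illustrative assumptions; the framework itself does not fix any of them.

```python
# Hypothetical vocabulary: each term defines one dimension of the space.
VOCAB = ["programming", "library", "presidential"]


def to_vector(text, vocab=VOCAB):
    """Map text to a term vector using 0/1 weights (an assumed weighting)."""
    words = set(text.lower().split())
    return [1 if term in words else 0 for term in vocab]


def similarity(q, d):
    """Dot product of two term vectors (one possible similarity measure)."""
    return sum(x * y for x, y in zip(q, d))


# Query and documents are placed in the same N-dimensional space.
q = to_vector("programming library")
d1 = to_vector("presidential library")
d2 = to_vector("a programming library for text")

# f(q, d) for each document; a higher score is taken to mean more relevant.
scores = {"d1": similarity(q, d1), "d2": similarity(q, d2)}
```

Under these assumptions d2 shares two terms with the query and d1 only one, so d2 receives the higher score.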

What VSM Doesn’t Say

• How to define/select the “basic concept”


– Concepts are assumed to be orthogonal
• How to place docs and query in the space (= how to assign term weights)
– Term weight in query indicates importance of term
– Term weight in doc indicates how well the term characterizes the doc
• How to define the similarity measure
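
To illustrate why these open choices matter, the sketch below treats the term weighting and the similarity measure as pluggable functions and shows that swapping them can change the ranking. All names here (`binary_weights`, `count_weights`, `dot`, `cosine`, `rank`) and the toy documents are hypothetical and chosen only for illustration; they are not prescribed by the framework.

```python
import math
from collections import Counter


def binary_weights(tokens, vocab):
    """Weight 1.0 if the term occurs, else 0.0 (assumed weighting A)."""
    present = set(tokens)
    return [1.0 if t in present else 0.0 for t in vocab]


def count_weights(tokens, vocab):
    """Weight equals the raw term count (assumed weighting B)."""
    counts = Counter(tokens)
    return [float(counts[t]) for t in vocab]


def dot(q, d):
    """Dot-product similarity."""
    return sum(x * y for x, y in zip(q, d))


def cosine(q, d):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm = math.sqrt(dot(q, q)) * math.sqrt(dot(d, d))
    return dot(q, d) / norm if norm else 0.0


def rank(query, docs, vocab, weight, sim):
    """Score every doc with the chosen weighting and similarity, best first."""
    qv = weight(query.split(), vocab)
    scored = [(name, sim(qv, weight(text.split(), vocab)))
              for name, text in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


vocab = ["programming", "library", "presidential"]
docs = {
    "d1": "library library library presidential",
    "d2": "programming library",
}

by_count_dot = rank("programming library", docs, vocab, count_weights, dot)
by_binary_dot = rank("programming library", docs, vocab, binary_weights, dot)
by_count_cosine = rank("programming library", docs, vocab, count_weights, cosine)
```

With raw counts and a dot product, d1 ranks first because "library" is repeated; with 0/1 weights, or with cosine similarity over counts, d2 ranks first. The same framework therefore yields different retrieval functions depending on how the weights and the similarity measure are instantiated.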
