
REVIEW-BASED QUESTION ANSWERING FOR ONLINE SHOPPING PORTALS

Team:
1. Ankush Babbar (222CO14)
2. Anshul Gupta (224CO14)
Mentor:
Prof. Shampa Chakraverty
PROBLEM INTRODUCTION
• Our aim is to create a system for automatically answering queries related to products on online shopping websites like Amazon, using reviews.

• Consumer reviews are invaluable as a source of data to help people form opinions on a wide range of products.

Example

MOTIVATION: WHY ARE REVIEWS IMPORTANT IN QUESTION ANSWERING?
• Complex queries cannot be answered by building a knowledge base. We handle this by casting the problem as one of surfacing relevant opinions.

• Many of the questions users ask are about subjective personal experiences, which cannot be answered using product specifications.
RELATED WORK

1. Selecting Sentences for Answering Complex Questions (2008)
   Authors: Yllias Chali, Shafiq R. Joty
   Objective: answering queries using 'query-focused' summarization
   Limitation: the relevance function is not learned

2. Knowledge-Based Question Answering (2003)
   Authors: F. Rinaldi, J. Dowdall, M. Hess, D. Mollá & K. Kaljurand
   Objective: answering queries by constructing a knowledge base from product specifications
   Limitation: cannot answer complex queries in our scenario

3. Addressing Complex and Subjective Product-Related Queries with Customer Reviews (WWW 2016)
   Authors: Julian McAuley & Alex Yang
   Objective: interface between opinion mining & QA systems
   Limitation: does not account for sentiment score & semantic similarity

4. Bridging the Lexical Chasm: Statistical Approaches to Answer-Finding (2000)
   Authors: A. Berger, R. Caruana, D. Cohn, D. Freitag, V. Mittal
   Objective: answering queries using statistical models
   Limitation: documents are heterogeneous, so bag-of-words (BOW) representations are insufficient
DATASET USED

• The Amazon review dataset was compiled by a research team at UC San Diego using a web crawler.
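The UCSD releases distribute reviews as gzipped files with one JSON record per line. A minimal loading sketch, assuming the strict-JSON release of the dataset; the filename and the reviewText field are assumptions taken from that dataset's documented format:

```python
import gzip
import json

def parse_reviews(path):
    """Yield one review dict per line of a gzipped JSON-lines file."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            yield json.loads(line)

# Hypothetical filename; substitute the category file actually downloaded.
for review in parse_reviews("reviews_Electronics_5.json.gz"):
    print(review["reviewText"][:80])
    break
```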
APPROACH

• Our system needs two components:
1. Relevance Function
2. Scoring Function

• These are computed using:
  • Relevance Ranking
  • Cosine Similarity
  • Okapi BM25 (TF-IDF based)
  • Bilinear Model (used in the mixture-of-experts sketch below)
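Both word-level relevance measures can be computed directly from bag-of-words counts. A minimal sketch of cosine similarity and Okapi BM25 over tokenized text, not the project's exact code:

```python
import math
from collections import Counter

def cosine_similarity(query_tokens, review_tokens):
    """Cosine similarity between two bag-of-words vectors."""
    q, r = Counter(query_tokens), Counter(review_tokens)
    dot = sum(q[t] * r[t] for t in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * \
           math.sqrt(sum(v * v for v in r.values()))
    return dot / norm if norm else 0.0

def bm25(query_tokens, review_tokens, doc_freq, n_docs, avg_len,
         k1=1.5, b=0.75):
    """Okapi BM25 score of one review for a query.

    doc_freq: maps a term to the number of reviews containing it
    n_docs:   total number of reviews in the corpus
    avg_len:  average review length in tokens
    """
    tf = Counter(review_tokens)
    score = 0.0
    for t in set(query_tokens):
        if tf[t] == 0:
            continue
        df = doc_freq.get(t, 0)
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        score += idf * tf[t] * (k1 + 1) / (
            tf[t] + k1 * (1 - b + b * len(review_tokens) / avg_len))
    return score
```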
MIXTURES OF EXPERTS

• Mixtures of Experts (MoEs) are used to combine the outputs of several classifiers.

• For a binary classification task, each expert outputs a probability associated with a positive answer ('yes').
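In the MOQA formulation of McAuley & Yang (2016), each review acts as one expert: a bilinear model s(q, r) = qᵀMr scores how relevant review r is to question q, a softmax over those scores yields the mixture weights, and the experts' 'yes' probabilities are combined. A minimal sketch under those assumptions; the feature vectors and the per-review expert predict_yes are placeholders, not the project's actual code:

```python
import numpy as np

def moqa_predict(q_vec, review_vecs, M, predict_yes):
    """Mixture-of-experts 'yes' probability for one binary question.

    q_vec:       feature vector for the question
    review_vecs: list of feature vectors, one per candidate review
    M:           learned bilinear relevance matrix (score = q^T M r)
    predict_yes: expert function returning P('yes' | question, review)
    """
    # Bilinear relevance score of each review to the question.
    scores = np.array([q_vec @ M @ r for r in review_vecs])
    # Softmax turns relevance scores into mixture weights.
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # Combine the per-review experts' 'yes' probabilities.
    preds = np.array([predict_yes(q_vec, r) for r in review_vecs])
    return float(weights @ preds)
```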
DATA PIPELINE

EVALUATION METRIC: AUC

• Area Under the Curve (AUC) is the area under the ROC curve.

• For any classifier at least as good as random guessing, it lies between 0.5 and 1.

• AUC equals the probability that a true answer is given a higher score than a (randomly chosen) non-answer.

• The ROC curve is created by plotting the true positive rate (TPR) against the false positive rate (FPR) at various threshold settings.
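This pairwise-ranking view gives a direct way to compute AUC. A minimal sketch, assuming lists of model scores for true answers and for non-answers:

```python
def auc(pos_scores, neg_scores):
    """AUC as the fraction of (answer, non-answer) pairs ranked correctly.

    Ties count as half; this equals the area under the ROC curve.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Example: a well-separated scorer gets AUC close to 1.
print(auc([0.9, 0.8, 0.7], [0.4, 0.3, 0.2]))  # 1.0
```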
PLATFORM & TOOLS

Specifications
CPU: Intel Core i7
RAM: 8 GB
GPU: 4 GB
Operating System: Windows 10
Language: Python
Machine Learning Framework: TensorFlow
Database: SQLite
EXPERIMENTS & RESULTS

We tried two approaches to question answering:

1. Mixture of Opinions for Question Answering (MOQA)
We identify relevant user reviews and combine them to answer complex queries.
AUC: 0.862

2. Mixture of Descriptions for Question Answering
Same as MOQA, but reviews are replaced by product descriptions.
AUC: 0.783
EXPERIMENTS & RESULTS

Due to limited processing power and RAM, our system could handle up to 100,000 reviews in total. We trained our model on two datasets:

1. 150 products, ~600 reviews per product
Having more reviews for every product results in queries being answered more precisely.
AUC: 0.862

2. 700 products, ~140 reviews per product
A larger variety of products means a wider vocabulary, resulting in better identification of contextually similar words.
AUC: 0.837
CONCLUSION

We presented MOQA, a system that automatically responds to product-related queries by surfacing relevant consumer opinions. We concluded:

• Reviews proved particularly effective as a source of data for answering product-related queries, outperforming product specifications.

• We trained a bilinear model capable of accounting for linguistic differences between text sources, outperforming hand-crafted word- and phrase-level relevance measures.
FUTURE WORK

• Use semantic similarity measures like WordNet and Word2Vec to account for contextually similar words, for example "heavy" and "weight" (see the sketch after this list).

• Address compatibility-related queries with user reviews by making use of reviews of both products, or of co-purchasing statistics.
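A minimal sketch of the Word2Vec idea using the gensim library; the toy corpus below only illustrates the API, and a real model would be trained on the full review text:

```python
from gensim.models import Word2Vec

# Toy tokenized review sentences; the real corpus is the full review set.
sentences = [
    ["this", "bag", "is", "really", "heavy"],
    ["the", "weight", "is", "too", "much", "for", "travel"],
    ["light", "and", "easy", "to", "carry"],
]

# Train a small embedding model over the review text.
model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, epochs=50)

# Contextually related words end up with similar vectors.
print(model.wv.similarity("heavy", "weight"))
```

On a corpus this small the similarity value is noisy; trained on the full review set, contextually related words such as "heavy" and "weight" would receive similar vectors even without lexical overlap.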

Thank You
Ankush Babbar (222CO14)
Anshul Gupta (224CO14)
