
TopicSummary: A Tool for Analyzing Class Discussion Forums using Topic Based Summarizations
Prof Gottipati Swapna, Prof Venky Shankararaman and Renjini Ramesh
Professor Venky Shankararaman
Deputy Dean (Practice & Education)
School of Information Systems
Singapore Management University
venky@smu.edu.sg
Agenda
• Problem Statement
• Conceptual Framework
• Topic Based Summarization (TBS) Tool Solution Design
• TBS Evaluation
• Summary

FIE 2019 2
Learning Through Discussions

Active Learning
Student-Student Interaction
Student-Instructor Interaction

Live
Online
Class

Problem Statement
Knowledge Generation

• Online discussion: discussion threads related to the topics discussed
• Analysing this knowledge repository can generate useful information and insights:
    Instructor: gain insights on which topics had more student participation
    Student: obtain a summary of the discussion
• Current work: analyse online discussion forums (ODFs) and automatically generate topic based summaries
Need for Summarization
• 87.5% of students agreed that summarised views of the topics discussed in a question thread can help them in learning
• Manually analysing such knowledge repositories and generating topics and summaries is a painstaking process
Name | Forum | Question Title | Question | Body
Student 1 | Subway process case study | Performance Target | Q1: What are the performance targets for New Subway sandwich sales process in addition to the initial targets? | 1. Reducing the total time taken for each customer from ordering to payment (efficiency) 2. Increase total manpower (efficiency)….
Student 2 | Subway process case study | Improvements & Rationale | Q2: What are some recommendations to improve the process and state the rationale for it? | 1. Inrease the number of manpower in the shop so as to speed up the process of making sandiwches because there will be more staffs to cater to more customers. 2. …

Summarizing Posts from the Discussion Forum

Text Summarization Approaches

Abstractive Summarization
• Understand the main concepts of a document
• Generate new sentences which are not seen in the original document

Extractive Summarization
• Extract important sentences from the document
Text Summarization Approaches

Supervised Summarization
• Use data sets that are labelled by human annotators

Unsupervised Summarization
• Do not use annotated data
• Use linguistic and statistical information obtained from the document itself
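
As one illustration of an approach that is both extractive and unsupervised, the sketch below ranks sentences with a simplified TextRank-style procedure. It is not the TBS tool's implementation: the function names are hypothetical, sentence similarity is assumed to be plain Jaccard word overlap rather than the measure a production summarizer would use, and the sample sentences are invented for the demonstration.

```python
import re


def tokenize(sentence):
    """Return the set of lower-cased words in a sentence."""
    return set(re.findall(r"[a-z]+", sentence.lower()))


def textrank(sentences, top_n=2, damping=0.85, iterations=50):
    """Pick the top_n sentences by a PageRank-style score (simplified TextRank)."""
    n = len(sentences)
    if n == 0:
        return []
    words = [tokenize(s) for s in sentences]

    # Edge weights: Jaccard overlap between sentence word sets (an assumption).
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            union = words[i] | words[j]
            if i != j and union:
                weights[i][j] = len(words[i] & words[j]) / len(union)
    out_sum = [sum(row) for row in weights]

    # Power iteration over the weighted sentence graph.
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            rank = sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            new_scores.append((1 - damping) / n + damping * rank)
        scores = new_scores

    # Keep the best sentences, presented in their original order.
    best = sorted(range(n), key=lambda i: scores[i], reverse=True)[:top_n]
    return [sentences[i] for i in sorted(best)]


sentences = [
    "Hire another cashier to reduce waiting time.",
    "Better toasters reduce preparation time.",
    "The weather was nice today.",
    "More staff reduce waiting time at the cashier.",
]
summary = textrank(sentences, top_n=2)
print(summary)
```

No labelled data is needed: the scores come only from how strongly each sentence overlaps with the others, so the off-topic sentence ends up ranked last.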

Solution Design
Data Processing
• Convert into lower case, remove leading and trailing spaces
• Tokenize each post into sentences, remove stop words
• Perform lemmatization using NLTK’s WordNet-based lemmatizer API

Topic Extraction
• Cluster a group of posts and label each cluster based on the content of text within the particular cluster using TDK techniques
• k-means clustering algorithm using the tool CLUTO

Summary Generation
• Each post is first tokenized into sentences. Sentences are ranked by score, and then highly ranked sentences are chosen.
• Instructor can select either the TextRank Summarizer or LSA Summarizer technique
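
The data processing stage of the design above can be sketched roughly as follows. This is a minimal illustration and not the TBS tool's code: the stop-word list is a tiny hypothetical one, the sentence splitter is a simple regular expression, and `naive_lemma` is a crude stand-in for NLTK's WordNet-based lemmatizer so that the sketch runs without downloading any corpus.

```python
import re

# Tiny illustrative stop-word list (an assumption; the tool's list is not given).
STOP_WORDS = {
    "a", "an", "and", "the", "of", "to", "in", "for",
    "is", "are", "be", "will", "so", "as", "from",
}


def naive_lemma(token):
    """Crude stand-in for a WordNet lemmatizer: strip a plural 's'."""
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def split_sentences(post):
    """Split a post into sentences on '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", post.strip())
    return [p for p in parts if p]


def preprocess(post):
    """Lower-case, trim, split into sentences, drop stop words, lemmatize."""
    cleaned = []
    for sentence in split_sentences(post.lower()):
        tokens = re.findall(r"[a-z]+", sentence)
        tokens = [naive_lemma(t) for t in tokens if t not in STOP_WORDS]
        if tokens:
            cleaned.append(tokens)
    return cleaned


example = "  Increase the number of staff. Reduce the waiting times for customers!  "
print(preprocess(example))
```

The output is one list of cleaned tokens per sentence, which is the form the topic extraction and summary generation stages would consume.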

Data Set
• Business Process Modelling and Solutioning
• 2nd year undergraduate course within the BSc (IS)

Example Learning Outcome
• Identify process bottlenecks and propose an improved process through effective use of technology

Data Set
• The students were given a case study of the sales process currently implemented in a sandwich shop (e.g. Subway)
• Students were required to read this case study before participating in the online discussion forum
• The discussion forum was set up in the LMS and relevant questions were added by the instructors
• The students subsequently submitted their posts for each question
Data Set

Thread Question | # of posts
Q1: What types of analysis can be done on the current process? Describe with examples | 42
Q2: What are some recommendations to improve the process and state the rationale for it? | 89
Q3: What are the performance targets for the new subway sandwich sales process? | 129

User Interface: Input Discussion Forum Data

Visual Dashboard: Topics and Summaries

• Most descriptive set of words for the cluster
• Size: Number of mentions of a topic in the cluster
• ISIM: Displays the average similarity between the objects of each cluster (i.e., the internal similarity)
• ESIM: Displays the average similarity between the objects of each cluster and the rest of the objects (i.e., the external similarity)
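
To make the two similarity measures concrete, the sketch below computes them for toy term-count vectors. It assumes cosine similarity between objects and a plain average over pairs; the exact averaging convention used by the clustering tool may differ, and the function names and vectors here are hypothetical.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def isim(cluster):
    """Internal similarity: average cosine over all pairs inside the cluster."""
    pairs = list(combinations(cluster, 2))
    if not pairs:
        return 0.0  # a single object has no pairs; 0.0 is an arbitrary convention
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


def esim(cluster, rest):
    """External similarity: average cosine between cluster objects and all others."""
    pairs = [(u, v) for u in cluster for v in rest]
    if not pairs:
        return 0.0
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


# Toy term-count vectors: two similar posts in the cluster, one unrelated post outside.
cluster = [[2, 1, 0], [1, 1, 0]]
rest = [[0, 0, 3]]
print(isim(cluster), esim(cluster, rest))
```

A well-separated cluster shows a high ISIM together with a low ESIM, which is the pattern an instructor would look for on the dashboard.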

Tool Evaluation
• Question: “What are some recommendations to improve the process and state the rationale for it?”
• Cluster: “Resources for reducing cycle time”
• Cluster top representative words: resource time reduce cycle waiting


TextRank summarizer
Automating manual task will allow an overall reduction in average process cycle time, leading to
higher capacity of production. During such periods of congestion and heavy footfall, the
manager might be required to take on a more "directive" or managerial position to
ensure processes are followed, rather than be the hands-and-feet in the activities. Better
allocation of resources: may be can appoint one person in charge of completing meals and
place bread in toaster to save overall time duration.
Latent Semantic Analysis (LSA)
Invest in better and more efficient toasters (shorter toasting time and to enable more
sandwiches to be toasted at one time) -- this can help reduce the preparation time for each
sandwich and thus improve efficiency. reduce time to process each order by expanding
employee job scope (letting them handle more than role)reduce customer wait time through
pre-orders/ online ordering effective use of resources through reallocating staff from the
affected branches to the other branches to cope with peak periods. Adding resources(e.g
hire another cashier and another machine) would help to reduce the process cycle time as
when there is cashier, the process cycle time will be reduce by half.

Tool Evaluation

Cluster | Topic: Top representative words

What types of analysis can be done on the current process?
1 | utilisation understand resource utilization
2 | process analysis current cycle time
3 | construction customer entire process wait
4 | bottleneck peak complain time waiting
5 | day produced sandwich prepare sold
6 | solution manpower shortage due branch
7 | manager analysis activity path analysis determine

What are the performance targets for the new subway sandwich sales process?
1 | waiting time peak hour queue
2 | target include initial addition performance
3 | manual task reduce manpower efficiency
4 | cost reduce process sale sandwich sale
5 | average time total day customer
Summary
• We presented a text mining based approach to analyse discussion forums and generate topic based summaries

Future Work
• Improve spell check performance
• Provide input to the instructor on the optimum cluster number using preliminary data analysis
• Generate a PDF file of the summary which the instructor can share with the students
• Conduct a survey on the usefulness of the TBS tool towards supporting learning

