BI Assignment

Running head: FIVE QUESTIONS
Five Questions
Student’s Name
Institutional Affiliation
Professor’s Name
Course Date
1. Explain the relationship among data mining, text mining, and sentiment analysis.
To understand the relationship among the three, it helps to recall their definitions. Text mining
refers to the set of processes that convert unstructured text into valuable structured data
(Aggarwal & Zhai, 2012). It requires both statistical and linguistic techniques to analyze
unstructured formats, along with methods that combine the extracted information with actionable
metadata (Aggarwal & Zhai, 2012). Data mining is the process of applying algorithms to extract
and analyze useful content from data; it is used to discover hidden relationships and patterns in
datasets (Aggarwal & Zhai, 2012). Sentiment analysis is the process of computationally
identifying and categorizing the opinions expressed in a piece of text (Aggarwal & Zhai, 2012).
The three are nested: text mining is the application of data mining to unstructured text, and
sentiment analysis is a specialized form of text mining that recognizes patterns of opinion in
that text (Aggarwal & Zhai, 2012).
2. In your own words, define text mining, and discuss its most popular applications.
In my own words, text mining is the process of transforming unstructured text data into
meaningful, actionable information (Aggarwal & Zhai, 2012). Its most popular applications are
knowledge management, risk management, spam filtering, business intelligence, social media
analysis, contextual advertising, fraud detection through claims investigation, cybercrime
prevention, and customer care service (Aggarwal & Zhai, 2012).
3. What does it mean to induce structure into text-based data? Discuss the alternative
ways of inducing structure into them.
To induce structure into text-based data means applying and adapting mining algorithms, through
an iterative process of selection and substitution, to present the information that contains the
terms of interest (Younis, 2015). There are several ways of inducing structure. The first is
isolating keywords: tokenization splits the body of the text into individual words, which are
then treated as tokens. The second is determining topics: the text is categorized by its main
subject matter, and the categories depend on the data source. The last way discussed here is
measuring sentiment, which gauges the tone of the text using sentiment analysis (Younis, 2015).
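The three approaches above can be illustrated with a minimal Python sketch. It is not taken from
Younis (2015): the sample documents, the stopword list, the topic keywords, and the sentiment word
lists are all small hypothetical examples made up for illustration, and a real project would rely
on an established lexicon or a trained model.

```python
import re
from collections import Counter

# Hypothetical sample documents, invented for illustration.
docs = [
    "The battery life is great and the screen is excellent.",
    "Terrible support, the refund process was slow and bad.",
]

# Tiny hand-made word lists (assumptions, not a published lexicon).
STOPWORDS = {"the", "is", "and", "was", "a", "of"}
POSITIVE = {"great", "excellent", "good"}
NEGATIVE = {"terrible", "slow", "bad"}
TOPIC_KEYWORDS = {
    "product": {"battery", "screen"},
    "service": {"support", "refund"},
}

def tokenize(text):
    """Isolating keywords, step 1: split text into lowercase word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def keywords(tokens):
    """Isolating keywords, step 2: count the tokens that are not stopwords."""
    return Counter(t for t in tokens if t not in STOPWORDS)

def topic(tokens):
    """Determining topics: pick the topic whose keywords overlap the text most."""
    present = set(tokens)
    scores = {name: len(words & present) for name, words in TOPIC_KEYWORDS.items()}
    return max(scores, key=scores.get)

def sentiment(tokens):
    """Measuring sentiment: positive word count minus negative word count."""
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return pos - neg

def induce_structure(text):
    """Turn one unstructured document into one structured record."""
    tokens = tokenize(text)
    return {
        "keywords": keywords(tokens),
        "topic": topic(tokens),
        "sentiment": sentiment(tokens),
    }

rows = [induce_structure(d) for d in docs]
for row in rows:
    print(row["topic"], row["sentiment"], row["keywords"].most_common(3))
```

Each document becomes a record with fixed fields, which is the structured form that ordinary data
mining techniques can then work on.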
4. What is the role of NLP in text mining? Discuss the capabilities and limitations of NLP
in the context of text mining.
The role of NLP in text mining is to perform the linguistic analysis that helps the computer read
the text. Its capabilities are as follows. First, because it draws on several methodologies, it
can resolve many of the ambiguities found in human language. Second, it can perform entity
extraction, automatic summarization, and disambiguation. For these to be effective, NLP requires
a consistent knowledge base. The main limitation of NLP in text mining is the variety and
ambiguity of text: humans use language creatively, so the same text can carry different meanings
in different contexts (Younis, 2015).
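Both the disambiguation capability and the ambiguity limitation can be seen in a small Python
sketch. It uses a simplified gloss-overlap approach in the spirit of the Lesk algorithm, chosen
here only as an illustration and not drawn from the cited sources. The two-sense "knowledge base"
for the word "bank" and its glosses are hypothetical and written by hand.

```python
import re

# Hypothetical mini knowledge base: two senses of "bank" with hand-written glosses.
SENSES = {
    "bank": {
        "financial institution": "institution that accepts deposits and lends money to customers",
        "river side": "sloping land beside a river or stream of water",
    }
}

STOPWORDS = {"the", "a", "of", "to", "and", "or", "that", "on", "at", "in", "i", "we"}

def content_words(text):
    """Return the set of lowercase words in text, minus stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}

def disambiguate(word, sentence):
    """Score each sense by how many content words its gloss shares with the sentence.

    Returns the best-scoring sense and the full score table. When every score
    is zero the choice is arbitrary (the first sense listed), which is exactly
    the ambiguity problem described above.
    """
    context = content_words(sentence)
    overlaps = {
        sense: len(context & content_words(gloss))
        for sense, gloss in SENSES[word].items()
    }
    best = max(overlaps, key=overlaps.get)
    return best, overlaps

print(disambiguate("bank", "I went to the bank to deposit money"))
print(disambiguate("bank", "We sat on the bank of the river"))
print(disambiguate("bank", "The bank was closed"))
```

The first two sentences contain context words that match one gloss, so the sketch can choose a
sense. The third sentence gives no usable context, so every sense scores zero and the result is
only a guess, which shows why a consistent and sufficiently rich knowledge base matters.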
5. Go to explore the sections on applications as well as software. Find names of at least
three additional packages for data mining and text mining.
The packages are quanteda, text2vec, and tidytext, all of which are R packages.
References
Aggarwal, C. C., & Zhai, C. (Eds.). (2012). Mining text data. Springer Science & Business
Media.
Younis, E. M. (2015). Sentiment analysis and text mining for social media microblogs using
open source tools: An empirical study. International Journal of Computer Applications,
112(5).