
CHAPTER 5

The internet provides a significant amount of qualitative data in the form of conversations, messages,
photographs, music clips, videos, drawings, avatars, comments, discussions, and much more.
DATA MINING
Data mining has traditionally been used for quantitative data. In the last decade it has been applied to online
conversations and connections by mining and scraping text and relational data from their original source,
collecting it in ad hoc and predetermined ways, and then analysing or sorting it en masse using
automated, semi-automated, computerised or software-driven processes. It is used to enhance, develop,
guide and validate findings from more contextualising methods such as netnography.
For consumer and marketing researchers, various topics can be approached with data mining techniques.
The biggest issue in data mining is how to analyse the data.
Data mining is the process of discovering useful patterns of knowledge from sources of data like
databases, websites, text files, images or videos. It is the attempt to make sense of large amounts of
mostly unsupervised data in some domain. We deal with large amounts of data. Unsupervised indicates
that it is naturalistic data, i.e. data for which the analyst has no predefined classes or categories.
Finally, mining is a process of sense-making or knowledge creation, in which operations on vast
amounts of qualitative data should render it understandable, valid, novel and useful.
Data mining seeks to discover useful info or knowledge from the info available in a particular database.
Content mining is a variety of data mining that treats data more widely, often including visual images,
audio-visual files and sound files. Web mining is a subset of data mining that aims to discover useful
info or knowledge from web hyperlinks, page contents and usage logs.
The most important distinguishing characteristic of data mining is that, rather than beginning with a
particular model and then fitting data to it, data mining attempts to begin with the data, i.e. although its
methods are mathematical, the approach is inductive and data-driven. It begins with large datasets
and builds a data model that is not overly complex but still describes the data well. Data mining intersects
with qualitative methods that seek to find patterns of meaning in complex naturalistic situations.

The data mining process typically runs as follows:
• The data analyst/miner identifies suitable data sources and target forms of data.
• Raw data is pre-processed or cleaned to remove noise or abnormalities.
• Pre-processed data is processed further by a data mining algorithm that tries to recognize or
represent patterns in the data.
• Patterns that are useful for the intended applications are identified, and those that are not are
rejected.
• The entire data mining process is iterative, taking multiple rounds to achieve results.
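The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the posts, the stopword list and the helper names (clean, mine, select) are all hypothetical, and counting recurring terms stands in for whatever mining algorithm a real project would use.

```python
import re
from collections import Counter

# Hypothetical raw posts; in practice these come from the chosen data source.
RAW_POSTS = [
    "Love the new phone!!! <br> Battery lasts all day",
    "   ",
    "battery died after a week :( http://example.com/x",
    "The new phone camera is great, battery is fine",
]

def clean(post):
    """Pre-process one post: drop URLs and HTML tags, lowercase, keep words only."""
    post = re.sub(r"https?://\S+", " ", post)
    post = re.sub(r"<[^>]+>", " ", post)
    return re.findall(r"[a-z]+", post.lower())

def mine(token_lists, stopwords):
    """Count in how many posts each non-stopword term appears."""
    counts = Counter()
    for tokens in token_lists:
        counts.update(set(tokens) - stopwords)
    return counts

def select(counts, min_posts):
    """Keep only the patterns (terms) that recur in at least min_posts posts."""
    return {term: n for term, n in counts.items() if n >= min_posts}

# Hypothetical stopword list; refining it and re-running is one iteration round.
STOPWORDS = {"the", "a", "is", "after", "all", "new"}
cleaned = [tokens for tokens in map(clean, RAW_POSTS) if tokens]  # drop empty posts
patterns = select(mine(cleaned, STOPWORDS), min_posts=2)
print(patterns)
```

In a real study the analyst would inspect the output, adjust the cleaning rules or thresholds, and run the process again, which is the iterative part of the procedure.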
Data can be aggregated in a process known as unsupervised learning. Data have no pre-determined
or pre-arranged categories assigned to them. A computational algorithm must find the hidden
commonalities and regularities in the data. A key method is clustering, in which data are organized
into clusters based on their similarities or differences.
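As a minimal sketch of clustering, the code below runs plain k-means on a handful of two-dimensional points. The data are invented (posts per week and average words per post for six hypothetical community members), and k-means is only one of many clustering methods; the notes do not name a specific algorithm.

```python
def kmeans(points, k, rounds=10):
    """Plain k-means on 2-D points; the first k points seed the centroids."""
    centroids = list(points[:k])
    clusters = []
    for _ in range(rounds):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            dists = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[dists.index(min(dists))].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for members, old in zip(clusters, centroids):
            if members:
                n = len(members)
                new_centroids.append((sum(x for x, _ in members) / n,
                                      sum(y for _, y in members) / n))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # assignments have stopped changing
        centroids = new_centroids
    return clusters

# Hypothetical members: (posts per week, average words per post).
members = [(1, 20), (30, 15), (2, 25), (28, 10), (3, 22), (31, 12)]
clusters = kmeans(members, k=2)
print(clusters)
```

No labels are supplied at any point: the two groups (occasional posters and heavy posters) emerge only from the similarities in the data.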
Supervised learning is probably the most frequently used data mining technique in practice.
Supervised learning is a form of classification in which a category or classifier function is learnt
from data that has previously been labelled with similar pre-defined classes or categories. That
classifier is then applied to place other, new but similar data into those classes. Because the existing
classification supervises the process, it is known as supervised learning.
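A small sketch of this idea follows, using a multinomial naive Bayes classifier with add-one smoothing written from scratch. The choice of naive Bayes, the labelled examples and the function names train and classify are assumptions made for illustration; the notes do not prescribe a particular classifier.

```python
import math
from collections import Counter, defaultdict

def train(labelled):
    """Learn per-class word counts from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    for text, label in labelled:
        class_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, class_counts

def classify(text, model):
    """Return the most probable class for a new text (add-one smoothing)."""
    word_counts, class_counts = model
    vocab = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for w in text.lower().split():
            if w in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][w] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical, previously labelled data that "supervises" the learning.
labelled = [
    ("great screen love it", "positive"),
    ("love the battery", "positive"),
    ("terrible screen hate it", "negative"),
    ("battery is terrible", "negative"),
]
model = train(labelled)
print(classify("love this screen", model))
print(classify("terrible battery", model))
```

The classifier learnt from the labelled texts is then applied to new, similar texts, which is the step that places them into the existing classes.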
Database methods can combine supervised and unsupervised learning in a process known as partially
supervised learning.
OPINION MINING
Opinion mining works with the large amounts of naturally occurring or unstructured text present on the
web. Usually it operates only on the text of user-generated content/media, because the processing of
images, video, music, graphics and sound files is still too complex for current computational algorithms
to handle (but not too complex for netnography).
Opinion mining attempts to measure online word of mouth (WOM). It is technically very challenging because it
needs to use natural language processing (NLP), a type of info processing that recognizes info naturally
occurring in language. However, while human beings will instantly understand irony, sarcasm and idiosyncratic
spelling, these modes of representation are likely to confuse software programs.
Three key elements in opinion mining:

• First, sentiment classification. The data mining algorithm needs to determine whether a text
expresses negative or positive emotions.
• Second, feature-based opinion mining. The algorithm has to identify the feature, benefit or attribute that is
being commented on. E.g. “Samsung tablet screens are better than Apple's” expresses a negative
feeling towards Apple, and the attribute is the screen.
• Third, comparative mining. One object is compared against one or more other objects.
• In addition, the strength or passion of an opinion can be assessed.
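The elements listed above can be illustrated with a very rough lexicon-based sketch. Everything here is an assumption: the word lists are tiny and invented, strength is approximated by counting intensifiers and exclamation marks, and the code does not work out which object an opinion is directed at, so a comparative sentence like the tablet example above would need extra logic to be scored per brand.

```python
import re

# Tiny hypothetical lexicons; a real study would need far larger, validated ones.
POSITIVE = {"better", "great", "love", "good"}
NEGATIVE = {"worse", "bad", "hate", "terrible"}
FEATURES = {"screen", "battery", "camera", "price"}
INTENSIFIERS = {"very", "really", "much", "so"}

def analyse(text):
    """Return sentiment, commented features, comparative flag and strength."""
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    if pos > neg:
        sentiment = "positive"
    elif neg > pos:
        sentiment = "negative"
    else:
        sentiment = "neutral"
    return {
        "sentiment": sentiment,
        "features": [t for t in tokens if t in FEATURES],
        # Crude comparative test: "better"/"worse" together with "than".
        "comparative": "than" in tokens and bool({"better", "worse"} & set(tokens)),
        # Crude strength proxy: intensifier words plus exclamation marks.
        "strength": sum(t in INTENSIFIERS for t in tokens) + text.count("!"),
    }

first = analyse("I really hate the battery!!")
second = analyse("The camera is much better than the old one")
print(first)
print(second)
```

Irony, sarcasm and unusual spelling would defeat a word-list approach like this one, which is exactly the difficulty the notes describe.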
User-friendly software is available to help with data mining. These types of search engines will crawl
the web and mine any areas that they can. Opinions appearing on top-rated and highly visited blogs
might be weighted more heavily in indices than opinions appearing on new blogs with few followers.
Opinions can also have a temporal dimension that can reveal trends over time.
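A weighted index with a time dimension might look like the sketch below. The figures are invented, and using follower counts as the weight is an assumption; the notes only say that opinions from popular blogs might count for more.

```python
from collections import defaultdict

# (month, sentiment score in [-1, 1], blog followers) - all values hypothetical.
opinions = [
    ("2024-01", 1.0, 9000),
    ("2024-01", -1.0, 1000),
    ("2024-02", -1.0, 6000),
    ("2024-02", 1.0, 2000),
]

def weighted_index(opinions):
    """Follower-weighted mean sentiment per month, in chronological order."""
    totals = defaultdict(lambda: [0.0, 0.0])  # month -> [weighted sum, weight sum]
    for month, score, followers in opinions:
        totals[month][0] += score * followers
        totals[month][1] += followers
    return {month: s / w for month, (s, w) in sorted(totals.items())}

index = weighted_index(opinions)
print(index)
```

Each month has one positive and one negative opinion, so an unweighted average would be zero in both; the weighting is what produces a trend from positive to negative.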
SOCIAL NETWORK ANALYSIS
Social network analysis or SNA is a technique that looks at social relationships as networks, and
considers both the structure and the patterns of their linkages. There are two important elements in
social networks: social actors (nodes) and the relationships between them (ties).
Social network analysis considers these sorts of relationships interesting because they possess recurring
patterns. These patterns are relevant to an understanding of the online social space because, in
such key marketing matters as WOM and the diffusion of new products, we are interested both in the
influence of individuals within a social network and in the patterns of the spread of influence and
adoption.
Strong ties indicate friendship or close relationships, revealing kinship, intimacy and frequent contact.
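A minimal sketch of a network of actors (nodes) and relationships (ties) follows. The actors and message counts are invented, degree centrality is used as one simple measure of an individual's position, and treating the number of exchanged messages as a proxy for tie strength is an assumption.

```python
from collections import defaultdict

# Hypothetical ties: (actor, actor, number of exchanged messages).
ties = [
    ("ana", "ben", 12),
    ("ana", "cara", 1),
    ("ana", "dev", 2),
    ("ben", "cara", 9),
]

def degree_centrality(ties):
    """Share of the other actors that each actor is directly tied to."""
    neighbours = defaultdict(set)
    for a, b, _ in ties:
        neighbours[a].add(b)
        neighbours[b].add(a)
    n = len(neighbours)
    return {actor: len(others) / (n - 1) for actor, others in neighbours.items()}

def strong_ties(ties, min_contacts):
    """Ties with frequent contact, used here as a stand-in for strong ties."""
    return [(a, b) for a, b, contacts in ties if contacts >= min_contacts]

centrality = degree_centrality(ties)
print(centrality)
print(strong_ties(ties, min_contacts=5))
```

Here "ana" is tied to every other actor, which makes her a plausible starting point when studying how influence or adoption spreads through this small network.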
COLLECTING DATA FOR SOCIAL NETWORK ANALYSIS
In the age of the internet, the collection of social network data is based on methods directly related to data
mining.
For consumer and marketing researchers interested in WOM, influence, and the diffusion of products,
communications and ideas on the internet, SNA can be a valuable approach. For applied marketers,
SNA can be extremely useful for informing segmentation, targeting and positioning decisions as well as
directing tactical efforts.
INTRODUCTION TO NETNOGRAPHY
Both data mining and SNA look at the qualitative data on the internet as a type of content that must be
decontextualized or processed in different ways in order to reveal more general patterns of common
topics, structures or influence. However, there is another, complementary way to view this qualitative
data, called netnography. In netnography, researchers view qualitative data as indicative of cultures or
communities. Using techniques prevalent in anthropology and sociology, marketing and consumer
researchers can study social media and online communities as cultural phenomena.
Social media are media for communication that use accessible and scalable formats and that are open to
large groups, or even to the entire public. If we consider these media to be truly social, then social
methods that study the interactions between people as a cultural phenomenon are entirely appropriate,
and can reveal important aspects of online behaviour, such as the values, meanings, language, rituals
and other symbol systems that consumers create when they share and create culture online.
Netnography is a form of ethnographic research adapted to the unique contingencies of various types of
computer-mediated social interactions. Netnography offers a common language, a common
understanding and a common set of standards for engaging in research practice.
First we need to understand the differences between face-to-face social interactions and online social
media interactions. There are 4 main differences:

• Alteration. The nature of the interaction is altered, both constrained and liberated, by the specific
nature and rules of the technological medium in which it is carried out (e.g. netiquette conventions
such as emoticons: 😉 (^_-)).
• Anonymity. Anonymity or pseudonymity alters social interactions online.
• Accessibility. Many online forums are open to participation by anyone. They comprise a hybrid
form of public and private communications.
• Archiving. Communications that take place in digital format are instantly archived.
DATA COLLECTION IN NETNOGRAPHY
Netnography is ethnography in the social spaces of online environments. It involves taking an active
approach to online research that seeks to maintain and analyse the cultural qualities of online
interaction. It is about researcher immersion in the full cultural complexity of online social worlds.
Therefore data collection in netnography means learning deeply from, and consequently communicating
meaningfully with, members of an online community.
Netnographic participation drives netnographic data collection. Participation does not necessarily mean
reaching out to members with posts which ask them questions, as in interviews.
Although participation can be visible to other community members, and preferably will contribute to
their communal interests and wellbeing, it can also involve other types of actions. The key guideline is
that a netnographer should participate in the community at a level that is appropriate for a member.
Like ethnography, netnography is based on the twin and intertwining methodological pillars of
participation and observation. Netnography involves coding, but it is not merely coding.
[WHAT IS NETNOGRAPHIC PARTICIPATION?]
There are 3 types of data:
1. Archival data. Data that the researcher copies from pre-existing files and records or, less often,
creates himself. Archival data is purely observational in that it has usually been created and
shared by social media community members or by third parties like the library or the congress.
This is data the researcher has not been directly involved in creating or prompting.
2. Elicited data. This is the data the researcher co-creates alongside social media community members.
It includes the researcher's postings and comments. Because this data has the researcher's influence
built into it, it is analysed differently from data that does not have his stamp of influence, a
sorting of data not usually possible in collected ethnographic data. However, this is not
contaminated or impure data by any means, as it can be much more focused and valuable in
answering crucial research questions.
3. Fieldnote data. This comes from the personal and research-related descriptive and reflexive fieldnotes
that a researcher has created. This inscription process should capture the netnographer's
impressions and observations of the social media community, its members and memberships, its
practices, members' social interactions and meanings, the researcher's own participation and
sense of membership, and much more.
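One way to keep the three types apart during analysis is to tag every record with its type when it is stored. The sketch below is only an illustration: the record fields, the example texts and the names Source, Record and split_by_source are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum

class Source(Enum):
    ARCHIVAL = "archival"    # copied from pre-existing posts and records
    ELICITED = "elicited"    # co-created by researcher and members
    FIELDNOTE = "fieldnote"  # the researcher's own notes

@dataclass
class Record:
    source: Source
    author: str
    text: str

def split_by_source(records):
    """Group records by type so that elicited data can be analysed separately."""
    groups = {source: [] for source in Source}
    for record in records:
        groups[record.source].append(record)
    return groups

# Hypothetical records from a small study.
records = [
    Record(Source.ARCHIVAL, "member_a", "Old thread about coffee grinders"),
    Record(Source.ELICITED, "researcher", "What made you switch brands?"),
    Record(Source.ELICITED, "member_b", "Reply to the researcher's question"),
    Record(Source.FIELDNOTE, "researcher", "Members greet newcomers warmly"),
]
groups = split_by_source(records)
print({source.value: len(items) for source, items in groups.items()})
```

Tagging at collection time is what makes the sorting described under elicited data possible later on.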
BASIC PRINCIPLES OF ONLINE DATA CAPTURE AND COLLECTION
Netnographers can use manual or automated data capture and analysis. Both can be used very effectively.
Manual data collection and analysis involves saving computer files on a hard drive and coding either in
document programs like Word, or in spreadsheet or database programs. This makes sense when the data is
kept to a smaller size (500 to 1,000 pages).
Automated data collection, which uses qualitative data analysis software programs, can be used to
handle larger and less focused projects with a greater amount of data. When the study concerns a large,
active community or is exploratory, this is the technique to use.
There are two ways to capture online data: saving and capturing. The first
is to save the data as a computer-readable file. The second is to capture it as a visual image of the
computer screen. Saving is more appropriate for highly textual messages where the researcher considers
that other elements of the context, such as page design and visuals, are unimportant.
Capturing is more appropriate when the opposite is true.
GETTING READY FOR NETNOGRAPHIC DATA COLLECTION
Archiving and accessibility are 2 major differences between online fieldwork and traditional
face-to-face ethnography that create a different research environment when it comes to data collection.
HOW DO YOU CHOOSE SPECIFIC SITES FOR NETNOGRAPHY?
Process for netnography:
1. Devise your research question
2. Run a simple Google search using terms and keywords related to the topic
3. Search on Twitter
4. Search Facebook for relevant groups
5. Click through the first 2/3 pages of sites or pages
6. Have a look at 10 to 20 different sites or locations of social media communities
7. Try to have at least 5 from blogs and forums, and at least 5 from social media like Twitter and
Facebook
8. Read carefully
9. Write down what you believe to be the most important criteria you should look for in a site you
will investigate
Criteria for selecting social media sites:
1. Relevant. They can inform and clearly link to your stated research question
2. Active. Possessing both recent and regular communications between members
3. Interactive. Manifesting a flow of question-answer or posting-comment responsive
communications between participants in the group
4. Substantial. Offering a critical mass of communications and a lively, energised cultural
atmosphere
5. Heterogeneous. Indicating a good number of different participants
6. Data-rich. Offering data that is significantly detailed or descriptively rich.
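The six criteria can be turned into a simple comparison table for candidate sites. The sketch below assumes the researcher rates each site from 0 to 5 on every criterion and adds the ratings up; the sites, the ratings and the equal weighting are all hypothetical, and the notes do not prescribe any numeric scoring.

```python
CRITERIA = ("relevant", "active", "interactive",
            "substantial", "heterogeneous", "data_rich")

def rank_sites(ratings):
    """ratings: {site: {criterion: 0-5}}; returns (site, total) pairs, best first."""
    totals = {site: sum(scores.get(c, 0) for c in CRITERIA)
              for site, scores in ratings.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

# Hypothetical ratings made after reading through each candidate site.
ratings = {
    "forum_a": dict(zip(CRITERIA, (5, 4, 4, 3, 4, 5))),
    "blog_b": dict(zip(CRITERIA, (5, 2, 1, 2, 1, 4))),
    "group_c": dict(zip(CRITERIA, (3, 5, 5, 4, 3, 2))),
}
ranking = rank_sites(ratings)
print(ranking)
```

A total like this is only a prompt for judgement: a site that scores low on relevance should probably be dropped whatever its other ratings.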

Doing your homework means that you should thoroughly read through current messages, archives,
community rules and FAQs, and record what you learn in fieldnotes.
The goal is to be familiar with the social media community's members, topics, language, rituals, norms
and processes.
COLLECTING DATA AND CHOOSING DATA ANALYSIS SOFTWARE
ONLINE INTERVIEWS
