Content analysis (CA)

Two categories: conceptual (presence and frequencies) and relational (the more
difficult and subjective of the two)

Steps for conceptual CA

- decide the level of analysis (word, group of words or sentence)
- decide how many different concepts will be coded (a predefined set of codes
or a flexible one)
- decide whether to code only the presence of a concept or also its frequency
of appearance (frequency highlights where the focus lies)
- decide the level of generalization: a single word or the entire family of
words, only explicit occurrences or also implicit ones
- develop consistent coding rules, applied uniformly across lines, texts,
phrases, etc. (vital!)
- decide what to do with "irrelevant" information (Weber 1990 suggests
ignoring it, but it can also be recoded)
- analyze the results
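
The steps above can be sketched in a few lines of Python. This is only a minimal illustration, assuming a predefined codebook at word level; the names CODEBOOK and code_text and the sample sentence are hypothetical, not part of any standard tool.

```python
import re
from collections import Counter

# Hypothetical codebook: each concept maps to the explicit word forms
# (the "family of words") that count as an occurrence of that concept.
CODEBOOK = {
    "economy": {"economy", "economies", "economic"},
    "market": {"market", "markets"},
}

def code_text(text, codebook, count_frequency=True):
    """Code a text at word level against a predefined set of concepts.

    Returns {concept: frequency}, or {concept: 0/1} when only presence is coded.
    """
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for word in words:
        for concept, forms in codebook.items():
            if word in forms:
                counts[concept] += 1
    # Words that match no concept are treated as "irrelevant" and ignored.
    return {
        concept: counts[concept] if count_frequency else int(counts[concept] > 0)
        for concept in codebook
    }

sample = "The market fell. Emerging economies watched the markets; economic news was grim."
frequencies = code_text(sample, CODEBOOK)
presence = code_text(sample, CODEBOOK, count_frequency=False)
```

Switching count_frequency off corresponds to coding presence only; widening or narrowing the word sets in the codebook corresponds to the level of generalization.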

Steps for the analysis of relational (semantic) CA

Important: there are three sub-categories of relational CA (RCA):

1) affect extraction: emotional evaluation of explicit concepts in a text, to
which we assign numeric values on the basis of emotional/psychological
scales, in order to describe the emotional message transmitted by the
speaker or writer (Gottschalk 1995)
2) proximity analysis: measurement of how often pairs of words appear together
(co-occurrences) in a text. The text is viewed as a single chain of words,
scanned through "windows" of a conventional length (a given number of
words). We scan all windows across the full text to see how many times
certain concepts appear together. The result is a concept matrix that can
suggest a particular meaning
3) cognitive mapping: the pairs and relationships previously identified are
represented visually in order to allow comparisons and identify mental
patterns; language is the key element, hence the difficulties with foreign
languages and artificial intelligence (Palmquist, Carley & Dale, 1997)
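
Proximity analysis (sub-category 2) lends itself to a short sketch. The version below assumes a window that slides one word at a time and counts, for each pair of concepts, the number of windows in which both occur; the function name, the window size and the sample text are illustrative assumptions, and other windowing conventions are possible.

```python
import re
from collections import Counter
from itertools import combinations

def cooccurrence_counts(text, concepts, window_size=5):
    """Count, per concept pair, the windows in which both concepts appear."""
    concepts = set(concepts)
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    # Treat the text as one chain of words and slide a fixed-length window over it.
    for start in range(max(1, len(words) - window_size + 1)):
        window = words[start:start + window_size]
        present = sorted(set(window) & concepts)
        for pair in combinations(present, 2):
            counts[pair] += 1
    return counts

matrix = cooccurrence_counts(
    "bank loan risk profit bank",
    concepts={"bank", "loan", "risk"},
    window_size=3,
)
```

The resulting counter plays the role of the concept matrix: each key is a pair of concepts and each value is how many windows contained both.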

Actual steps:
1. Identify the question (what message did the author want to convey?)
2. Choose a representative text for analysis (neither too much nor too
little)
3. Choose among the three sub-categories of RCA
4. Determine the level of analysis: word, set of words or sentence
5. Reduce the text to categories and codes (noting, for example, how often
words are used with a double meaning)
6. Explore the links between concepts: strength, sign and direction
6.a. Strength = the degree/extent to which two or more concepts are
interlinked, concentrated in certain sections of text, etc.
6.b. Sign = concepts can be linked positively or negatively (e.g., bull
market and bear market, an emerging economy and a Western economy)
6.c. Direction = the type of relationship, for example X implies Y, X occurs
before Y, if X then Y, etc., showing which concept makes the "first move";
there are also bidirectional relationships, or relationships with equal
influence
7. Code the relationships (important or filler: ambiguous words, for example,
may be mere filler while the speaker thinks, or may carry important
information about earlier statements)
8. Run statistical analyses based on the earlier codes, to identify
relationships between variables
9. Represent the concepts and relationships on a map, in chart/graphic form
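
As a rough sketch of steps 6 to 9, each coded relationship can be stored with its strength, sign and direction, and then turned into a simple graph structure ready for plotting. The Relation class, the to_adjacency helper and all numeric values below are hypothetical choices made for illustration only.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Relation:
    source: str                  # concept that makes the "first move"
    target: str
    strength: float              # assumed scale: 0 (unrelated) to 1 (tightly linked)
    sign: int                    # +1 for a positive link, -1 for a negative link
    bidirectional: bool = False  # True when influence runs both ways

def to_adjacency(relations):
    """Build {concept: [(neighbor, signed_strength), ...]} from coded relations."""
    graph = {}
    for r in relations:
        weight = r.sign * r.strength
        graph.setdefault(r.source, []).append((r.target, weight))
        graph.setdefault(r.target, [])
        if r.bidirectional:
            graph[r.target].append((r.source, weight))
    return graph

# Illustrative codes only; the values are made up for the example.
relations = [
    Relation("bull market", "bear market", strength=0.8, sign=-1, bidirectional=True),
    Relation("emerging economy", "bull market", strength=0.5, sign=+1),
]
concept_map = to_adjacency(relations)
```

The adjacency dictionary can be handed to any charting or graph-drawing tool to produce the visual map; the signed weights also give the statistical step something numeric to work with.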

Reliability and validity for CA (Cohen's kappa, Scott's pi):

- stability (consistency at different moments in time for one coder, or
between coders); extremely difficult for "live" situations that are hard to
record or cannot be repeated (Krippendorff 1980; Johnson & Bolstad 1973)

- replication (replication or duplication in different circumstances, contexts, different


- accuracy - compliance with a particular standard, deemed a "correct measure"


- internal validity: the classification schemes or categories are
representative of the research hypotheses

- external validity: correspondence of the present results with past and
future ones; types: construct, hypothesis, predictive and semantic
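
The two agreement coefficients named in the heading of this section can be computed as follows for two coders who labeled the same units. This is a minimal sketch using the usual textbook definitions: both coefficients are (observed agreement - expected agreement) / (1 - expected agreement), and they differ only in how expected agreement is estimated. The function names and the sample labels are assumptions made for the example.

```python
from collections import Counter

def observed_agreement(a, b):
    """Share of units on which the two coders assigned the same category."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohen_kappa(a, b):
    """Cohen's kappa: chance agreement from each coder's own category shares."""
    n = len(a)
    po = observed_agreement(a, b)
    ca, cb = Counter(a), Counter(b)
    pe = sum((ca[k] / n) * (cb[k] / n) for k in set(a) | set(b))
    return (po - pe) / (1 - pe)  # undefined (division by zero) when pe == 1

def scott_pi(a, b):
    """Scott's pi: chance agreement from the pooled category shares of both coders."""
    n = len(a)
    po = observed_agreement(a, b)
    pooled = Counter(a) + Counter(b)
    pe = sum((pooled[k] / (2 * n)) ** 2 for k in pooled)
    return (po - pe) / (1 - pe)  # undefined (division by zero) when pe == 1

coder_a = ["pos", "pos", "neg", "neg"]
coder_b = ["pos", "neg", "neg", "neg"]
kappa = cohen_kappa(coder_a, coder_b)  # 0.5 for these labels
pi = scott_pi(coder_a, coder_b)        # 7/15, about 0.467
```

The same functions can serve a stability check by passing the codes one coder produced at two different moments in time.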

Concepts in CA

Category = group of words with similar meanings or connotations


- Categories must be mutually exclusive

- Categories must be exhaustive
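
Both requirements can be checked mechanically once categories are written down as word sets. The sketch below assumes that "exhaustive" is judged against a known vocabulary of words to be coded; check_categories and the sample data are hypothetical.

```python
def check_categories(categories, vocabulary):
    """Return (overlaps, uncovered) for a category scheme.

    overlaps:  {word: [categories containing it]} for words in 2+ categories
               (violations of mutual exclusivity)
    uncovered: words of the vocabulary that belong to no category
               (violations of exhaustiveness)
    """
    first_seen = {}
    overlaps = {}
    for name, words in categories.items():
        for word in words:
            if word in first_seen:
                overlaps.setdefault(word, [first_seen[word]]).append(name)
            else:
                first_seen[word] = name
    uncovered = set(vocabulary) - set(first_seen)
    return overlaps, uncovered

categories = {
    "positive": {"good", "great"},
    "negative": {"bad", "great"},  # "great" is deliberately misplaced here
}
overlaps, uncovered = check_categories(categories, {"good", "great", "bad", "awful"})
```

An empty overlaps dictionary and an empty uncovered set mean the scheme satisfies both conditions for the given vocabulary.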

Coding (encoding) types:

- a priori
- emergent

Defining coding units:

- defining based on physical units, following their natural boundaries
(poem, article, speech, etc.)
- defining based on syntax (phrase, sentence, paragraph, etc.)
- defining based on referential units (the ways something is referred to,
e.g. full name vs. label such as "president" vs. nickname)
- defining based on propositional units (breaking the initial text into
propositions reflecting basic messages)
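
Of the options above, syntactic units are the easiest to illustrate in code. The sketch below splits a text into paragraphs, sentences or words with simple regular expressions; it is a deliberately naive assumption about where boundaries fall (abbreviations such as "etc." will be split incorrectly), and split_units is a hypothetical helper.

```python
import re

def split_units(text, level):
    """Split text into coding units at the given syntactic level."""
    if level == "paragraph":
        parts = re.split(r"\n\s*\n", text)          # blank line ends a paragraph
    elif level == "sentence":
        parts = re.split(r"(?<=[.!?])\s+", text)    # whitespace after . ! or ?
    elif level == "word":
        parts = re.findall(r"[A-Za-z']+", text)
    else:
        raise ValueError(f"unknown level: {level!r}")
    return [p.strip() for p in parts if p.strip()]

text = "Prices rose. Wages did not!\n\nWho pays?"
paragraphs = split_units(text, "paragraph")
sentences = split_units(text, "sentence")
words = split_units(text, "word")
```

Referential and propositional units depend on meaning rather than punctuation, so they generally need human judgment or more elaborate tooling.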

Types of units:

- sampling
- context
- recording