2. INFORMATION THEORY
Information theory is the mathematical treatment of the concepts, parameters and rules
governing the transmission of messages through communication systems. It was founded by
Claude Shannon toward the middle of the twentieth century and has since then evolved into a
vigorous branch of mathematics fostering the development of other scientific fields, such as
statistics, biology, behavioral science, neuroscience, and statistical mechanics. The techniques
used in information theory are probabilistic in nature and some view information theory as a
branch of probability theory. In a given set of possible events, the information of a message
describing one of these events quantifies the symbols needed to encode the event in an optimal
way. ‘Optimal’ means that the obtained code word will determine the event unambiguously,
isolating it from all others in the set, and will have minimal length, that is, it will consist of a
minimal number of symbols. Information theory also provides methodologies to separate real
information from noise and to determine the channel capacity required for optimal transmission
conditioned on the transmission rate.
Information theory is closely associated with a collection of pure and applied disciplines
that have been investigated and reduced to engineering practice under a variety
of rubrics throughout the world over the past half-century or more: adaptive
systems, anticipatory systems, artificial intelligence, complex systems, complexity
science, cybernetics, informatics, machine learning, along with systems sciences of many
descriptions. Information theory is a broad and deep mathematical theory, with equally broad
and deep applications, amongst which is the vital field of coding theory.
Coding theory is concerned with finding explicit methods, called codes, for increasing the
efficiency and reducing the error rate of data communication over noisy channels to near the
channel capacity. These codes can be roughly subdivided into data compression (source coding)
and error-correction (channel coding) techniques. In the latter case, it took many years to find the
methods Shannon's work proved were possible.
H(S) = −∑S P(s)log2 P(s)
The subscript S underneath the summation simply means to sum over all possible
stimuli S=[1, 2 … 8]. This expression is called “entropy” because it is similar to the definition of
entropy in thermodynamics. Thus, the preceding expression is sometimes referred to as
“Shannon entropy.” The entropy of the stimulus can be intuitively understood as “how long of a
message (in bits) do I need to convey the value of the stimulus?” For example, suppose the
center-out task had only two peripheral targets (“left” and “right”), which appeared with an equal
probability. It would take only one bit (a 0 or a 1) to convey which target appeared; hence, you
would expect the entropy of this stimulus to be 1 bit. That is what the preceding expression gives
you, as P(S)=0.5 and log2(0.5)=−1. The center-out stimulus in the dataset can take on eight
possible values with equal probability, so you expect its entropy to be 3 bits. However, the
entropy of the observed stimuli will actually be slightly less than 3 bits because the observed
probabilities are not exactly uniform.
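To make this concrete, here is a minimal Python sketch (the observed probabilities below are hypothetical stand-ins, not the actual dataset) that computes the Shannon entropy of the stimulus:

    import math

    def entropy(probs):
        # Shannon entropy in bits: H = -sum of p * log2(p), skipping zero-probability events
        return -sum(p * math.log2(p) for p in probs if p > 0)

    # Eight equally likely stimuli: entropy is exactly 3 bits
    print(entropy([1/8] * 8))        # 3.0

    # Slightly non-uniform observed probabilities: entropy falls just below 3 bits
    observed = [0.13, 0.12, 0.13, 0.12, 0.13, 0.12, 0.13, 0.12]
    print(entropy(observed))         # about 2.999, slightly below 3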
Next, you want to measure the entropy of the stimulus given the response, H(S|R). For
one particular stimulus, the entropy is defined similarly to the previous equation:
H(S|r) = −∑S P(s|r)log2 P(s|r)
To get the entropy H(S|R), you just average over all possible responses:
H(S|R) = −∑R∑S P(r)P(s|r)log2 P(s|r)
Now you can define the information that the response contains about the stimulus. This is known
as mutual information (designated I), and it is the difference between the two entropy values just
defined:
I(R;S) = H(S) − H(S|R) =∑R∑S P(r)P(s|r)log2 [P(s|r)/P(s)]
Why does this make sense? Imagine you divide the response into eight bins and that each
stimulus is perfectly paired with one response. In this case, the entropy H(S|R) would be 0 bits,
because given the response, there is no uncertainty about what the stimulus was. You already
decided that H(S) was theoretically 3 bits, so the mutual information I(R;S) would be 3 bits − 0
bits=3 bits. This confirms that the response has perfect information about the stimulus.
Suppose instead that you divide the response into two bins, and that one bin corresponds to
stimuli 1–4 and the other bin corresponds to stimuli 5–8. Each bin has four equally likely
choices, so the entropy H(S|R) will be 2 bits. Now the mutual information is I(R;S) = 3 bits − 2
bits = 1 bit. This means that the response allows you to reduce the uncertainty about the stimulus by a
factor of 2. This makes sense because the response divides the stimuli into two equally likely
groups. This also emphasizes that the choice of bins affects the value of the mutual information.
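The two cases above can be checked with a short Python sketch (a hypothetical construction, not the recorded data) that computes I(R;S) = H(S) − H(S|R) directly from a joint distribution of responses and stimuli:

    import math

    def entropy(probs):
        # Shannon entropy in bits, skipping zero-probability events
        return -sum(p * math.log2(p) for p in probs if p > 0)

    def mutual_information(joint):
        # joint[r][s] = P(r, s); returns I(R;S) = H(S) - H(S|R)
        p_s = [sum(row[s] for row in joint) for s in range(len(joint[0]))]
        h_s_given_r = 0.0
        for row in joint:
            p_r = sum(row)
            if p_r > 0:
                h_s_given_r += p_r * entropy([p / p_r for p in row])
        return entropy(p_s) - h_s_given_r

    # Case 1: eight response bins, each perfectly paired with one stimulus
    perfect = [[1/8 if r == s else 0 for s in range(8)] for r in range(8)]
    print(mutual_information(perfect))   # 3.0 bits

    # Case 2: two response bins covering stimuli 1-4 and 5-8 respectively
    two_bins = [[1/8 if s < 4 else 0 for s in range(8)],
                [1/8 if s >= 4 else 0 for s in range(8)]]
    print(mutual_information(two_bins))  # 1.0 bit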
Note that you can use the definition of conditional probability to rearrange the expression for
mutual information. The following version is easier to use with the table of joint and marginal
probabilities computed earlier. Mutual information can also be defined as follows:
I(R;S) = ∑R∑S P(r,s)log2 [P(r,s)/(P(r)P(s))]
Applying this equation to the joint distribution of the sample neuron gives a mutual information
of 0.50 bits for a rate code.
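Here is a sketch of that rearranged computation, using only the joint and marginal probabilities (the 2×2 joint table below is hypothetical; the sample neuron's actual table, which yields 0.50 bits, is not reproduced here):

    import math

    def mutual_information_joint(joint):
        # I(R;S) = sum over r, s of P(r,s) * log2( P(r,s) / (P(r) * P(s)) )
        p_r = [sum(row) for row in joint]
        p_s = [sum(row[s] for row in joint) for s in range(len(joint[0]))]
        mi = 0.0
        for r, row in enumerate(joint):
            for s, p_rs in enumerate(row):
                if p_rs > 0:
                    mi += p_rs * math.log2(p_rs / (p_r[r] * p_s[s]))
        return mi

    # Hypothetical joint distribution of a binary response and binary stimulus
    joint = [[0.4, 0.1],
             [0.1, 0.4]]
    print(mutual_information_joint(joint))  # about 0.278 bits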
3. MEASUREMENT OF READABILITY
Speed of perception
Perceptibility at a distance
Perceptibility in peripheral vision
Visibility
Reflex blink technique
Rate of work (reading speed)
Eye movements
Fatigue in reading
Cognitively-motivated features
Word difficulty
N-gram analysis
In the reader, those features affecting readability are 1. prior knowledge, 2. reading skill,
3. interest, and 4. motivation. In the text, those features are 1. content, 2. style, 3. design, and 4.
structure. The design can include the medium, layout, illustrations, reading and navigation aids,
typeface, and color. Correct use of type size, line spacing, column width, text-to-background color
contrast, and white space makes text easy to read.
Readability Formula
Readability formulas are formulas for evaluating the readability of text, usually by
counting syllables, words, and sentences. Readability tests are often used as an alternative to
conducting an actual statistical survey of human readers of the subject text (a readability survey).
Word processing applications often have built-in readability tests, which can be run on
documents while editing.
The application of a useful readability test protocol will give a rough indication of a
work's readability, with accuracy increasing when finding the average readability of a large
number of works. The tests generate a score based on characteristics such as statistical average
word length (which is used as a proxy for semantic difficulty) and sentence length (as a proxy
for syntactic complexity) of the work.
Some readability formulas refer to a list of words graded for difficulty. These formulas
attempt to overcome the fact that some words, like "television", are well known to younger
children, but have many syllables. In practice, however, the utility of simple word and sentence
length measures makes them more popular for readability formulas. Scores are
compared with scales based on judged linguistic difficulty or reading grade level. Many
readability formulas measure word length in syllables rather than letters, but only SMOG has a
computerized readability program incorporating an accurate syllable counter.
Since readability formulas do not take the meanings of words into account, they are not
considered definitive measures of readability.
The Flesch Reading Ease Formula is considered one of the oldest and most accurate readability formulas.
Rudolph Flesch, an author, writing consultant, and a supporter of the Plain English Movement,
developed this formula in 1948. Raised in Austria, Rudolph Flesch studied law and earned a
Ph.D. in English from Columbia University. Flesch, through his writings and speeches,
advocated a return to phonics. In his article, A New Readability Yardstick, published in the
Journal of Applied Psychology in 1948, Flesch proposed the Flesch Reading Ease Readability
Formula.
RE = 206.835 − (1.015 × ASL) − (84.6 × ASW)
where:
RE = Readability Ease
ASL = Average Sentence Length (i.e., the number of words divided by the number of
sentences)
ASW = Average number of syllables per word (i.e., the number of syllables divided by
the number of words)
The output, i.e., RE, is a number ranging from 0 to 100. The higher the number, the easier
the text is to read. Scores between 90.0 and 100.0 are considered easily understandable by an
average 5th grader. Scores between 60.0 and 70.0 are considered easily understood by 8th and
9th graders. Scores between 0.0 and 30.0 are considered easily understood by college graduates.
If we were to draw a conclusion from the Flesch Reading Ease Formula, then the best
text should contain shorter sentences and words. A score between 60 and 70 is largely
considered acceptable. The following table is also helpful for assessing the ease of readability of a
document:

Score          Difficulty          School Level
90.0–100.0     Very easy           5th grade
80.0–90.0      Easy                6th grade
70.0–80.0      Fairly easy         7th grade
60.0–70.0      Standard            8th–9th grade
50.0–60.0      Fairly difficult    10th–12th grade
30.0–50.0      Difficult           College
0.0–30.0       Very difficult      College graduate
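As an illustration, here is a rough Python sketch of the formula (the syllable counter is a crude vowel-group heuristic, not an accurate counter of the kind mentioned earlier):

    import re

    def count_syllables(word):
        # Crude heuristic: count groups of consecutive vowels (at least one per word)
        return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

    def flesch_reading_ease(text):
        sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
        words = re.findall(r"[A-Za-z']+", text)
        asl = len(words) / len(sentences)                          # Average Sentence Length
        asw = sum(count_syllables(w) for w in words) / len(words)  # syllables per word
        return 206.835 - (1.015 * asl) - (84.6 * asw)

    # Very simple text scores high (trivially simple text can even exceed 100)
    print(flesch_reading_ease("The cat sat on the mat. It was happy."))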
The Gunning Fog Index Readability Formula, often simply called the FOG Index, is attributed
to American textbook publisher Robert Gunning, a graduate of Ohio State
University. Gunning observed that most high school graduates were unable to read well. Much of this
reading problem was a writing problem. His opinion was that newspapers and business
documents were full of “fog” and unnecessary complexity. Gunning realized the problem quite
early and became the first to take the new readability research into the workplace. Gunning
founded the first consulting firm specializing in readability in 1944. He spent the next few years
testing and working with more than 60 large city daily newspapers and popular magazines,
helping writers and editors write to their audience.
Step 1: Take a sample passage of at least 100 words and count the exact number of words and
sentences.
Step 2: Divide the total number of words in the sample by the number of sentences to arrive at
the Average Sentence Length (ASL).
Step 3: Count the number of words of three or more syllables that are NOT (i) proper nouns, (ii)
combinations of easy words or hyphenated words, or (iii) two-syllable verbs made into three
with -es and -ed endings.
Step 4: Divide this number by the number of words in the sample passage. For example, 25 long
words divided by 100 words gives you 25 Percent Hard Words (PHW).
Step 5: Add the ASL from Step 2 and the PHW from Step 4, then multiply the result by 0.4. The
formula is:
Grade Level = 0.4 × (ASL + PHW)
where,
ASL = Average Sentence Length (i.e., number of words divided by the number of sentences)
PHW = Percent Hard Words (from Step 4)
The underlying message of The Gunning Fog Index formula is that short sentences
written in Plain English achieve a better score than long sentences written in complicated
language.
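The steps above translate into a short Python sketch (the hard-word test here only checks syllable count and ignores the proper-noun, compound, and suffix exceptions from Step 3):

    import re

    def count_syllables(word):
        # Same crude vowel-group heuristic as in the Flesch sketch
        return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

    def gunning_fog(text):
        sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
        words = re.findall(r"[A-Za-z']+", text)
        asl = len(words) / len(sentences)                     # Step 2: Average Sentence Length
        hard = [w for w in words if count_syllables(w) >= 3]  # Step 3, simplified
        phw = 100.0 * len(hard) / len(words)                  # Step 4: Percent Hard Words
        return 0.4 * (asl + phw)                              # Steps 5-6: multiply the sum by 0.4

    print(gunning_fog("Short sentences in plain English score well. Long ones do not."))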
The ideal score for readability with the Fog index is 7 or 8. Anything above 12 is too hard
for most people to read. For instance, the Bible, Shakespeare, and Mark Twain have Fog Indexes
of around 6. Leading magazines, like Time, Newsweek, and the Wall Street Journal, average
around 11.
4. MEDIA
Media are the communication outlets or tools used to store and deliver information or data.
The term refers to components of the mass media communications industry, such as print media,
publishing, the news media, photography, cinema, broadcasting (radio and television), digital
media, and advertising.
The term media in its modern application relating to communication channels was first
used by Canadian communications theorist Marshall McLuhan, who stated in Counterblast
(1954): "The media are not toys; they should not be in the hands of Mother Goose and Peter Pan
executives. They can be entrusted only to new artists because they are art forms." By the mid-
1960s, the term had spread to general use in North America and the United Kingdom. The phrase
"mass media" was, according to H.L. Mencken, used as early as 1923 in the United States.
The term "medium" (the singular form of "media") is defined as "one of the means or
channels of general communication, information, or entertainment in society, as newspapers,
radio, or television."
Media technology has made viewing increasingly easy over time. Children today are
encouraged to use media tools in school and are expected to have a general understanding of the
various technologies available. The internet is arguably one of the most effective media tools for
communication: e-mail, Skype, and Facebook have brought people closer together and created
new online communities. However, some may argue that certain types of media can hinder
face-to-face communication. Even so, media remains an important source of communication.
In a large consumer-driven society, electronic media (such as television) and print media
(such as newspapers) are important for distributing advertisement media. More technologically
advanced societies have access to goods and services through newer media than less
technologically advanced societies. In addition to this "advertising" role, media is nowadays a
tool to share knowledge all around the world. Analysing the evolution of media within society,
Popkin (2006) assesses the important role of media in building connections between politics,
culture, economic life, and society: the periodical newspaper, for instance, was first an
opportunity to advertise and second a way to stay up-to-date on current foreign affairs and the
nation's economic situation. Meanwhile, Willinsky (2008) promoted the role of modern
technology as a way to cross cultural, gender, and national barriers. He saw the internet as an
opportunity to establish a fair and equal system of knowledge: since the internet may be
accessible to anyone, any published information may be read and consulted by anyone. The
internet is therefore a sustainable solution for overcoming the "gap" between developed and
developing countries, as both get a chance to learn from each other. Canagarajah (2017)
addresses the issue of unbalanced relations between Northern and Southern countries, asserting
that Western countries tend to impose their own ideas on developing countries. The internet is
thus a way to re-establish balance, for instance by enhancing the publication of newspapers and
academic journals from developing countries. Christen (2013) created a system that provides
access to knowledge while protecting people's customs and culture. Indeed, in some traditional
societies, certain genders cannot have access to certain types of knowledge; respecting these
customs limits the scope of dissemination but still allows the diffusion of knowledge. Within this
process of dissemination, media play the role of "intermediaries", that is to say, translating
academic research into a journalistic format accessible to a lay audience (Levin, 2016).
Consequently, media is a modern form of communication aimed at spreading knowledge
throughout the whole world, regardless of any form of discrimination.
Media, through media and communications psychology, has helped to connect diverse
people across near and distant geographical locations. It has also supported online and internet
businesses and other activities that have an online version. All media intended to affect
human behavior is initiated through communication and the intended behavior is couched in
psychology. Therefore, understanding media and communications psychology is fundamental in
understanding the social and individual effects of media. The expanding field of media and
communications psychology combines these established disciplines in a new way.
The timing of change driven by innovation and efficiency may not correlate directly with
technology itself. The information revolution is based on modern advancements. During the 19th
century, the information "boom" rapidly advanced because of postal systems, an increase in
newspaper accessibility, as well as schools "modernizing". These advancements were made due
to the increase of people becoming literate and educated. The methodology of communication
although has changed and dispersed in numerous directions based on the source of its
sociocultural impact. Biases in the media that affect religious or ethnic minorities take the form
of racism in the media and religious bias in the media.