NLP SHORT DEFINITIONS
Collocations:
A collocation is an expression consisting of two or more words that corresponds to some
conventional way of saying things, for example strong tea (rather than powerful tea).

Probability:
Probability theory deals with predicting how likely it is that something will happen. For
example, if one tosses three coins, how likely is it that they will all come up heads?
Although our eventual aim is to look at language, we begin with some examples with
coins and dice, since their behavior is simpler and more straightforward.

Probabilities are numbers between 0 and 1, where 0 indicates impossibility and 1
certainty.
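
For the three-coin question above: each coin independently comes up heads with
probability 1/2, so

    P(three heads) = 1/2 × 1/2 × 1/2 = 1/8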

Conditional probability:
Sometimes we have partial knowledge about the outcome of an experiment and that
naturally influences what experimental outcomes are possible. We capture this
knowledge through the notion of conditional probability. This is the updated probability
of an event given some knowledge.
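
Formally, for P(B) > 0, the conditional probability of an event A given an event B is

    P(A|B) = P(A ∩ B) / P(B)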

PRIOR PROBABILITY:
The probability of an event before we consider our additional knowledge is called the prior
probability of the event.

POSTERIOR PROBABILITY:
the new probability that results from using our additional knowledge is referred to as the
posterior probability of the event.

CHAIN RULE:
The chain rule generalizes the multiplication rule P(A ∩ B) = P(B) P(A|B) to multiple
events. It is a central result used in many places in Statistical NLP, such as
working out the properties of Markov models.
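
For events A1, A2, ..., An the chain rule states:

    P(A1 ∩ A2 ∩ ... ∩ An) = P(A1) P(A2|A1) P(A3|A1 ∩ A2) ... P(An|A1 ∩ ... ∩ An-1)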

BAYES’ THEOREM:
Bayes' theorem lets us swap the order of dependence between events. That is, it lets us
calculate P(B|A) in terms of P(A|B). This is useful when the former quantity is difficult to
determine. It is a central tool that we will use again and again, but it is a trivial consequence of
the definition of conditional probability and the chain rule introduced above.
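
In the same notation, Bayes' theorem reads:

    P(B|A) = P(A|B) P(B) / P(A)

Here P(B) plays the role of the prior probability and P(B|A) of the posterior probability
defined above.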

EXPECTATION (MEAN):
The expectation is the mean or average of a random variable.
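
For a discrete random variable X taking values x with probability p(x), this is

    E(X) = Σ x p(x)    (summing over all values x)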

Variance:
The variance of a random variable is a measure of whether the values of the random variable
tend to be consistent over trials or to vary a lot. One measures it by finding out how much on
average the variable’s values deviate from the variable’s expectation.
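
In the notation above:

    Var(X) = E((X − E(X))²) = E(X²) − (E(X))²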

SYNTACTIC CATEGORIES:
Linguists group the words of a language into classes (sets) which show similar syntactic
behavior, and often a typical semantic type. These word classes are otherwise called syntactic
or grammatical categories.

Tags:
There are well-established sets of abbreviations for naming these classes, usually
referred to as POS tags (for example, NN for a singular noun and VB for a base-form verb
in the Penn Treebank tag set).

Semantics:
Semantics is the study of the meaning of words, constructions, and utterances. We can divide
semantics into two parts, the study of the meaning of individual words (or lexical semantics)
and the study of how meanings of individual words are combined into the meaning of
sentences (or even larger units). One way to approach lexical semantics is to study how word
meanings are related to each other, for example by organizing words into a lexical hierarchy.

Idiom:
If the relationship between the meaning of the words and the meaning of the phrase is
completely opaque, we call the phrase an idiom.

For example:
the idiom to kick the bucket describes a process, dying, that has nothing to do with kicking and
buckets.

Good-Turing estimation:
Good (1953) attributes to Turing a method for determining frequency or probability estimates
of items, on the assumption that their distribution is binomial. This method is suitable for large
numbers of observations of data drawn from a large vocabulary, and works well for n-grams,
despite the fact that words and n-grams do not have a binomial distribution.
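
The core of the method is an adjusted count r* for an item observed r times, where Nr is
the number of distinct items observed exactly r times:

    r* = (r + 1) N(r+1) / Nr

A minimal Python sketch of this adjustment follows; falling back to the raw count when
N(r+1) is empty is a simplification of the sketch (practical implementations first smooth
the Nr values):

    from collections import Counter

    def good_turing_counts(counts):
        # N_r: how many distinct items were seen exactly r times
        freq_of_freqs = Counter(counts.values())
        adjusted = {}
        for item, r in counts.items():
            n_r = freq_of_freqs[r]
            n_r_next = freq_of_freqs.get(r + 1, 0)
            # r* = (r + 1) * N_{r+1} / N_r; keep the raw count when
            # N_{r+1} is empty (a simplification of this sketch)
            adjusted[item] = (r + 1) * n_r_next / n_r if n_r_next else float(r)
        return adjusted

    # Toy word counts: items seen once share probability mass with items seen twice
    print(good_turing_counts({"the": 3, "cat": 1, "dog": 1, "sat": 2}))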

TAGGING:

Determining the part of speech of each word in its context is referred to as tagging.

Hidden Markov Models:
In an HMM, you don't know the state sequence that the model passes through, but only some
probabilistic function of it.

Why use HMMs:
HMMs are useful when one can think of underlying events probabilistically generating surface
events. One widespread use of this is tagging: assigning parts of speech (or other classifiers) to
the words in a text. We think of there being an underlying Markov chain of parts of speech from
which the actual words of the text are generated, as the sketch below illustrates.
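
To make this concrete, below is a minimal sketch of Viterbi decoding for an HMM tagger.
The two tags, the transition and emission probabilities, and the example words are toy
values invented purely for illustration:

    # Toy HMM: tags are the hidden states, words are the observed surface events
    states = ["NOUN", "VERB"]
    start_p = {"NOUN": 0.6, "VERB": 0.4}            # P(first tag)
    trans_p = {"NOUN": {"NOUN": 0.3, "VERB": 0.7},  # P(tag_i | tag_i-1)
               "VERB": {"NOUN": 0.8, "VERB": 0.2}}
    emit_p = {"NOUN": {"dogs": 0.5, "bark": 0.1},   # P(word | tag)
              "VERB": {"dogs": 0.1, "bark": 0.6}}

    def viterbi(words):
        # best[t][s]: probability of the best tag sequence ending in tag s at position t
        best = [{s: start_p[s] * emit_p[s].get(words[0], 1e-8) for s in states}]
        back = []
        for w in words[1:]:
            col, ptr = {}, {}
            for s in states:
                prev = max(states, key=lambda p: best[-1][p] * trans_p[p][s])
                ptr[s] = prev
                col[s] = best[-1][prev] * trans_p[prev][s] * emit_p[s].get(w, 1e-8)
            best.append(col)
            back.append(ptr)
        # Trace back the most probable tag sequence
        tags = [max(states, key=lambda s: best[-1][s])]
        for ptr in reversed(back):
            tags.append(ptr[tags[-1]])
        return list(reversed(tags))

    print(viterbi(["dogs", "bark"]))  # -> ['NOUN', 'VERB']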

Morphology:
Morphology concerns how words are constructed from basic units of meaning called
morphemes; for example, unhappiness is built from the morphemes un-, happy, and -ness.

Sentence semantics:
The study of the relationship between symbols and meaning; the study of the sense of a text.

Submitted by

Muhammad Azeem
