
LEARNING INSIGHTS #4

Name of Student: JENNIFER D. SAYONG


Program/Course: DEVELOPMENTAL EDUCATION
Subject: ED. 703 MULTIVARIATE ANALYSIS
Professor: DR. MA. MYRNA PEPITO
Date: April 17, 2021

Topic/Topics Discussed: BISERIAL COEFFICIENT, TETRACHORIC COEFFICIENT,
RANK-BISERIAL COEFFICIENT

Reporter: MYNAH BELLE CONCEPTION Q. ORTIGUESA

Learning Insights:

 The Point-Biserial Correlation Coefficient measures the strength of association
between a continuous-level variable (interval or ratio data) and a binary
variable. Binary variables are nominal-scale variables with only two values.
They are also called dichotomous variables, or dummy variables in Regression
Analysis. Binary variables are commonly used to express the existence of a
certain characteristic (e.g., reacted or did not react in a chemistry sample) or
membership in a group of observed specimens (e.g., male or female). If needed
for the analysis, binary variables can also be created artificially by grouping
cases or recoding variables. However, it is not advised to create a binary
variable artificially from ordinal or continuous-level (interval or ratio) data,
because ordinal and continuous-level data contain more variance information than
nominal data and thus make any correlation analysis more reliable. For ordinal
data, use the Spearman Correlation Coefficient rho; for continuous-level
(interval or ratio) data, use Pearson’s Bivariate Correlation Coefficient r. The
Point-Biserial Correlation Coefficient is typically denoted as r_pb.
 Tetrachoric correlation is used to measure rater agreement for binary data,
that is, data with two possible answers (usually right or wrong). The
tetrachoric correlation estimates what the correlation would be if the two
variables were measured on a continuous scale. It is used for a variety of
purposes, including the analysis of scores in Item Response Theory (IRT) and the
conversion of comorbidity statistics to correlation coefficients. This type of
correlation has the advantage that it is not affected by the number of rating
levels or by the marginal proportions for the rating levels.
 The term “tetrachoric correlation” comes from the tetrachoric series, a numerical
method used before the advent of computers. While it’s more common to
estimate correlations with methods like maximum likelihood estimation, there is a
basic formula you can use.
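The two coefficients described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the point-biserial function uses the textbook form r_pb = (M1 - M0) / s_n * sqrt(p * q), and the tetrachoric function uses the commonly cited cosine approximation, which is assumed here to be the kind of "basic formula" meant, since the notes do not state it. The function names and the sample data are hypothetical.

```python
import math
from statistics import mean, pstdev


def point_biserial(binary, values):
    """Point-biserial r_pb between a 0/1 variable and a continuous variable."""
    group1 = [v for b, v in zip(binary, values) if b == 1]
    group0 = [v for b, v in zip(binary, values) if b == 0]
    p = len(group1) / len(values)  # proportion of cases coded 1
    q = 1.0 - p                    # proportion of cases coded 0
    # pstdev divides by n, which makes r_pb equal Pearson's r on the 0/1 coding.
    return (mean(group1) - mean(group0)) / pstdev(values) * math.sqrt(p * q)


def tetrachoric_approx(a, b, c, d):
    """Cosine approximation for a 2x2 table [[a, b], [c, d]].

    a and d count the agreeing cells, b and c the disagreeing cells.
    This is an approximation, not a maximum likelihood estimate.
    """
    if b == 0 or c == 0:
        raise ValueError("approximation is undefined when b or c is zero")
    return math.cos(math.pi / (1.0 + math.sqrt((a * d) / (b * c))))


# Hypothetical data: 1 = attended a review class, 0 = did not; values are scores.
attended = [0, 0, 0, 1, 1, 1, 1, 0]
scores = [52, 60, 58, 75, 80, 71, 68, 55]
print(round(point_biserial(attended, scores), 3))

# Hypothetical rater table: both "yes" = 40, both "no" = 30, disagreements 10 and 20.
print(round(tetrachoric_approx(40, 10, 20, 30), 3))
```

When the agreeing and disagreeing cells balance (a*d equals b*c), the approximation returns a value of essentially zero, which matches the idea of no association between the two raters.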

Name of Student: JENNIFER D. SAYONG


Program/Course: DEVELOPMENTAL EDUCATION
Subject: ED. 703 MULTIVARIATE ANALYSIS
Professor: DR. MA. MYRNA PEPITO
Date: April 17, 2021

Topic/Topics Discussed: Coefficient of correlation

Reporter: ENCISO T. ENRIQUE, JR.

Learning Insights:

 Correlation coefficients are used to measure how strong a relationship is
between two variables. There are several types of correlation coefficient, but
the most popular is Pearson’s. Pearson’s correlation (also called Pearson’s R)
is a correlation coefficient commonly used in linear regression. If you’re
starting out in statistics, you’ll probably learn about Pearson’s R first. In
fact, when anyone refers to the correlation coefficient, they are usually
talking about Pearson’s.
 Correlation between sets of data is a measure of how well they are related.
The most common measure of correlation in statistics is the Pearson
Correlation. The full name is the Pearson Product Moment Correlation (PPMC). It
shows the linear relationship between two sets of data. In simple terms, it
answers the question, “Can I draw a line graph to represent the data?” Two
letters are used to represent the Pearson correlation: the Greek letter rho (ρ)
for a population and the letter “r” for a sample.
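As a rough illustration of the sample coefficient r described above, the sketch below computes it from the standard definition: the sum of cross-products of deviations divided by the square root of the product of the two sums of squared deviations. The function name `pearson_r` and the data are hypothetical and chosen only for this example.

```python
import math


def pearson_r(x, y):
    """Sample Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from each mean.
    cross = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # Sums of squared deviations for each variable.
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    ss_y = sum((yi - mean_y) ** 2 for yi in y)
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical data: hours studied and test scores for five students.
hours = [1, 2, 3, 4, 5]
test_scores = [52, 55, 61, 64, 70]
print(round(pearson_r(hours, test_scores), 3))
```

A value near +1 or -1 means the points lie close to a straight line, while a value near 0 means a straight line describes the data poorly.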
Name of Student: JENNIFER D. SAYONG
Program/Course: DEVELOPMENTAL EDUCATION
Subject: ED. 703 MULTIVARIATE ANALYSIS
Professor: DR. MA. MYRNA PEPITO
Date: April 17, 2021

Topic/Topics Discussed: MULTIVARIATE NORMAL DISTRIBUTION

Reporter: MIRASOL MENDOZA

LEARNING INSIGHTS:

o The normal distribution is one of the most widely encountered distributions.
One of the main reasons is that the normalized sum of independent random
variables tends toward a normal distribution, regardless of the distribution
of the individual variables (for example, you can add a bunch of random
samples that only take on the values -1 and 1, yet the sum itself becomes
approximately normally distributed as the number of samples becomes larger).
This is known as the central limit theorem. But when you have several
normally distributed variables to consider together, the situation becomes a
little more complicated.
o The multivariate normal distribution is among the most important of
multivariate distributions, particularly in statistical inference and the study
of Gaussian processes such as Brownian motion. The distribution arises
naturally from linear transformations of independent normal variables. In
this section, we consider the bivariate normal distribution first, because
explicit results can be given and because graphical interpretations are
possible. Then, with the aid of matrix notation, we discuss the general
multivariate distribution.
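Both ideas above can be checked with a small simulation. This is a minimal sketch using only the Python standard library, not a method taken from the report: the first function adds up values of -1 and +1 and rescales the sum, and the second builds a bivariate normal pair as a linear transformation of two independent standard normals. The function names, the seed, and the value rho = 0.6 are assumptions made for illustration.

```python
import math
import random
from statistics import mean, pstdev


def normalized_sums(n_terms, n_trials, rng):
    """Sum n_terms draws from {-1, +1}, then divide by sqrt(n_terms)."""
    return [
        sum(rng.choice((-1, 1)) for _ in range(n_terms)) / math.sqrt(n_terms)
        for _ in range(n_trials)
    ]


def bivariate_normal(rho, n, rng):
    """Build correlated standard normal pairs from independent normals."""
    pairs = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = z1
        y = rho * z1 + math.sqrt(1 - rho ** 2) * z2  # linear transformation
        pairs.append((x, y))
    return pairs


def sample_corr(pairs):
    """Sample correlation of a list of (x, y) pairs."""
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    mx, my = mean(xs), mean(ys)
    cross = sum((x - mx) * (y - my) for x, y in pairs)
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)


rng = random.Random(0)  # fixed seed so the run is repeatable
z = normalized_sums(100, 2000, rng)
print(round(mean(z), 2), round(pstdev(z), 2))  # should be close to 0 and 1

pairs = bivariate_normal(0.6, 20000, rng)
print(round(sample_corr(pairs), 2))  # should be close to the chosen rho
```

Each -1 or +1 draw has variance 1, so dividing the sum by the square root of the number of terms keeps the spread near 1, which is why the scaled sums look like a standard normal.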
Name of Student: JENNIFER D. SAYONG
Program/Course: DEVELOPMENTAL EDUCATION
Subject: ED. 703 MULTIVARIATE ANALYSIS
Professor: DR. MA. MYRNA PEPITO
Date: April 17, 2021

Topic/Topics Discussed: MEASURES OF CENTRAL TENDENCY AND QUARTILES

Reporter: IRENE MANGLE

LEARNING INSIGHTS:

 A measure of central tendency is a single value that attempts to describe a set of
data by identifying the central position within that set of data. As such, measures
of central tendency are sometimes called measures of central location. They are
also classed as summary statistics. The mean (often called the average) is
probably the measure of central tendency that you are most familiar with, but there
are others, such as the median and the mode.

 The mean, median and mode are all valid measures of central tendency, but
under different conditions, some measures of central tendency become more
appropriate to use than others. In the following sections, we will look at the mean,
mode and median, and learn how to calculate them and under what conditions
they are most appropriate to be used.

 Mean (Average): Represents the sum of all values in a dataset divided by the
total number of the values.
 Median: The middle value in a dataset that is arranged in ascending order (from
the smallest value to the largest value). If a dataset contains an even number of
values, the median of the dataset is the mean of the two middle values.
 Mode: Defines the most frequently occurring value in a dataset. In some cases, a
dataset may contain multiple modes while some datasets may not have any
mode at all.
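The three definitions above can be tried out with Python's standard `statistics` module. This is a small sketch on made-up scores. The helper `modes` is hypothetical and assumes one common convention: a dataset in which every value occurs exactly once is reported as having no mode.

```python
from statistics import mean, median, multimode


def modes(data):
    """Return the list of modes, or an empty list if every value occurs once."""
    found = multimode(data)
    # Assumed convention: all values unique means "no mode".
    if len(data) > 1 and len(found) == len(data):
        return []
    return found


# Hypothetical test scores, already in ascending order.
scores = [70, 75, 75, 80, 85, 90]

print(mean(scores))    # sum of the values divided by the number of values
print(median(scores))  # even count, so the mean of the two middle values
print(modes(scores))   # most frequently occurring value(s)

print(modes([1, 1, 2, 2]))  # a dataset with more than one mode
print(modes([1, 2, 3]))     # a dataset with no mode under the assumed convention
```

Because the score list has six values, the median falls between the third and fourth values, which shows the even-count rule from the definition.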
