
16 PERSONALITY FACTORS (16PF) TEST

NORMS

The 16 Personality Factor Test was administered to a large group (N = 4,449), from which
stratified random sampling was used to draw the final normative sample of 2,500. The sample
was stratified on the following variables: gender, race, age, and education, with the target
figure for each variable taken from 1990 U.S. Census figures (Conn & Rieke, in press-a;
Rivera, n.d.).

The norm sample demographics are summarized as follows. The norm sample comprises 2,500
respondents: 1,245 males (49.8%) and 1,255 females (50.2%). Ages ranged from 15 to 92, with
a mean age of 33.3 years. The sample was 8.4% Caucasian, 12.8% African American, 3.0% Asian
American, 2.3% Native American, and 9.0% Hispanic. Approximately 16% of those in the sample
resided in Northeastern states, 15% in Southeastern states, 28% in North Central states, 14%
in South Central states, and 24% in Western states.

The 16PF's raw scores are converted to standard (sten) scores using a norm table that is
included with the hand-scoring keys. Stens are based on a 10-point scale with a mean of 5.5
and a standard deviation of 2. Every question of the test pertains to one of the sixteen
factors. It is therefore straightforward to tally the answers per factor, which in turn
indicates how to convert raw scores to percentile ranks.
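
To illustrate the sten scale described above, here is a minimal sketch of a linear
raw-to-sten conversion. It is an approximation only: the 16PF itself uses the published norm
table, and the function name `raw_to_sten` and the norm mean and standard deviation in the
usage line are hypothetical values chosen for illustration.

```python
import math


def raw_to_sten(raw, norm_mean, norm_sd):
    """Approximate a sten score with the linear rule sten = 5.5 + 2 * z.

    Assumption: a linear transformation stands in for the published norm
    table, which is what the 16PF actually uses.
    """
    z = (raw - norm_mean) / norm_sd
    # Round half up to the nearest whole sten, then clip to the 1..10 scale.
    sten = math.floor(5.5 + 2.0 * z + 0.5)
    return max(1, min(10, sten))


# Hypothetical factor scale with a norm mean of 12 and a norm SD of 4.
print(raw_to_sten(16, norm_mean=12, norm_sd=4))
```

A raw score one standard deviation above the norm mean lands two stens above the scale
midpoint, which is what a standard deviation of 2 on the sten scale implies.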

RELIABILITY

The test-retest coefficients provide evidence of the stability over time of the various
traits measured by the 16PF. Pearson product-moment correlations were computed for two-week
and two-month test-retest intervals. Examinees for the two-week interval were 204 (77 male,
127 female) university undergraduate and graduate students. Their mean age was 20.5 years,
and their mean education level was 13.8 years. Reliability coefficients for the primary
factors ranged from .69 (Reasoning, Factor B) to .86 (Self-Reliance, Factor Q2), with a mean
of .80. Test-retest coefficients for the global factors were higher, ranging from .84 to .9,
with a mean of .87.

For the two-month interval, the sample comprised 159 university undergraduates (34 male,
125 female). The mean age of this group was 18.8 years, and the mean education level was
12.6 years. For the primary factors, reliability coefficients ranged from .56 (Vigilance,
Factor L) to .79 (Social Boldness, Factor H), with a mean of .70. Test-retest coefficients
for the global factors ranged from .70 to .82, with a mean of .78.
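
As a sketch of how a test-retest coefficient of this kind is obtained, the function below
computes the Pearson product-moment correlation between scores from two administrations of
the same scale. The function name and the score lists are made up for illustration and are
not 16PF data.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson product-moment correlation between two equally long score lists."""
    n = len(first)
    if n != len(second) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_1 = sum(first) / n
    mean_2 = sum(second) / n
    # Sums of cross-products and squared deviations around each mean.
    s_xy = sum((a - mean_1) * (b - mean_2) for a, b in zip(first, second))
    s_xx = sum((a - mean_1) ** 2 for a in first)
    s_yy = sum((b - mean_2) ** 2 for b in second)
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical sten scores of five examinees at test and at retest.
test_scores = [4, 6, 5, 8, 3]
retest_scores = [5, 6, 4, 8, 3]
print(round(pearson_r(test_scores, retest_scores), 2))
```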

INTERNAL CONSISTENCY
As an assessment of scale internal consistency, Cronbach's alpha coefficient for the 16PF
was computed on the general population norm sample of 2,500 adults. Values ranged from .64
(Openness to Change, Factor Q1) to .85 (Social Boldness, Factor H), with an average of .74.
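
For readers unfamiliar with the coefficient, the following is a minimal sketch of the
standard Cronbach's alpha formula, alpha = k / (k - 1) * (1 - sum of item variances /
variance of total scores). The response matrix is invented for illustration and does not
come from the 16PF norm sample.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    items = list(zip(*rows))  # transpose: one tuple of scores per item
    item_variance_sum = sum(variance(scores) for scores in items)
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical answers of four respondents to a three-item scale.
answers = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 5],
]
print(round(cronbach_alpha(answers), 3))
```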

VALIDITY

The construct validity of the 16PF Fifth Edition shows that the test measures sixteen
distinct personality traits. Criterion validity of the 16PF Fifth Edition is shown by its
capacity to predict different criterion scores. Correlation studies between the 16PF Fifth
Edition and the Coopersmith Self-Esteem Inventory, Bell's Adjustment Inventory, and the
Social Skills Inventory were reported. These instruments evaluate self-esteem, adjustment,
and social skills, respectively.

Another instance is the administration of the 16PF Fifth Edition together with the
Myers-Briggs Type Indicator (MBTI). The MBTI is a 126-item questionnaire derived from Carl
Jung's theory of psychological types (Jung, 1971). This theory holds that individual
differences reflect the extent to which individuals prefer certain styles of judgment and
perception. The MBTI recognizes four bipolar psychological indicators:
Extroversion-Introversion, Sensing-Intuitive, Thinking-Feeling, and Judging-Perceptive. The
combinations of these four preferences form the sixteen personality "types."

The 16PF Fifth Edition and the MBTI were administered to a sample of 119 university
students (42 males, 77 females; mean age of 25.3 years; mean education level of 14.4 years).
As expected, the 16PF Fifth Edition's Extroversion global factor correlates positively with
the MBTI Extroversion type (r = .68) and negatively with the Introversion type (r = -.61).
All the primary factor scales that make up 16PF Extroversion show this pattern. To a lesser
degree, the Extroversion global factor correlates positively with Feeling, in which
judgments tend to be personal and subjective (r = .19), and negatively with Thinking, in
which judgments are impersonal and objective (r = -.18).

At the primary level, Liveliness (F), Social Boldness (H), and Self-Reliance (Q2) correlate
only with MBTI's Extroversion and Introversion. Warmth (Factor A) correlates negatively with
Thinking (r = -.32) and positively with Feeling (r = .24), whereas Privateness (Factor N)
shows the opposite pattern, correlating positively with Thinking (r = .27) and negatively
with Feeling (r = -.23). Warmth (A) and Privateness (N) are more strongly linked to the MBTI
Extroversion-Introversion scales.

The Anxiety global factor correlates negatively with Extroversion (r = -.38) and positively
with Introversion (r = .23), possibly reflecting aspects of social desirability common to
low anxiety and high Extroversion. At the primary level, only Emotional Stability (C+) and
Apprehension (O+) correlate significantly with any of the MBTI types. Emotional Stability
correlates positively with MBTI Extroversion (r = .36) and negatively with MBTI Introversion
(r = -.23). Apprehension (O+) correlates negatively with MBTI Extroversion (r = -.32), but is
not significantly related to MBTI Introversion. Anxiety does not correlate with any other
MBTI scales.

The above information illustrates how the 16PF Fifth Edition has been used hand-in-hand
with other tests to demonstrate its criterion validity. As a test of normal personality,
though, the 16PF Fifth Edition has a limited range of predictive value. That is, while
personality is an essential determinant of particular behaviors, other aspects of a person
(e.g., motivation, interests, ability) are also essential in the prediction of future
behavior. In a related vein, the 16PF Fifth Edition must never be the only basis for
decision-making or selection, though it can be beneficial as one component of a selection
battery.

CONCLUSION

According to the journal article entitled ‘A Critique of the 16 Personality Factors (5th
ed.)’, this revised version continues to measure the same sixteen primary personality
factors with enhanced reliability and validity. The 16PF (Fifth Edition) also incorporates
five global factors. The face validity of this instrument is excellent. Norms were
established with a final normative sample of 2,500, and test-retest coefficients have
provided evidence of stability over time for the traits measured by the 16PF.

BIG FIVE PERSONALITY TEST

NORMS
The test measures the five primary components of personality and the thirty underlying
facets. It is presented as a scientific instrument with a high degree of reliability and
validity and a representative, recently assembled norm group. The whole dataset consists of
four hundred ninety-one (491) respondents. Based on criteria such as age, sex, educational
level, nationality, labor market position, completion time, and response variation, a final
subset was created on which the analyses were run and from which the final norm was derived
(Thiel, 2021).

Exploration and selection of norm data:

• Age: An age group ranging from 18 to 67 years old has been selected, because this group
represents the employed population of the Western world.
• Sex: The sex of all respondents is known, as this background question was required.
Strikingly, more women than men took the test. Both sexes are included in the dataset
because together they best represent the employed population of the Western world.
• Education: Because the Big Five Personality Test is expressly designed for average to
higher educated people, it was decided to include only a selection of education levels in
the dataset.
• Nationality: The number of nationalities represented in the original dataset is huge: 217
countries, dependencies, and territories were represented with more than ten respondents.
Because the Big Five Personality Test is developed for the English-speaking population of
the Western world, a country selection was made.
• Labor market position: The respondent was asked about his/her labor market position. Only
the positions Salaried employment, Self-employed/Freelancer, and Officially unemployed were
included in the dataset, because this group best represents the labor market population of
the Western world.
• Work/Sector Industry: The respondent was asked to indicate the sector in which he/she
works. A selection could be made from the twenty-three work sectors included in the
EurOccupations model (Wageindicator.org, 2009).
• Completion Time: The time taken to complete the test is a good way to gauge how seriously
a respondent has answered the questionnaire. It was decided to include only completion
times between five and forty-five minutes in the final dataset.
• Response Variation: The respondents’ response variation is another way to gauge how
seriously a respondent has answered the questionnaire. It was decided to include only a
response variation of five in the final dataset.
• Human Response: Online questionnaires can attract crawlers and bots that fill in the
questionnaires repeatedly. By including a consistency measure, the researcher can remove
unreliable responses from the dataset. The psychometric synonyms consistency measure (Meade
& Craig, 2012) was used to detect artificial and random responses. This measure is computed
by first selecting all item pairs that correlate above .60 across the whole dataset. In
this dataset, nine item pairs were chosen. Next, for every respondent the psychometric
synonym score is computed, which is equal to the within-person correlation across the
selected item pairs. The cut-off value of 0.2 used by Meade & Craig (2012) was applied to
remove artificial responses and responses with a random response pattern from the dataset.
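
The psychometric synonyms screening described in the Human Response item can be sketched as
follows. This is an illustrative reading of the procedure, not the code used for the norm
study: the function names are hypothetical, the .60 and 0.2 thresholds come from the text,
and treating a respondent whose score cannot be computed (no variation across the pairs) as
suspect is an assumption made here.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation, or None when it is undefined (too few points or no variance)."""
    n = len(x)
    if n < 2:
        return None
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        return None
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return s_xy / sqrt(s_xx * s_yy)


def synonym_pairs(responses, threshold=0.60):
    """Item index pairs whose correlation across all respondents exceeds the threshold."""
    columns = list(zip(*responses))
    pairs = []
    for i, j in combinations(range(len(columns)), 2):
        r = pearson(columns[i], columns[j])
        if r is not None and r > threshold:
            pairs.append((i, j))
    return pairs


def synonym_score(person, pairs):
    """Within-person correlation between the first and second items of each pair."""
    first = [person[i] for i, _ in pairs]
    second = [person[j] for _, j in pairs]
    return pearson(first, second)


def flag_careless(responses, threshold=0.60, cutoff=0.2):
    """True for respondents whose synonym score is below the cutoff or undefined."""
    pairs = synonym_pairs(responses, threshold)
    flags = []
    for person in responses:
        score = synonym_score(person, pairs)
        flags.append(score is None or score < cutoff)
    return flags
```

In practice more item pairs than in a toy example are needed for a stable within-person
correlation; the text reports nine pairs for this dataset.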

For a proper norm group, the dataset should accurately reflect the intended group of users,
in this case the Western world labor force. Because a dataset almost never has the same
composition as the intended user group, weighting is applied. The dataset was weighted
based on the distribution in the table below:

Criterion    Group                Population
Sex          Female               50.50%
             Male                 49.50%
Education    Average education    62.60%
             Higher education     37.40%
Age          15-24                17.40%
             25-44                45.50%
             45-64                37.10%
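
One common way to weight a dataset toward a target distribution is to give each respondent
the ratio of the population share to the sample share of his or her group. The sketch below
applies that idea to a single criterion (sex) using the target shares from the table. The
source does not state which weighting procedure was actually used or how the three criteria
were combined, so the function name, the single-criterion approach, and the tiny sample are
all assumptions for illustration.

```python
from collections import Counter


def poststratification_weights(groups, population_shares):
    """Weight per respondent = population share of the group / sample share of the group."""
    counts = Counter(groups)
    n = len(groups)
    return [population_shares[g] / (counts[g] / n) for g in groups]


# Target shares taken from the table above; the sample itself is hypothetical.
target = {"Female": 0.505, "Male": 0.495}
sample = ["Female", "Female", "Female", "Male"]
weights = poststratification_weights(sample, target)
print([round(w, 3) for w in weights])
```

With these weights the over-represented group counts for less and the under-represented
group for more, while the weights still sum to the sample size.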

A widely used standard for norm groups in ‘advisory’ situations is that the norm group must
comprise more than 200 respondents. For recruitment and selection purposes, this is more
than 400. This dataset includes 15,107 respondents and thus very clearly meets this
standard (Thiel, 2021).

Group Differences. If there are significant differences between relevant groups within a
norm group, this could and should be corrected by using separate norm groups. For comparing
group averages and determining effect size, Cohen’s d is used (Cohen, 1992). Effect sizes
close to zero are small; effect sizes larger than 0.8 or smaller than -0.8 are often
considered large. To determine whether norms are needed for specific groups, group
differences by sex, age, and level of education have been examined:

• Sex: Given that no effect sizes beyond -0.8 or 0.8 have been found, it can be concluded
that the use of a single norm for the sexes is justified.

Factor                    Cohen’s d (Male-Female)
Openness to experience     0.128
Conscientiousness          0.121
Extraversion              -0.024
Agreeableness              0.543
Natural reactions          0.252

• Age: Given that no effect sizes beyond -0.8 or 0.8 have been found, it can be concluded
that the use of a single norm for age is justified.

Factor                    Cohen’s d (Old-Young)
Openness to experience     0.156
Conscientiousness         -0.724
Extraversion              -0.151
Agreeableness             -0.552
Natural reactions          0.705

• Level of education: Given that no effect sizes beyond -0.8 or 0.8 have been found, it can
be concluded that the use of a single norm for education level is justified.

Factor                    Cohen’s d (University-Highschool)
Openness to experience    -0.330
Conscientiousness         -0.410
Extraversion              -0.242
Agreeableness             -0.312
Natural reactions          0.334
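
The effect-size check used for these group comparisons can be sketched as below. The
function computes Cohen's d with a pooled standard deviation, which is the usual textbook
form; whether the norm study pooled the variances in exactly this way is an assumption, and
the helper `needs_separate_norms` and the two score lists are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_variance)


def needs_separate_norms(d, threshold=0.8):
    """Apply the rule from the text: only |d| of 0.8 or more counts as large."""
    return abs(d) >= threshold


# Hypothetical factor scores for two groups.
print(cohens_d([2, 4, 6], [1, 3, 5]))

# Largest sex difference reported in the table above (Agreeableness).
print(needs_separate_norms(0.543))
```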

RELIABILITY

Cronbach’s alpha (Cronbach and Shavelson 2004) is a measure of the reliability of
psychometric tests or questionnaires. The value of alpha is an estimate for the lower limit
of reliability of the test in question (Thiel, 2021).

Factors: An often-used criterion for instruments used in advisory situations is that the
reliability coefficient of Cronbach’s alpha should not be lower than .60. Scores higher than
.80 are assessed as ‘good’. On average across the five factors, the reliability coefficient
is 0.88, which may be considered very high.

Factors                   Item count   Cronbach’s Alpha
Openness to experience    24           0.81564
Conscientiousness         24           0.90888
Extraversion              24           0.89249
Agreeableness             24           0.86265
Natural reactions         24           0.91921

If an item does not correlate sufficiently with the other items of the same factor, it
damages the reliability of said factor.

Facets: The average Cronbach’s Alpha of the 30 facets is 0.753, which is a good performance
considering the length of the scales.

Factors                   Facets                       Items   Cronbach’s Alpha
Openness to experience    Imagination                  4       0.77087
                          Artistic interests           4       0.73223
                          Depth of emotions            4       0.65264
                          Willingness to experiment    4       0.66261
                          Intellectual curiosity       4       0.69076
                          Tolerance for diversity      4       0.49487
Conscientiousness         Sense of competence          4       0.72658
                          Orderliness                  4       0.83003
                          Sense of responsibility      4       0.69670
                          Achievement striving         4       0.75943
                          Self-discipline              4       0.74301
                          Deliberateness               4       0.86425
Extraversion              Warmth                       4       0.80829
                          Gregariousness               4       0.81566
                          Assertiveness                4       0.86924
                          Activity level               4       0.71323
                          Excitement seeking           4       0.65720
                          Positive emotions            4       0.81739
Agreeableness             Trust in others              4       0.84768
                          Sincerity                    4       0.74710
                          Altruism                     4       0.73092
                          Compliance                   4       0.66146
                          Modesty                      4       0.73948
                          Sympathy                     4       0.72971
Natural reactions         Anxiety                      4       0.82595
                          Angry hostility              4       0.86720
                          Moodiness/Contentment        4       0.85947
                          Self-consciousness           4       0.70584
                          Self-indulgence              4       0.76216
                          Sensitivity to stress        4       0.79752

CONSTRUCT VALIDITY:
Screeplot: In factor analysis, a screeplot or eigenvalue diagram is a graph in which the
eigenvalues of the candidate components are plotted in order of decreasing magnitude. The
screeplot shows that there are 5 clear components (PC) with an eigenvalue > 1.0. This
corresponds with the well-known scientific literature, which states that personality
contains 5 components.

Principal Components Analysis: Principal component analysis is a multivariate method of
analysis in statistics used to describe a large amount of data with a smaller number of
relevant quantities, the main components or principal components.

The table below shows the results of a PCA with varimax rotation. The 30 facets can clearly
be reduced to the five components to which they belong according to the theoretical model
of the Big Five. The dominant factor Extraversion attracts a lot of variance, especially in
the form of negative loadings of Natural reactions. Only a few facets have a higher primary
loading on another component.

All in all, the analysis shows a very recognizable and satisfactory picture.

Factor                    Code   Facet                        Loadings (RC1-RC5 columns, left to right)
Extraversion              E1     Warmth                       0.828
                          E2     Gregariousness               0.832
                          E3     Assertiveness                0.481   0.513
                          E4     Activity level               0.436   0.493
                          E5     Excitement seeking           0.574
                          E6     Positive emotions            0.69
Conscientiousness         C1     Sense of competence          0.801
                          C2     Orderliness                  0.578
                          C3     Sense of responsibility      0.529
                          C4     Achievement striving         0.727
                          C5     Self-discipline              0.784
                          C6     Deliberateness               0.432   -0.574
Agreeableness             A1     Trust in others              0.454   0.407
                          A2     Sincerity                    0.65
                          A3     Altruism                     0.793
                          A4     Compliance                   0.595
                          A5     Modesty                      0.556
                          A6     Sympathy                     0.706
Natural reactions         N1     Anxiety                      0.69
                          N2     Angry hostility              0.708
                          N3     Moodiness/Contentment        0.552
                          N4     Self-consciousness           -0.729
                          N5     Self-indulgence              0.503
                          N6     Sensitivity to stress        0.638
Openness to experience    O1     Imagination                  0.587
                          O2     Artistic interests           0.689
                          O3     Depth of emotions            0.701
                          O4     Willingness to experiment    0.543
                          O5     Intellectual curiosity       0.746
                          O6     Tolerance for diversity      0.595

CONCLUSION

The results of this study show that the Big Five Personality Test is a reliable and valid
instrument with a solid norm, suitable for Western world respondents with an average to
higher educational level and an age between 18 and 67 years, for self-analysis, in career
guidance, or in other professional settings (Thiel, 2021).

• The Big Five Personality Test scores well to very well on the reliability coefficients
commonly used in science.
• The Big Five Personality Test shows good construct validity of the measured constructs.
• The Big Five Personality Test has a good norm that shows no large differences between
groups.

(Note: Apologies that the tables, figures, graph, and interpretations from Group Differences
onward are copied directly from the following references; this student is not yet
knowledgeable enough to produce an originality report and is unsure whether it was done
properly.)

REFERENCES

Cattell, R. B., Cattell, A. K. S., & Cattell, H. E. P. (1994). 16 PF (5th ed). Institute for Personality
and Ability Testing, Inc., Champaign, IL

Cohen, J. (1992). “A Power Primer.” Psychological Bulletin 112 (1). American Psychological
Association: 155.

Conn, S. R. & Rieke, M. L. (in press-a). Characteristics of the norm sample. In S. R. Conn & M.
L. Rieke (Eds.), The 16PF Fifth Edition technical manual. Champaign, IL: Institute for
Personality and Ability Testing, Inc.

Conn, S. R. & Rieke, M. L. (Eds). (in press-f) The 16PF Fifth Edition technical manual.
Champaign, IL: Institute for Personality and Ability Testing, Inc.

Cronbach, L. J. & Shavelson, R. (2004). “My Current Thoughts on Coefficient Alpha and
Successor Procedures.” Educational and Psychological Measurement 64 (3): 391–418.

Meade, A. W. & Craig, S. B. (2012). “Identifying Careless Responses in Survey
Data.” Psychological Methods 17 (3). American Psychological Association: 437.

Rivera, H. (n.d.). A Critique of the 16 Personality Factors-Fifth Ed. Retrieved from
https://files.eric.ed.gov/fulltext/ED401304.pdf

Thiel, E. (2021). “Technical Documentation Personality Test”. 123Test. Retrieved from
https://www.123test.com/research/big-five-personality-test-webversion/

Wageindicator.org. (2009). “EurOccupations.”


2. Discuss your personal experience while and after taking the different tests and report
your impression of the results.

In our class, we were instructed to take different tests and to critique for ourselves how
accurate each test was based on our personalities and personal experiences. I think the
tests, taken as a whole, measured different aspects of my well-being quite accurately.
Still, I disagreed with some items in every test, which I could hardly regard as measures
of overt behavior.

Take the Social Intelligence Test, for example. The website disclosed that the test was
developed in Great Britain, and the images, which were taken from British magazines of the
1990s, were very difficult for me to comprehend, given that I was born in 2000. I am not a
native speaker of English, and some terms were hardly defined, so I think the test does not
work perfectly for someone like me who comes from a culture different from Britain's.
Perhaps the results of this test are useful when averaged across many people, but they can
be inaccurate for any one individual.

On the other hand, the rest of the tests seemed valid for the most part in my case, and
since validity goes together with reliability, their reliability appears to be quite
strong. However, this is simply an opinion derived from my personal interpretation of my
personality. It is just as likely that other people obtain high scores with which they
utterly disagree. Since we are now in the 21st century, the question of re-standardizing
each test, particularly those found on the Internet, also comes into consideration.

As for the cross-cultural differences discussed previously, it is uncertain whether the
questions generalize to all cultures and backgrounds, yet the questions I saw appeared well
structured, especially those from English-speaking countries. In addition, the screen
lighting level, technical difficulties, mood, fatigue, the environment in which I answered
the tests, and many other factors might have affected my scores. While no test is
absolutely perfect, we have to consider whether it shows relatively high reliability and
validity, which is what makes it good at measuring the individual.
