Yuhua (Jake) Liang (2015), “Responses to Negative Student Evaluations on RateMyProfessors.com: The Effect of Instructor Statement of Credibility on Student Lower-Level Cognitive Learning and State Motivation to Learn”, Communication Education, 64:4, pp. 455-471.

Instructor Statement of Credibility: In general, credibility refers to believability. However, instructional research has defined instructor credibility according to three dimensions: competence, caring, and trustworthiness (McCroskey & Teven, 1999).

Student Lower-Level Cognitive Learning: Cognitive learning generally refers to the extent
to which students retain information (Bloom, 1956). Bloom distinguishes higher
(intellectual abilities) and lower-level (recall) cognitive learning. However, the current
research is concerned only with how much students recall and remember after reading
content on RMP. 461

State Motivation to Learn: The second student outcome is student state motivation to learn.
This concept refers to students’ tendency toward learning and finding the content useful
(Brophy, 1987). Students who view instructor responses are likely to react by developing a
more favorable tendency toward learning the content.

Indeed, the manner in which RMP affects students extends beyond motivation and
enrollment to student cognitive learning (Edwards, Edwards, Shaver, & Oaks, 2009) 456

Why does RMP affect students? The dominant conceptual framework applied in previous
research involved Word-of-Mouth (WOM)… it is more salient and persuasive than
traditional advertising; this effect can primarily be attributed to users’ heightened trust of
fellow users over an official advertising source (Berger, 2013; King et al., 2014) 457

A large body of established research on message-sidedness demonstrated that two-sided refutational messages are more persuasive than one-sided messages, while two-sided nonrefutational messages are less persuasive than one-sided messages (for a meta-analysis, see Allen, 1991) 459

When instructors respond by establishing their credibility without explicitly refuting the students’ opinions, this juxtaposition between these two communication processes functionally results in a two-sided nonrefutational message on RMP. 459

In particular, these strategies focus on instructor credibility and are less likely to elicit
unfavorable or unforeseeable responses from other students and/or instructors. Further, this
type of response acts as an overall statement and does not necessitate addressing every new
student evaluation that arises on RMP. 459

The effect of user-generated content (comments) also applies to videos of public service
announcements (proprietor content) on YouTube.com. 460
Ultimately, source attributes likely affect the extent to which responses achieve the
intended neutralizing effects… one possibility is that students may not socially identify
with instructors and so disregard their responses. Instructor responses that do not directly
seek to counter student evaluations, however, create a two-sided nonrefutational dynamic
on RMP. 460

Method. Participants (N = 231) were undergraduate students at a small western American university.

Procedures. To start, all participants viewed an RMP webpage with negative student reviews
(Appendix A). The modified RMP page contained negative overall statistics (i.e., 1.7
overall quality, 2.0 helpfulness, 1.4 clarity, 3.2 easiness… The individual evaluations were
all negative… Then, participants were assigned to one of eight experimental conditions
described in the next section. After viewing the induction, participants viewed a brief video
lecture (4 min and 29 s) related to the healthcare industry from the ostensible instructor. To
control for variability introduced by instructor characteristics, all participants viewed the same
lecture video.

Video stimulus. The ostensible instructor was a Caucasian female who appeared to be
approximately middle-aged and lectured on the healthcare industry.

Experimental inductions. Participants were randomly assigned to one of eight experimental conditions, resulting in a 2 (Instructor Statement of Competence: Present/Absent) × 2 (Instructor Statement of Trustworthiness: Present/Absent) × 2 (Instructor Statement of Caring: Present/Absent) experimental design.
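
As a reading aid, here is a minimal sketch of how such a 2 × 2 × 2 between-subjects design could be analyzed; the data frame and its column names (competence, trustworthiness, caring, quiz_score) are assumptions for illustration, not the paper's materials or code.

```python
# Hypothetical sketch (not the paper's analysis code): three-way ANOVA for a
# 2 x 2 x 2 Present/Absent factorial design.
# Assumed columns: competence, trustworthiness, caring (each coded 0/1) and quiz_score.
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

def factorial_anova(df: pd.DataFrame) -> pd.DataFrame:
    """Return an ANOVA table with main effects and interactions of the three statements."""
    model = smf.ols(
        "quiz_score ~ C(competence) * C(trustworthiness) * C(caring)", data=df
    ).fit()
    return sm.stats.anova_lm(model, typ=2)  # Type II sums of squares
```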

Instructor response. The competence statement included a statement to establish official teaching expertise. The trustworthiness statement discussed teaching fairness and adherence to grading criteria, and the caring statement showed concern for student success and the instructor’s willingness to help students. Appendix B provides the full detail and text.

Measures. Five quiz questions assessed participants’ objective cognitive learning. The multiple-choice quiz contained questions regarding the content presented in the video that students watched… The state motivation measure assessed student motivation to learn from the instructor using a 12-item 7-point semantic differential scale.

Results. An independent samples t-test showed that the instructor response (Range: 1 to 5)
led to higher cognitive learning (M = 3.28, SD = 1.38) than the no response control (M =
2.57, SD = 1.43), t(229) = 2.52, p = .01, d = .33. This result was consistent with H1a.

However, the instructor response (Range: 1 to 7; M = 4.10, SD = 1.52) did not significantly
lead to increased student motivation to learn compared with the control condition (M =
3.59, SD = 1.35), t(221) = 1.63, p = .10, d = .22. Thus, the data were inconsistent with H1b.
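
Since only summary statistics are quoted, the reported t and d values can be sanity-checked from the means and standard deviations alone. The sketch below uses placeholder group sizes (the excerpt gives only N = 231 and df = 229, not the response/control split), so its output will approximate rather than reproduce the published values.

```python
# Sketch: independent-samples t and Cohen's d from summary statistics only.
# n_response and n_control are placeholders; the excerpt does not report the split.
from math import sqrt
from scipy.stats import ttest_ind_from_stats

def pooled_cohens_d(m1, s1, n1, m2, s2, n2):
    """Cohen's d using the pooled standard deviation."""
    s_pooled = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (m1 - m2) / s_pooled

n_response, n_control = 116, 115  # placeholder split of N = 231
t, p = ttest_ind_from_stats(3.28, 1.38, n_response, 2.57, 1.43, n_control, equal_var=True)
d = pooled_cohens_d(3.28, 1.38, n_response, 2.57, 1.43, n_control)
print(f"t = {t:.2f}, p = {p:.3f}, d = {d:.2f}")
```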

RQ1 explored the effect of various credibility dimensions (i.e., competence, trustworthiness, and caring) on lower-level cognitive learning.
Only statements of trustworthiness emerged as a significant factor… Specifically,
statements of trustworthiness led to higher quiz scores (M = 3.41, SD = 1.43) than no
statements of trustworthiness (M = 2.99, SD = 1.36).

An original experiment displayed four negative student reviews on RMP to participants (students) regarding a potential instructor at their university and varied the instructor response using statements of credibility. The statements aligned with the three credibility dimensions: competence, trustworthiness, and caring. The main findings revealed that the
presence of an instructor response led to increased lower-level cognitive learning in
students, hereafter referred to as cognitive learning. However, the instructor response did
not affect student state motivation to learn. When exploring the effect on cognitive learning
based on the three dimensions of instructor credibility, an interesting pattern emerged: only
statements of trustworthiness increased cognitive learning.

The supplementary findings show that any instructor response to establish credibility, regardless of dimension, resulted in a higher likelihood of students reporting that they would enroll in a future class covering the course content. These effects are additive, not multiplicative. For the greatest effect, instructors may aim to utilize all three credibility dimensions when attempting to ameliorate RMP’s effect on enrollment.
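
To make “additive, not multiplicative” concrete, a toy linear model (all coefficients are made-up numbers, not estimates from the paper): each credibility statement adds its own increment to the predicted enrollment likelihood, and no product (interaction) terms are involved.

```python
# Toy illustration with made-up coefficients: additive effects of the three
# credibility statements on predicted enrollment likelihood (no interaction terms).
def predicted_enrollment(competence: int, trustworthiness: int, caring: int) -> float:
    baseline = 2.0                              # hypothetical intercept
    b_comp, b_trust, b_care = 0.4, 0.5, 0.3     # hypothetical main effects
    return baseline + b_comp * competence + b_trust * trustworthiness + b_care * caring

# Combining all three statements stacks the three increments (0.4 + 0.5 + 0.3 = 1.2),
# which is why using every dimension gives the largest predicted effect.
print(predicted_enrollment(1, 1, 1))  # 3.2
```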

Ronald R. Mau & Rose A. Opengart (2012), “Comparing Ratings: In-class (paper) vs. Out of Class (online) Student Evaluations”, Higher Education Studies, Vol. 2, No. 3, ISSN 1925-4741, E-ISSN 1925-475X.

This study examines the difference in student evaluations in two contexts, online and paper-based, in a finance course taught to non-finance majors. The evidence strongly indicates that faculty receive higher evaluations from a paper-based instrument administered during class than from an online assessment instrument that students complete on their own time.

The data were collected for a total of 8 semesters, starting in the fall of 2007 when the online system was first used on campus, through the spring 2011 semester. The fall 2007 assessments used a 5-point Likert scale while all other semesters used a 4-point Likert scale. Because of the different scales, the fall 2007 data are not used in this study or reported. However, the results were similar to those reported here. 57

Paper evaluations were given after the online evaluations were available for students to complete and after the online evaluations had closed. Students were never informed that paper evaluations would be given in addition to the online evaluations; this was done so that knowledge of a forthcoming paper evaluation could not reduce participation during the online evaluation window. 58

… of the 8 online response rates were below 43 percent and half of these were below 34 percent.
These response rates are below generally accepted criteria for decision-making (Theall &
Franklin, 1991) and lead to further questions regarding the use of online evaluations when
evaluating faculty. 60
The savings from administering online evaluations potentially result in higher costs to individual faculty in terms of tenure and promotion decisions. This is potentially more of an issue if institutions and departments have not adjusted scales to define exceeding, meeting and not meeting expectations. Another observation from the results in Table 2 is that despite higher response rates for online student assessments in the fall of 2010 and spring of 2011 (70 and 74 percent respectively), paper assessments still resulted in statistically significantly higher overall mean assessment results. 61

The results indicate the ratings are higher for each of the 20 questions when the evaluations
were completed on paper. The higher ratings were statistically significant for 13 of the 20
questions. The largest difference between classroom paper and online evaluations was 0.29
(question 1). 61
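
A hedged sketch of the per-question comparison summarized above, assuming a long-format data frame with hypothetical columns question (1–20), mode ('paper' or 'online') and rating; it tallies significant differences and tracks the largest paper-versus-online mean gap.

```python
# Sketch: per-question paper vs. online comparison (column names are assumptions).
import pandas as pd
from scipy.stats import ttest_ind

def compare_by_question(df: pd.DataFrame, alpha: float = 0.05):
    """Count questions with significant paper/online differences and the largest mean gap."""
    n_significant, largest_gap = 0, 0.0
    for question, grp in df.groupby("question"):
        paper = grp.loc[grp["mode"] == "paper", "rating"]
        online = grp.loc[grp["mode"] == "online", "rating"]
        t_stat, p_value = ttest_ind(paper, online, equal_var=False)  # Welch's t-test
        if p_value < alpha:
            n_significant += 1
        largest_gap = max(largest_gap, abs(paper.mean() - online.mean()))
    return n_significant, largest_gap
```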

The individual question response percentages were aggregated by category as shown in Table 6 and Figures 7 through 12. Chi-squared test results are reported in Table 6 and provide additional evidence the paper responses are not the same as the online responses. The percentage responses result in fewer 3’s (agree) and more 4’s (strongly agree) when using paper evaluations. 62
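
A minimal sketch of the aggregated chi-squared comparison, using a placeholder contingency table of response counts (rating categories 1–4 by administration mode); the counts are invented for illustration and only the shape of the analysis mirrors the paper.

```python
# Sketch: chi-squared test of independence on a paper vs. online response-count table.
# The counts below are placeholders, not the study's data.
import numpy as np
from scipy.stats import chi2_contingency

# Rows: administration mode (paper, online); columns: rating 1, 2, 3, 4.
counts = np.array([
    [10, 25, 180, 260],   # paper  (placeholder counts)
    [15, 40, 230, 190],   # online (placeholder counts)
])
chi2, p, dof, expected = chi2_contingency(counts)
print(f"chi2 = {chi2:.1f}, dof = {dof}, p = {p:.4f}")
```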

Online evaluations have gained acceptance due to the ease and reported cost savings of an electronic/online assessment instrument. The use of online assessments will likely continue to grow and become more widespread. Faculty and administrators should be aware of issues related to how the scores are analyzed (means vs. counts) and, based on this study, of the lower assessment scores received by faculty online. Guidelines and standards may need to be adjusted for online evaluations as scores are significantly lower than paper responses. 66

Dennis E. Clayson, What does ratemyprofessors.com actually rate?, Assessment & Evaluation in Higher Education, 2014, Vol. 39, No. 6, 678–698, http://dx.doi.org/10.1080/02602938.2013.861384

This research looks closely at claims that ratemyprofessors.com creates a valid measure of
teaching effectiveness because student responses are consistent with a learning model.
While some evidence for this contention was found in three datasets taken from the site, the
majority of the evidence indicates that the instrument is biased by a halo effect, and
creates what most accurately could be called a ‘likeability’ scale.

Some rankings use information from online sources. Forbes, for example, publishes such a
list yearly, and 15% of their matrix for ratings of colleges is taken from
ratemyprofessors.com (Howard 2013).

In general, they report that neutral feelings about an instructor did not motivate students to
post. This tendency to base evaluations on how well an instructor is liked creates what has
been called a ‘likeability’ scale (Delucchi and Pelowski 2000).

Halo effects have been consistently found in SET (Clayson 1989; Feeley 2002; Orsini
1988; Spooren and Mortelmans 2006; Tang and Tang 1987), with some sources reporting
that the effect may account for as much as 69% of the total variance of the evaluations
(Shevlin et al. 2000).

Otto, Sanford and Ross’s findings were replicated with random variables with no
curvilinear characteristics, and their results disappeared entirely when the skewness of the
data was removed. Both findings would indicate that Otto, Sanford and Ross’s results,
which were the key to their findings, were statistical artefacts resulting from the study’s
methodology.

Hotness is highly significant in all samples. By looking at the R² values for each dependent
variable, easiness is found to be distinctively different from helpfulness and clarity in each
dataset.
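
A hedged sketch of the kind of comparison implied here, assuming each RMP scale is regressed on the hotness indicator (the column names helpfulness, clarity, easiness and hotness are assumptions; the actual model specification is not given in the excerpt) and the fits are compared by R².

```python
# Sketch: compare R^2 when each RMP scale is regressed on the hotness indicator.
# Column names are assumptions, not taken from the article's datasets.
import pandas as pd
import statsmodels.formula.api as smf

def r_squared_by_scale(df: pd.DataFrame) -> dict:
    """Fit 'scale ~ hotness' for each dependent variable and collect the R^2 values."""
    return {
        scale: smf.ols(f"{scale} ~ hotness", data=df).fit().rsquared
        for scale in ("helpfulness", "clarity", "easiness")
    }
```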

In other words, irrespective of whether an instructor was rated easy or hard, the instructor
got essentially the same rating on helpfulness. In the low helpfulness group, as helpfulness
ratings increase, so does easiness (r = 0.320, p < .001).
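
A small sketch of the subgroup correlation reported here; the cut-off defining the “low helpfulness” group and the column names are assumptions for illustration.

```python
# Sketch: Pearson correlation between helpfulness and easiness within the
# low-helpfulness subgroup (cut-off and column names are assumptions).
import pandas as pd
from scipy.stats import pearsonr

def low_helpfulness_correlation(df: pd.DataFrame, cutoff: float = 3.0):
    low = df[df["helpfulness"] < cutoff]
    r, p = pearsonr(low["helpfulness"], low["easiness"])
    return r, p
```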

The literature review suggests that RMP findings can be used as a proxy for SET
instruments, with a caveat that many students might not take the evaluation process
seriously, as manifested by student comments and the importance of the ‘hotness’
measure… The research literature indicates little relationship between SET findings and
objective measures of learning with modern students.

The simulation data from Test 1 also show that randomly created data produced the same
statistical associations found in Otto, Sanford and Ross, suggesting that the curvilinear
relationship between helpfulness and easiness as predicted by a learning hypothesis is a
statistical artefact of the data.

Test 6 indicates that there are halo effects in RMP data and that they are highly associated
with helpfulness, clarity and easiness.

To the degree to which the literature finding of a positive association between the average
RMP scales and institutional SET is valid, this would indicate that students will give higher
evaluations to instructors they judge as being easy.

There is also a suggestion in these findings that, if students like an instructor (for whatever
reason), then the easiness of the class becomes relatively irrelevant.

In general, the findings indicate that RMP data are an invalid measure of teaching
effectiveness, if effectiveness is tied to learning.

To say, however, that SET is invalid as a measure of effective teaching does not mean that
the instruments do not validly measure something else. The findings of this study are
compatible with the assertion made by a number of researchers (Clayson and Haley 1990;
Delucchi and Pelowski 2000; Marks 2000; Tang and Tang 1987) that what these
instruments most likely create is a ‘likability’ scale.
