
Selectively Anonymous Rankings: Design, Analysis and Impact on Computer Science Students


Simon Spacey
Computer Science Department
Waikato University
Hamilton, New Zealand
Email: sspacey@waikato.ac.nz

Abstract—This paper presents a method to deliver course
rankings that maintain student confidentiality while still allowing
students to selectively prove their position in a class to others if
they wish. The method’s selective anonymity is implemented
through a secure hashing algorithm that is designed to protect
student privacy even where a student’s name, their student ID,
their project teams and the student’s ranks for other work items
are known. The paper includes results showing student
perceptions of the approach and the impact on their performance
for a second year Computer Science course at the University of
Waikato in New Zealand.

Keywords—Student Privacy; Grade Rankings; Course Hashes

I. INTRODUCTION

ONCE upon a time, there was a set of students who
lived in a land without mobile devices, free internet
and social networking sites. Each semester the
students took part in a tournament organised by an ivory tower
and their names and results were released as rankings on a
large notice board in a public arena for all to see. The rankings
were respected as being a fair and transparent indicator of
achievement in the field and were far preferred to the approach
of the neighbouring kingdom where every competitor was
simply given a letter grade of A for “A”ttendance which the
people criticised as not rewarding individual effort and hiding
information from future patrons.

But then one day there came a magical wind of change that
connected all the students through mobile devices, free internet
and social networking sites. The change brought great
opportunity for the students but also great risks given the
traditional named ranking system. In particular some were
concerned that evil mages would misuse the technology to, for
example, anonymously cyberbully students both high and low
in the rankings and so parliamentary bills were proposed [1] to
change the laws of the land [2], [3], [4], [5] to govern the use of
the new technology and the ivory towers, whose councils were
explicitly charged with the well-being of their students [6],
introduced guidelines for dealing with the new technology [7]
and changed their approach to the tournaments to publish ranks
by student ID rather than by student name.

Unfortunately, the Student ID ranking approach cast a
cloud over the land with:

1) students no longer able to use the rankings to
demonstrate their position to others,
2) the ivory towers unable to award student prizes without
breaking the anonymity of the system and with
3) the ranking of small groups jeopardised by the peril of
name identification through grade correlation

and so a quest for a new ranking approach was championed by
the ivory towers.

This paper details the result of our quest to find a way to
provide student rankings that removes the disadvantages of the
Student ID ranking system itemised above while still providing
students with the protection of rank anonymity. We begin in
Section II by providing an overview of alternate methods that
deliver various levels of student ranking privacy and examining
their benefits and issues. In Section III we then provide
technical details of our approach and show how it addresses all
the issues inherent in the Student ID ranking system and in
Section IV we provide results from an initial case study of the
use of our approach by students on a core Computer Science
course at the University of Waikato.

II. BACKGROUND

In this section we review several approaches to delivering
student ranks and identify a set of characteristics that we can
use to contrast the different approaches as summarised in
Table I.

The traditional Student Name ranking system simply
provides a list of student names ordered by their position in the
class often with a column of actual mark values. The system
has the advantage of being perhaps the most straightforward
ranking approach to understand and use and allows students to
quickly identify their own position and prove their position in a
ranking to others. However Student Name rankings do not
provide any form of anonymity and have been largely replaced
by Student ID rankings at universities today.

The Student ID ranking system used as standard in New
Zealand Computer Science and Engineering departments today
can be seen as a simple change to the Student Name ranking
system wherein student names are replaced by their student
IDs. Student ID rankings have the inconvenience of an ID
lookup not seen in Student Name rankings but provide the
benefit of immediate anonymity which ensures third parties can

978-1-4799-7672-0/14/$31.00 ©2014 IEEE 08-10 December 2014, Wellington, New Zealand


2014 International Conference of Teaching, Assessment and Learning (TALE)
TABLE I. CHARACTERISTICS OF DIFFERENT STUDENT RANKING SYSTEMS. FROM LEFT TO RIGHT, THE COLUMNS DESCRIBE WHETHER THE SYSTEM: ALLOWS
STUDENTS TO IDENTIFY THEIR OWN RANK FROM INFORMATION THEY HAVE (KNOW POSITION), PROVE THEIR RANK TO THIRD PARTIES (PROVE POSITION), PROVIDE
ANONYMITY TO THIRD PARTIES WITHOUT OTHER INFORMATION (IMMEDIATE ANYM.), ALLOWS POSITIONS TO BE SELECTIVELY CONFIRMED BY THE RANKING
AUTHORITY WITHOUT DISCLOSING STUDENT RANKS ON OTHER WORK (INDIRECT ANYM.), PREVENTS GROUP INFORMATION BEING USED TO CORRELATE IDENTIFIERS
(CORRELATION ANYM.) AND WHETHER THE SYSTEM ALLOWS STUDENTS TO DENY A RANKING IF A POTENTIAL MATCHING IS DEMONSTRATED BY A THIRD PARTY
(PLAUSIBLE DENI.).

Approach Know Position Prove Position Immediate Anym. Indirect Anym. Correlation Anym. Plausible Deni.
Student Name      
Student ID      
No Rankings      
Unique ID      
Course Hashes      

not identify a student from the ranking without further
information. Unfortunately, however, immediate anonymity is
permanently removed from future rankings if a student’s
position in one ranking is disclosed to others for example
through a class prize for top students. Further, Student ID
rankings are vulnerable to correlation analysis attacks wherein
project team members can identify the IDs of other members in
group results and, like Student Name rankings, Student ID
rankings suffer from a lack of plausible deniability once a
student’s ID is known.

Of course, the most secure ranking system is one that does
not exist or is kept entirely private. Unfortunately, while being
secure, the No Ranking approach does not allow students to
know their own position in a class and has the potential
drawback of preventing students from being able to prove their
position to third parties which may be required to join some
companies [8] given the risk that letter grade assignments may
not be representative of ability and are almost certainly not
comparable between course variants, institutions and even
years for the same course at the same university.

A secure alternative to not providing public rankings is to
provide rankings with a unique ID for each course work item
for each student. This approach has the advantage of allowing
the student to know their own position in the class, provides
immediate anonymity, correlation anonymity and plausible
deniability and also provides indirect anonymity which allows
prizes to be given for a work item without disclosing the
unique ID mapping for other work items to third parties in the
case where the unique IDs are randomly assigned. However the
approach has the disadvantage of requiring students to manage
multiple IDs and of not providing any way for a student to
prove their position in a ranking without the university
providing explicit confirmation of a mapping.

The ranking approach we present in this work is called
Course Hashes. As Table I shows, Course Hashes provide a
means to disclose rankings to students in a way that is just as
private as using unique IDs but has the advantage of allowing
students to demonstrate to a third party that they have a
particular ranking position for a work item at any point in the
future without the need for a long lived central authority to
store the mappings for years to come. Unlike the Unique ID
system, Course Hashes have the advantage of not necessarily
requiring students to manage multiple keys and have other
benefits such as providing plausible deniability even after a
student discloses a mapping to “prove” they have a particular
position in a ranking as you will see in the following sections.

III. METHOD

In this section we discuss our approach to generating
selectively anonymous rankings. We start with our abstract
Course Hash algorithm in Section III-A before describing and
analysing our specific algorithmic implementation choice in
Sections III-B and III-C. The system architecture used to
deliver our algorithmic choice is described later in Section IV-A
where we detail our implementation case study.

A. Course Hashes

We define a Course Hash to be a function that maps course
and student information to unique IDs for use in rankings. In
contrast to the Unique ID system, access to the Course Hash
function is provided publicly so students can enter their
information to prove their position in a class to others thus
removing the main issue seen in the Unique ID approach as
illustrated in Table I.

The student information used in the Course Hash needs to
uniquely identify a student but also to provide direct
anonymity. To achieve this we can use student IDs and a
private key making our general Course Hash function:

courseHash(courseID, workItemID,
studentID, privateKey) (1)

The keys need to be provided on a per student basis to prevent
enumeration attacks given the low search space of student IDs
in most university classes. A different student key can be
supplied for each work item for each student to act as work
item proof certificates in the hash mapping or the same student
key can be used by a student across multiple work items where
a trusted Course Hash Generation Form such as the one
presented in Fig. 1 is provided as in our practical
implementation which we discuss further in Section IV-A.

An example of a Course Hash function would be to use the
variable block size Spacey Cypher [9] acting as:

spaceyCypherbk(courseID.workItemID.studentID,
privateKey) (2)

to encrypt a block consisting of a concatenation of the course,
work item and the student IDs (padded to the block size b) with
the private student key. The output would be a block of b-bits
in length that could be represented as a textual Course Hash ID
using an encoding scheme such as base64 [10].

The Course Hash ID generated by equation 2 would not be
guaranteed to be unique for each student ID because of the
different keys used however the probability of clashes can be
controlled through the choice of block size in the Spacey
Cypher which is not directly possible with fixed block sized
cyphers [11]. Additionally, the fact that any student ID could
map to any Course Hash ID given a suitable key means that the
approach provides plausible deniability even if a key is
provided that maps a student to a Course Hash ID which we
will discuss further after we detail our specific algorithm
choice next.

B. Specific Algorithm

Unfortunately, variable block size cyphers such as [9] are
not publicly available and with their unbounded key sizes
export may be controlled in some jurisdictions [12].
Consequently for our implementation we selected the Secure
Hash Algorithm (SHA1) [13] for our Course Hash function,
which is a 160-bit cryptographic hash function [14] that is
readily available with standard implementations for most
operating systems.

There are two problems using SHA1 instead of the Spacey
Cypher as a Course Hash function. They are that:

1) SHA1 does not take a key and
2) SHA1’s output is 160-bits which would lead to forty
hexadecimal or twenty seven base64 character rank IDs.

We can deal with the first issue by composing the key with the
course, work item and student IDs in the hash block to make a
key dependent output. There are a variety of standard methods
to compose the key data into a hash function’s message block
including the keyed-Hash Message Authentication Code
(HMAC) [15] format, however as our IDs and key can be
restricted to a fixed length subset of the available characters we
use a simple concatenation of the form:

courseID.workItemID:studentID.privateKey

in this work with the IDs, “.”, “:” and the key represented in
standard ASCII which results in a fixed input message length
of 30 characters (240 bits) for our implementation.

To address the issue of the hash output length we can take
advantage of the characteristics of cryptographic hash functions
[14] which make any subset of SHA1’s output bits just as
suitable as any other subset when constructing a smaller sized
hash. Given this, we chose to use the first 72 bits of the SHA1
result for our implementation making our specific Course Hash
function:

b64(SHA1(courseID.workItemID:
studentID.privateKey)[0..71]) (3)

where [0..71] indicates a bit range and the function b64() is the
standard base64 encoding [10] which generates twelve base64
character rank IDs for 72 bits of input.

C. Analysis of the Specific Algorithm

Our Course Hash function of equation 3 has 72 bits of
output variability thus inputs with any more than 72 bits of
variability must produce output clashes. With this in mind, our
private keys need not have an entropy H [16] of more than:

72 − H(studentID) (4)

bits to obtain a complete coverage of the 72 bit Course Hash ID
space for a set of students given that the course and work item
IDs of equation 3 are constants for a ranking. Thus for a class
of say 128 students (7 bits of student ID entropy), a private key
of only 65 bits would be sufficient to cover the entire Course
Hash ID space (assuming we have a perfect hash function
[17]).

In our implementation which we detail in Section IV we
use a private key space of 72 bits rather than the minimum
required key space for a set of students. This approach allows a
student ID to map to any Course Hash ID given an
appropriately selected key and so provides for full plausible
deniability (again assuming we have a perfect hash function).
The downside of allowing plausible deniability is that we have
the possibility for clashes in the output where say student A
with key X matches to the same Course Hash ID as student B
with key Y. However due to the Birthday Problem [17], [18]
the probability of a clash is only 1 in around 60 bits or 1 in:

580,999,813,345,182,728

for a class of 128 students given our 72 bit Course Hash IDs
which increases to 1 in around 58 bits (1 in
144,680,345,676,153,346) for a class of 256 students meaning
the chance that we need to generate a new key set because of
Course Hash ID clashes can be considered negligibly small for
practical class sizes.

The fact that a student ID could be mapped to a different
Course Hash ID given an appropriate key introduces a
challenge for students namely to find a private key that they
can use with their student ID to “prove” to others that they
have a better ranking than they actually do. The chance of
finding a better Course Hash ID increases as we go lower in the
rankings as there are more higher Course Hash IDs to match to.
For example, in a class of 128 students, the chance of the
bottom student finding a key that maps them to a better Course
Hash ID is 127 in 72 bits which means an average of around
1.4 times 2^64 full SHA1 computations would be required to find
a higher Course Hash ID.¹ Performing this number of
computations within a reasonable time would be a challenging
High-Performance Computing [19], [20], [21] project for most
Computer Science or Engineering students.

¹ The average computations would be no more than 0.6% less than the figure
quoted (i.e. still well above 2^64) for an implementation that avoids repeating
the random keys it tries. To sketch the proof of the 0.6% figure, the
probability of not finding a matching Course Hash after T non-repeating key
tries is given by (N−n)!(N−T)! / (N!(N−n−T)!) where N is the search space and n is
the number of used Course Hash IDs above the student which are 2^72 and 127
in our example. Thus the non-repeating probability of continuing is bounded
between ((N−n−T)/(N−T))^T and ((N−n)/N)^T at step T which allows us to
lower bound the non-repeating tries required to obtain a probability of 0.5
using high scale logarithms with bc -l (to avoid the issue of calculating large
factorials).
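As an illustration, equation 3 can be sketched in Python using only the standard library. The function name `course_hash` and the sample student ID and key are hypothetical; the concatenation format, the use of SHA1 and the truncation to 72 bits follow the description in Section III-B.

```python
import base64
import hashlib


def course_hash(course_id: str, work_item_id: str, student_id: str, private_key: str) -> str:
    """Return a 12-character Course Hash ID in the style of equation 3."""
    # Build the message courseID.workItemID:studentID.privateKey in ASCII.
    message = f"{course_id}.{work_item_id}:{student_id}.{private_key}".encode("ascii")
    digest = hashlib.sha1(message).digest()  # 20 bytes, i.e. 160 bits
    # Keep the first 72 bits (9 bytes) and encode them as 12 base64 characters.
    return base64.b64encode(digest[:9]).decode("ascii")


# Hypothetical 7-character student ID and 12-character key (30-character message).
print(course_hash("COMP200", "L", "1234567", "AAAAAAAAAAAA"))
```

Because 9 bytes is a multiple of 3, the base64 output carries no padding and is always exactly twelve characters long.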

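The figures quoted in Section III-C can be re-derived with a few lines of Python. This is a sketch under stated assumptions: the function names are hypothetical, the clash odds use the usual birthday approximation of one chance per pair of students, and the brute-force figure is taken to be the number of tries (with repeated keys allowed) needed for a 0.5 probability of success, which is how the footnote appears to define it.

```python
import math

HASH_BITS = 72
N = 2 ** HASH_BITS  # size of the Course Hash ID space


def min_key_bits(students: int) -> float:
    """Key entropy needed to cover the ID space, as in equation 4."""
    return HASH_BITS - math.log2(students)


def clash_odds(students: int) -> int:
    """Approximate X in '1 in X' odds of any clash (birthday approximation)."""
    pairs = students * (students - 1) // 2
    return N // pairs


def tries_for_even_chance(better_ids: int) -> float:
    """Random key tries for a 0.5 chance of hitting one of `better_ids` hashes."""
    p = better_ids / N
    return math.log(0.5) / math.log1p(-p)


print(min_key_bits(128))                       # 65.0
print(clash_odds(128))                         # 580999813345182728
print(clash_odds(256))                         # 144680345676153346
print(tries_for_even_chance(127) / 2 ** 64)    # about 1.4
```

The integer arithmetic reproduces the two clash figures exactly, and the last line gives roughly 1.4 times 2^64 tries for the bottom student of a class of 128.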
IV. RESULTS

To evaluate Course Hashes we used the approach to
provide rankings for a compulsory second year Computer
Systems course at the University of Waikato with course ID
COMP200. The course had 115 enrolled students of which 104
were male and 11 female and 99 were either Citizens or
Permanent Residents of New Zealand and 16 were
international students. In this section we begin by detailing the
system architecture we provided students to generate their
Course Hash IDs before providing anecdotal and survey results
to confirm the student opinion of our efforts in Section IV-B
and examining the impact the approach had on student
performance in Section IV-C.

A. System Architecture

For our case study we provided students with a key to
generate unique Course Hash IDs for each of the ranked work
items of the course which were a test, labs, an exam and the
overall results. We decided to use the same student keys across
the four work items to help make it clear to students the
difference between Course Hashes and Unique IDs and to
reduce key management issues in this first case study and we
present the students’ opinion of this implementation decision
through user survey results in Section IV-B.

To ensure the private student keys were randomly selected
we generated our 72-bit keys using the OSX secure random
number entropy pool with the command line:

    base64 -i /dev/random -b12 | \
    perl -e 'open(ID, "<./ids.txt"); \
    while(<ID>) { \
    chomp; $key=<>; \
    print "$_, $key"; \
    }' > keys.csv

where the input file ./ids.txt held the set of student IDs for the
class.

The private keys were distributed to the students using
Moodle [22] by creating a new extra credit grade item called
“Private Student Key” to store the keys as text results that were
only visible to the intended students. To get around limitations
in the Text Grade Type of Moodle 2.5 [23] we had to create a
Custom Scale for our Private Student Key grade item and load
all the keys into that scale (along with a first invalid key of
“------------” to prevent the first text key being disclosed with the
Moodle Scale Range Display feature) to specify the set of
possible text grades/keys that could be assigned to students in
this class. Following the configuration of the new grade item,
we simply uploaded the keys.csv file to initialise the new
Private Student Key grade item using the standard Moodle
CSV grade import process.

To allow us to generate rankings by Course Hash ID we
needed to generate the Course Hash ID for each work item for
each student. We did that automatically using a command line
like the one below on OSX:

    cat keys.csv | \
    perl -e 'while(($id, $key)=(<> =~ \
    /^([^, ]+), ([^, \n]+)$/)) {\
    @hash=`echo \
    "COMP200.L:$id.$key\\c" | \
    openssl sha1 -binary | \
    base64 -b12`; \
    print "$id, $key, $hash[0]"; \
    }' > course_hash_ids.csv

with the input file ./keys.csv the student ID to private key
mapping file generated by the previous code and the
“COMP200.L” text on the fifth line replaced by the courseID
and workItemID we needed the Course Hash IDs for. We then
simply used the Course Hash IDs from the third column of the
output in place of student IDs in our public ranking reports.

In our case study, there were no Course Hash ID clashes for
any of the four rankings as one would expect given the low
probability of collisions discussed in Section III-C. However, if
a clash were to be seen, it would be a simple matter to generate
new private keys and Course Hash IDs using the scripts above
before distributing the keys to students for a course.

To allow students to identify their position in the rankings
they needed to know their Course Hash IDs for each ranked
work item. Rather than publishing these to students privately
on Moodle we provided the public Course Hash ID Generation
Form shown in Fig. 1. This approach allowed the students to
use a single form to find their own Course Hash ID and to
prove they possessed a key that maps their student ID to a
specific Course Hash ID when demonstrating their position in
the rankings to third parties.

Fig. 1. The Course Hash ID Generation Form for the COMP200 course at
Waikato University [24]. Pressing the “Generate” button after entering a
student ID and key returns the Course Hash ID for the selected work item.
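For readers without the OSX `base64 -b12` option, the key generation step of Section IV-A can be approximated in Python. This is a sketch rather than the script used in the case study: it assumes the operating system's secure random source as exposed by the standard `secrets` module, and the function names and sample student IDs are hypothetical.

```python
import base64
import secrets


def generate_key() -> str:
    """Return a random 72-bit private key as 12 base64 characters."""
    return base64.b64encode(secrets.token_bytes(9)).decode("ascii")


def generate_keys(student_ids):
    """Map each student ID to its own freshly generated private key."""
    return {student_id: generate_key() for student_id in student_ids}


# Hypothetical student IDs; each output line mirrors the "id, key" layout of keys.csv.
for student_id, key in generate_keys(["1000001", "1000002"]).items():
    print(f"{student_id}, {key}")
```

Nine random bytes give exactly 72 bits of key material and encode to twelve base64 characters without padding, matching the key length described for the case study.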

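The batch generation of Course Hash IDs and the clash check mentioned in Section IV-A could be sketched as follows. The function names, student IDs and keys are hypothetical, and the hash follows the concatenation and 72-bit truncation of equation 3; if `clashing_ids` returned a non-empty set, new keys would be generated before distribution, as the text suggests.

```python
import base64
import hashlib
from collections import Counter


def course_hash_ids(course_id, work_item_id, keys):
    """Return {student_id: Course Hash ID} for a {student_id: private_key} mapping."""
    ids = {}
    for student_id, key in keys.items():
        message = f"{course_id}.{work_item_id}:{student_id}.{key}".encode("ascii")
        digest = hashlib.sha1(message).digest()
        # First 72 bits of the digest, encoded as 12 base64 characters.
        ids[student_id] = base64.b64encode(digest[:9]).decode("ascii")
    return ids


def clashing_ids(ids):
    """Return the Course Hash IDs that were assigned to more than one student."""
    counts = Counter(ids.values())
    return {hash_id for hash_id, count in counts.items() if count > 1}


# Hypothetical keys; a real run would load the ID to key mapping from keys.csv.
keys = {"1000001": "AAAAAAAAAAAA", "1000002": "BBBBBBBBBBBB"}
ids = course_hash_ids("COMP200", "L", keys)
print(ids, clashing_ids(ids))
```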
[Figure 2: diagram. The Student uses the Course Web Page, which sends courseID.workItemID:studentID.privateKey to the Hash Calculation Service and receives b64(SHA1(data)) in response.]
Fig. 2. Illustration of our Course Hash implementation architecture using the SaSe Secure Hashing Service [25] to perform hash calculations. The response from
the SaSe service is a base64 encoded SHA1 of the passed data string from which the first 12 characters are shown as the 72-bit Course Hash ID by the course
web page JavaScript code.
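The truncation step in Fig. 2 relies on the first twelve base64 characters of the full response encoding exactly the first 72 bits of the digest. A short Python check illustrates this; `full_b64_sha1` is a local stand-in for the hashing service response and not its actual API, and the sample data string is hypothetical.

```python
import base64
import hashlib


def full_b64_sha1(data: str) -> str:
    """Local stand-in for the service response: base64 of the full SHA1 digest."""
    digest = hashlib.sha1(data.encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def displayed_course_hash_id(service_response: str) -> str:
    """Keep the first 12 base64 characters, i.e. the first 72 bits of the digest."""
    return service_response[:12]


response = full_b64_sha1("COMP200.L:1234567.AAAAAAAAAAAA")  # hypothetical ID and key
print(len(response), displayed_course_hash_id(response))
```

The full response is 28 characters long (27 significant characters plus one padding character), and because 72 bits is a whole number of 6-bit base64 symbols the truncated string equals the base64 encoding of the first 9 digest bytes.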

The form was made accessible on a trusted website for the
course hosted on the university web servers [24] and
implemented our Course Hash algorithm by simply calling the
SaSe Secure Hashing Service [25] over a standard REST-JSON
[26] [27] client connection as illustrated in Fig. 2. Note
that as we only use student IDs (which are traditionally
published and thus considered non-private in Student ID
rankings), our generated keys and work item codes in our
Course Hash algorithm, we are not sending any personal
information such as e-mail addresses or names to the hashing
service. Having said that, it should be noted that the case study
architecture is not immune to Pharming attacks and so trusted
networks or clients secured with techniques such as [28] are
assumed when verifying a student Course Hash ID mapping
and in any case, the form of Fig. 1 makes students and third
parties aware that official verification is still required in order
for ranking positions to be relied on.

B. Student Reception

To this point we have considered laws and bills currently
before parliament [1], [2], [3], [4], [5], [6], hybrid cyphers [9],
cryptographic hashing algorithms [13], the Birthday problem
[17], [18], High-Performance Computing [19], [20], [21] and
provided formal probabilistic bounds on clashes with repeating
and non-repeating key selections over large search spaces.
While this level of complication was necessary to motivate the
need for Course Hashes and analyse the security of our
algorithmic choice in a robust manner, it should be recognised
that users only need to be able to enter their student ID and key
in the form of Fig. 1 to be able to actually use Course Hashes.
However, this “black box” usage simplicity does not mean that
our case study implementation can be considered user friendly
in requiring, as it does, private key lookups in Moodle, typing
of keys and student IDs into a public hash generation web page
and then a final grade lookup in a separate ranking PDF
document. Despite this lack of integration, the students adopted
the implementation without significant issue and were mostly
enthusiastic about the new ranking approach as we demonstrate
with anecdotal and survey evidence in the remainder of this
section.

We introduced the idea of Course Hashes to students on the
COMP200 course through a Moodle notice explaining that
rankings could not be given for the course by student ID if we
were also going to give prizes to the top students and so we
were introducing an alternative ranking system. Technical
details of the Course Hash approach were not given and we did
not explain the alternate approaches considered like Unique
IDs in the e-mail. We followed the initial notice up after
finalising our Course Hash implementation with a notice
explaining how students could get their private keys from the
pseudo grade item in Moodle and provided a link to the Course
Hash Generation Form of Fig. 1 which has some motivation
and usage information above the entry fields.

To assess the ease with which students grasped the concept
of Course Hashes we decided to only release the grades of our
first returned assessment item (a mid-semester test) through a
Course Hash ranking and held off releasing the scores through
the Moodle standard grade system. To recap the composition of
the case study, there were 115 students enrolled on this core
second year course of varying levels of prior computer
knowledge and ability but despite this, there was only one
request for support after we released the first Course Hash
rankings on Moodle. In that request two local students
questioned how to get their Course Hash ID from the form and
one of those students questioned why they had to use Course
Hashes in the first place. Before the support staff had a chance
to respond, both local and international students replied to the
post explaining that the key required for the hash generation
form was the Private Student Key from Moodle rather than say
the student’s Linux password and detailed the problem with the
traditional Student ID ranking system explaining how they
understood and appreciated the anonymity afforded to them by
the new Course Hash approach. Given the student responses
staff did not need to respond to the support post and Course
Hashes were accepted by all as the primary form of grade
distribution for the course from that point on.

Following the course we provided the students with an
opportunity to provide feedback on their Course Hash
experience. We created a survey adhering to Total Survey

TABLE II. SENTIMENT CLASSIFICATION RESULTS FROM THE COURSE HASH USER SURVEY. THE SURVEY ASKED STUDENTS TO RANK THEIR AGREEMENT WITH
THE STATEMENTS BELOW USING THE STANDARD SURVEYMONKEY SENTIMENT LABELS OF: STRONGLY DISAGREE (SD), DISAGREE (D), NEITHER DISAGREE NOR
AGREE (NAND), AGREE (A) AND STRONGLY AGREE (SA). THE MODE COLUMN GIVES THE MOST COMMON RESPONSE TO EACH QUESTION WITH A “-” USED TO
INDICATE THE NEITHER DISAGREE NOR AGREE (NAND) RESPONSE. THE FIGURES IN BRACKETS ARE THOSE FOR STUDENTS WHO ENTERED A VALID FEEDBACK ID
AND THE FIGURES OUTSIDE BRACKETS ARE FOR ALL RESPONSES.

Survey Question                                   Mode    SD      D       NAND    A       SA
Valued the Attempt to Protect Privacy             Agree   2 (1)   1 (1)   4 (3)   11 (7)  8 (4)
Interested in Additional Technical Information    -       2 (1)   7 (4)   8 (5)   6 (4)   3 (2)
Comfortable Using the Course Hash Generator       Agree   5 (2)   6 (2)   2 (1)   9 (8)   4 (3)
Would Prefer Per Work Item Keys                   -       7 (3)   7 (3)   8 (7)   4 (3)   0 (0)

[Figure 3: four pie charts titled Valued, More Information, Usable and Work Item Keys, with segments labelled SD, D, NAND, A and SA.]
Fig. 3. Pie charts corresponding to the total survey result figures of Table II for the questions (from left to right) of whether the students: valued the attempt to
protect their ranking privacy, were interested in additional technical details, were comfortable using the Course Hash Generation Form and whether they would
prefer individual work item keys to a single key. The segment labels correspond to the SurveyMonkey classifications of: Strongly Disagree (SD), Disagree (D),
Neither Disagree Nor Agree (NAND), Agree (A) and Strongly Agree (SA).
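The counts of Table II can be condensed into one score per question by giving the five sentiment labels linear integer weights from -2 to 2. The Python sketch below uses the all-response counts from Table II; the dictionary and function names are hypothetical.

```python
# Linear weights: negative for disagreement, positive for agreement.
WEIGHTS = {"SD": -2, "D": -1, "NAND": 0, "A": 1, "SA": 2}

# All-response counts copied from Table II.
TABLE_II = {
    "Valued the Attempt to Protect Privacy": {"SD": 2, "D": 1, "NAND": 4, "A": 11, "SA": 8},
    "Interested in Additional Technical Information": {"SD": 2, "D": 7, "NAND": 8, "A": 6, "SA": 3},
    "Comfortable Using the Course Hash Generator": {"SD": 5, "D": 6, "NAND": 2, "A": 9, "SA": 4},
    "Would Prefer Per Work Item Keys": {"SD": 7, "D": 7, "NAND": 8, "A": 4, "SA": 0},
}


def weighted_average(counts):
    """Average sentiment weight over all responses to one question."""
    total = sum(counts.values())
    return sum(WEIGHTS[label] * n for label, n in counts.items()) / total


for question, counts in TABLE_II.items():
    print(f"{question}: {weighted_average(counts):+.2f}")
```

Each question has 26 responses, and the weighted averages come out at +0.85, +0.04, +0.04 and -0.65 respectively.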

Quality (TSQ) design principles [29] to try to assess the level
to which the students:

1) valued our attempt to protect their ranking privacy,
2) would be interested in more details on the approach,
3) were comfortable using the ID Generation Form and
4) would prefer per work item private keys instead of the
single Private Student Key provided.

as well as to collect general feedback on the good and bad
points of the students’ initial Course Hash experience and any
specific suggestions for improving the implementation and the
assessment questionnaire itself (to identify any TSQ usability
issues). To satisfy the timeliness, accessibility, collection
completeness and accuracy dimensions of the TSQ
requirements and to reduce measurement, item non-response
and processing errors [29] we used SurveyMonkey [30] with
ranged radio button and free text entry boxes to collect and
collate survey responses. To satisfy the credibility and
comparability dimensions of TSQ we provided an optional
additional text field on the survey form for students to enter a
“Feedback” Course Hash ID which they could generate using
the same Private Student Key and form they used to generate
the Course Hash IDs for the standard work items of the
COMP200 course [24]. The additional Feedback ID allowed us
to identify confirmed unique survey responses from actual
COMP200 course students without having to provide student
personal e-mail addresses or connect student IDs with student
views on the externally administrated survey website.

The survey invitation e-mail was constructed according to
the Web Survey Invitation Design principles of [31] and
e-mailed to the students with two reminders sent over a 1 week
period. No prize or inducements were given to complete the
survey and in total 26 students responded (22.6%) which is
consistent with the student response rates seen in [31].

Table II and Figure 3 summarise the survey results. It can
be seen that the responses of the group of students who elected
to enter their Feedback ID (whose figures are quoted separately
in brackets) are not inconsistent with the responses of the other
students in the table who just answered the survey quickly.

The survey used labelled sentiment fields to collect
responses, which makes creating an average view difficult
without a set of weights that arguably would need to be
calibrated on an individual respondent basis. However, taking a
cue from SurveyMonkey where the five sentiments are given a
linear weighting scale, if we weight the sentiments with the
integers from -2 to 2 with negative weights for disagreement
and positive weights for agreement, we can summarise the
views of Table II as that the students as a group were quite
positive about our attempt to protect their ranking positions
from discovery by other students and people outside the class
(+0.85 average), had mixed views about the need for additional
information and on the ease of use of the implementation
(+0.04 average for both) and did not think additional work item
keys would be of any benefit given the current implementation
(-0.65 average). This interpretation of the average student view

TABLE III. CHANGE IN GRADES AFTER INTRODUCING COURSE HASHES HALF WAY THROUGH THE LABS AND AFTER THE MID-SEMESTER TEST OF THE COURSE.
THE STUDENT ID RANKING COLUMNS REFER TO THE PREVIOUS YEAR WHERE ONLY STUDENT ID RANKINGS WERE USED AND ARE PROVIDED FOR COMPARISON.
THE FIGURES OUTSIDE BRACKETS SHOW THE CHANGE IN THE AVERAGE GRADE AND THE FIGURES IN BRACKETS SHOW THE CHANGE IN GRADE STANDARD DEVIATION OF THE WORK
ITEMS COMPARED IN ROWS. THE P-VALUE COLUMN GIVES THE PAIRED TWO TAILED STUDENT T-TEST FOR THE UNDERLYING RESULT DATA WITH THE NULL
HYPOTHESIS BEING THAT THE GRADES FOR THE RELATED WORK ITEMS ARE FROM THE SAME DISTRIBUTION. THE PAIRED SAMPLES CONSISTED OF 108 STUDENTS
FOR THE STUDENT ID RANKING COLUMNS AND 93 STUDENTS FOR THE COURSE HASH RANKING COLUMNS AFTER PRUNING OF INCOMPLETE SAMPLES
CORRESPONDING TO STUDENTS THAT DID NOT SUBMIT ONE OR MORE WORK ITEMS. SIMILAR FIGURES WERE SEEN FOR MALES, FEMALES, LOCAL AND
INTERNATIONAL STUDENTS.

Student ID Rankings Course Hash Rankings


         p-value          p-value
Final Exam / Mid-Semester Test 0.91 (1.24) 3% 1.43 (1.17) 0%

Second Half / First of Labs 1.01 (1.04) 73% 0.94 (1.18) 9%
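
As a minimal sketch of how figures of the kind reported in Table III could be computed from paired grade data, the following Python derives a change in average grade, a change in grade standard deviation and a paired two-tailed Student t-test p-value. It assumes the change figures are ratios of the later work item's statistic to the earlier one's (an interpretation suggested by the reference to variance ratios in Section V, not stated in the table), the function names are hypothetical, and the grades in the usage lines are made-up illustrative values rather than course data.

```python
import math
from statistics import mean, stdev


def change_figures(first, second):
    """Return (mean ratio, standard deviation ratio) of second over first.

    Assumption: "change" is expressed as a ratio; the paper does not spell this out.
    """
    return mean(second) / mean(first), stdev(second) / stdev(first)


def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def paired_t_test(first, second, steps=2000):
    """Return (t statistic, two-tailed p-value) for paired samples.

    Students missing either work item must be pruned before calling.
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need at least two complete pairs")
    if steps < 2 or steps % 2:
        raise ValueError("steps must be a positive even number")
    diffs = [b - a for a, b in zip(first, second)]
    spread = stdev(diffs)
    if spread == 0:
        raise ValueError("all paired differences are identical")
    df = len(diffs) - 1
    t = mean(diffs) / (spread / math.sqrt(len(diffs)))
    # Integrate the density over [0, |t|] with Simpson's rule.
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3
    # Two-tailed p-value: probability mass outside [-|t|, |t|].
    return t, max(0.0, 1.0 - 2.0 * central_area)


# Made-up grades for five students (not data from the course).
test_grades = [50.0, 60.0, 70.0, 65.0, 55.0]
exam_grades = [58.0, 63.0, 77.0, 70.0, 54.0]
mean_ratio, sd_ratio = change_figures(test_grades, exam_grades)
t_stat, p_value = paired_t_test(test_grades, exam_grades)
print(f"{mean_ratio:.2f} ({sd_ratio:.2f}) p={p_value:.0%}")
```

For sample sizes like those in Table III (108 and 93 paired samples) a statistics library's paired t-test routine would normally be preferred; the hand-rolled integration is included only to keep the sketch self-contained.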

was supported by the students’ textual feedback comments, where most students confirmed they appreciated the attempt to protect their ranking privacy but some explained that they wished the implementation provided tighter integration between their private keys, the Course Hash Generator Form and the ranking lists. It remains for future work to consider user experience enhancements as discussed in Section V.

C. Impact on Student Performance

In an attempt to understand the impact of Course Hashes on student performance we had to first obtain a baseline for the students this year with this lecturer, support staff and course material. To enable this we elected to introduce the idea of Course Hashes after the mid-semester test for the course (before the test results were returned) and halfway through the labs. This allowed us to compare the student grade distributions for the mid-semester test with those of the final exam and to identify any changes in lab performance after the introduction of Course Hashes.

Table III shows the change in grades between the assessment items of the first and second half of the Computer Systems course for a year when just Student ID rankings were used and the year where Course Hashes were introduced for the second half. The table shows that the exam grades for students increased significantly over the mid-semester test grades following the introduction of Course Hashes, a reversal of the position in the Student ID-only ranking year, where the exam-to-test grade distribution was also wider. In contrast to the exam grades, the grades for the labs decreased slightly after the introduction of Course Hashes, although not statistically significantly so with a p-value of 9%. If we were prepared to accept the drop in lab performance as significant, the decrease could be attributed to students deciding to work more independently on the second half of the labs after Course Hashes were introduced, given the additional rank protection Course Hashes provide for students to learn and make mistakes without being singled out. It is noteworthy that this interpretation is supported by the increase in standard deviation for the second half lab results over the same period in the previous year of the course. However, while the foregoing changes may seem significant, it should be noted that the course exam, test and, clearly, the students themselves necessarily changed between the two years of the course considered, and further research will be required before any causality can be reliably attributed to the use of Course Hashes.

V. CONCLUSION

This paper introduced the idea of Course Hashes as a way to deliver selective anonymity in student course work rankings. The need for selective anonymity was justified through the consideration of laws and bills currently before parliament concerned with protecting student and personal privacy in the face of new technology [1], [2], [3], [4], [5], [6] and the understanding that the current Student ID ranking system offers little protection for classes where performance prizes are given or where students work in groups over any period.

Course Hashes are a cryptographic solution to privacy that, in effect, provides students with a key they can use to certify their position in a ranking to others. To analyse the security of our Course Hash algorithmic choice it was necessary to look at hybrid cyphers [9], cryptographic hashing algorithms [13], the Birthday problem [17], [18] and formal probabilistic bounds on clashes with repeating and non-repeating key selections over large search spaces. However, from a user's perspective, our implementation consists of a simple web form [24] that maps a student ID to a Course Hash ID with the click of a button, and this “black box” simplicity allowed Course Hashes to be introduced with positive reception on a course of 115 Computer Science students with minimal explanation or support in our case study.

In Section IV we provided evidence of the impact of Course Hashes from our case study and showed that after we introduced Course Hashes our student exam results increased and variance ratios decreased in stark contrast to previous years. However, while our results showed improvement in a statistically significant manner, we recognise that further research will be required to confirm the impact of Course Hashes given the year-to-year variability in exams, tests and other items of courses at universities.

The Course Hash approach we implemented has the advantage of being a publicly disclosable algorithm that does not need a long-lived database of scores, access to a login server or significant development work to implement. However, the student survey feedback discussed in Section IV-B indicates that additional work may be required to improve user experience in future implementations, potentially at the cost of one or more of the current implementation advantages. For example it is easy to see how user experience could be improved by providing a parameterised URL to initialise the

Course Hash Generator Form instead of requiring a Moodle key lookup step, by linking the Course Hash Generator Form with a rank database to remove the PDF lookup step or even by effectively reversing the Course Hash process by providing a trusted website with log-in server integration that generates position certifying URLs on request for third parties. While these user experience improvements may require more resources than the case study implementation, they are no more complex than a modern on-line store or social networking site to implement and it is hoped that this paper will help highlight the issue of privacy in Student ID rankings, demonstrate that students value efforts to improve their ranking privacy and focus efforts to make selective anonymity a standard feature in the student rankings of the future.

ACKNOWLEDGEMENTS

The author would like to thank the students of COMP200 who took part in the Course Hash trial, SaSe Business Solutions for the use of their SSP services and the anonymous reviewers and conference organisers for their feedback and support.

REFERENCES

[1] Harmful Digital Communications Bill, New Zealand Government, 2014.
[2] Harassment Act, New Zealand Government, 1997.
[3] Privacy Act, New Zealand Government, 1993.
[4] Human Rights Act, New Zealand Government, 1993.
[5] Crimes Act, New Zealand Government, 1961.
[6] Education Act, New Zealand Government, 1989.
[7] “Cyberbullying information and advice for teachers and principals,” New Zealand Ministry of Education Sponsored White Paper, NetSafe.
[8] General Questions. SaSe Business Solutions. [Online]. Available: http://www.sase.biz/faq.html
[9] S. Spacey, “The Spacey Cypher,” UK Patent GB2 379 587B, 10, 2001.
[10] The Base16, Base32, and Base64 Data Encodings, Internet Engineering Task Force (IETF) Std. RFC 4648, 2006.
[11] Advanced Encryption Standard (AES), National Institute of Standards and Technology (NIST) Std. FIPS PUB 197, 2001.
[12] Encryption Export Controls, vol. 75, no. 122, Bureau of Industry and Security, 2010.
[13] Secure Hash Standard (SHS), National Institute of Standards and Technology (NIST) Std. FIPS PUB 180-4, 2012.
[14] P. Rogaway and T. Shrimpton, “Cryptographic hash-function basics: Definitions, implications, and separations for preimage resistance, second-preimage resistance and collision resistance,” in Proc. Fast Software Encryption (FSE), vol. 3017, 2004, pp. 371–388.
[15] The Keyed-Hash Message Authentication Code (HMAC), National Institute of Standards and Technology (NIST) Std. FIPS PUB 198-1, 2008.
[16] C. Shannon, “A mathematical theory of communication,” SIGMOBILE Mob. Comput. Commun. Rev., vol. 5, no. 1, pp. 3–55, 2001.
[17] X. Wang, Y. Yin, and H. Yu, “Finding collisions in the full SHA-1,” in Proc. Advances in Cryptology (Crypto). Springer, 2005, pp. 17–36.
[18] B. Schneier, Applied Cryptography: Protocols, Algorithms, and Source Code in C. Wiley, 1996.
[19] S. Spacey, W. Wiesemann, D. Kuhn, and W. Luk, “Robust software partitioning with multiple instantiation,” INFORMS Journal on Computing, vol. 24, no. 3, pp. 500–515, 2012.
[20] S. Spacey, W. Luk, P. Kelly, and D. Kuhn, “Improving communication latency with the Write-Only Architecture,” Journal of Parallel and Distributed Computing, vol. 72, no. 12, pp. 1617–1627, 2012.
[21] S. Spacey, W. Luk, D. Kuhn, and P. Kelly, “Parallel partitioning for distributed systems using sequential assignment,” Journal of Parallel and Distributed Computing, vol. 73, no. 2, pp. 207–219, 2013.
[22] Moodle Home Page. Moodle Pty Ltd. [Online]. Available: http://www.moodle.com
[23] “supplied grade is invalid” error when importing text grades from csv file. Bug Tracker. Moodle Pty Ltd. [Online]. Available: https://tracker.moodle.org/browse/MDL-35574
[24] “Course Hash ID Generation Form for COMP200,” The University of Waikato, 2014. [Online]. Available: http://cs.waikato.ac.nz/sspacey/teaching/COMP200/#course hash
[25] “SaSe Server Pages (SSP),” Product Information Sheet, SaSe Business Solutions, 2012.
[26] R. Fielding, “Architectural styles and the design of network-based software architectures,” Ph.D. dissertation, University of California, Irvine, 2000.
[27] The JavaScript Object Notation (JSON) Data Interchange Format, Internet Engineering Task Force (IETF) Std. RFC 7159, 2014.
[28] S. Spacey, “Site Continuity Management (SCM),” D.CSC. thesis, Cambridge University, Cambridge, UK, 2007.
[29] P. Biemer, “Total Survey Error: Design, implementation, and evaluation,” Public Opinion Quarterly, vol. 74, no. 5, pp. 817–848, 2010.
[30] SurveyMonkey Home Page. SurveyMonkey Inc. [Online]. Available: http://www.surveymonkey.com
[31] M. Kaplowitz, F. Lupi, M. Couper, and L. Thorp, “The effect of invitation design on web survey response rates,” Social Science Computer Review, vol. 30, no. 3, pp. 339–349, 2012.
