Authors: Yanjun Pu (Beihang University, BUAA) and 2 others.
Content uploaded by Yong Han on 10 July 2022.
1 Introduction
2 Related Work
In this paper, before allocating tasks, we analyze students' online learning performance in detail in order to accurately estimate each student's knowledge level. We draw on the idea of task assignment in crowdsourcing and propose an effective method by comparing it with previous models. We first review some existing research on task allocation. Task allocation is a long-standing issue that appears in many fields, such as distributing conference papers to reviewers or scheduling jobs in parallel systems. It is also one of the core problems of crowdsourcing.
In the crowdsourcing model, a task is decomposed into multiple small tasks [8] which are assigned to a large number of online workers. Because of human error and fraudulent workers, the confidence in the collected results, which typically contain a lot of noise, is usually very low [2, 3]. To ensure the reliability of the results, redundant allocation is typically used, namely, assigning the same task to multiple workers and aggregating the multiple results to generate a final one. The goal of task allocation here is to find a balance between the reliability of the results and the redundancy of the allocation.
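As a concrete illustration (ours, not from the cited work), redundant allocation paired with a simple majority-vote aggregation can be sketched as follows; the task labels are hypothetical:

```python
from collections import Counter

def aggregate_majority(answers):
    """Aggregate redundant answers for one task by majority vote.

    answers: labels submitted by different workers for the same task.
    Returns the most common label; ties break by first occurrence.
    """
    return Counter(answers).most_common(1)[0][0]

# Three workers answer the same (hypothetical) labeling task.
votes = ["cat", "dog", "cat"]
final = aggregate_majority(votes)  # → "cat"
```

More workers per task (higher redundancy) make the aggregated answer more reliable at a higher cost, which is exactly the trade-off described above.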
Umair and Edward [15] proposed a task allocation solution based on human ability. This method first models a worker's abilities according to historical data, and similarly models the types of abilities that a task requires. Through matching, the algorithm then allocates the task to the most suitable workers, obtaining optimal results with minimal redundancy. Finally, they adjust the worker model by comparing each worker's results with the aggregated result.
Building on the work of Umair and Edward, Hyun and Matthew [16] solved the problem of data sparsity by matrix decomposition, increasing the generality of the method. In addition to modeling workers' abilities from historical data, other researchers have also considered demographic factors, educational background, and especially social information to determine the level of workers' abilities, so that tasks can be allocated to workers accurately.
Furthermore, because most crowdsourcing tasks are paid and the budget is constrained [4], David, Sewoong, and Devavrat [17], inspired by belief propagation and low-rank matrix approximation algorithms, designed a minimum-cost allocation strategy. This strategy minimizes the total cost while meeting certain reliability prerequisites.
Yan Yan et al. [18] proposed an online adaptive task assignment algorithm that ensures the confidence of the results. By continuously analyzing the completion status of tasks and the confidence of the results, the algorithm allocates the subsequent redundancy and workers according to the requirements of each task, making the final results highly reliable.
Piech et al. [6] exploited confidence estimates to determine at what point in the grading process the model is confident about each submission's score. They simulated grading taking place in rounds. For each round, they ran the model using the corresponding subset of grades and counted the number of submissions for which they were over 90% confident that the predicted grades were within 10pp of the students' true grades.
In our work, there is no need to classify students by type. The primary information about students is their historical learning data, so we use several models to compute each student's knowledge level from this data. After extracting student features, we use logistic regression to fit them; this part accounts for a substantial proportion of our work. We also define a hyper-parameter as the redundancy value, which requires repeated experiments to tune.
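As a rough sketch of this fitting step (the actual features and training details are not specified here; the feature names below are our assumptions), a logistic regression over per-student learning features could look like:

```python
import numpy as np

# Hypothetical per-student features: [video completion rate, quiz accuracy].
X = np.array([[0.9, 0.8], [0.2, 0.3], [0.7, 0.6], [0.1, 0.2]])
# Hypothetical labels: 1 = solved a follow-up exercise correctly.
y = np.array([1.0, 0.0, 1.0, 0.0])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Plain gradient descent on the logistic loss (no regularization).
w = np.zeros(X.shape[1])
b = 0.0
lr = 0.5
for _ in range(2000):
    p = sigmoid(X @ w + b)
    w -= lr * (X.T @ (p - y)) / len(y)
    b -= lr * np.mean(p - y)

# The fitted probability can serve as a knowledge-level estimate.
knowledge_level = sigmoid(X @ w + b)
```

The resulting probability in [0, 1] is then usable as the prior knowledge level that the allocator consumes.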
We have run experiments over different redundancy values and different numbers of students; the results can be seen in Section 5. Another attempt of ours was to use examination scores directly as the knowledge level, but this did not work well; the reason may be that a single score is too one-dimensional a factor.
Fig. 1. Peer Grading Framework.
3 Methods
In this section, we present an overview of the entire peer grading framework and introduce the design of grading task assignment in detail. Figure 1 illustrates the basic framework of our peer grading process. It consists of three major components: the student performance evaluation model, the peer grading task allocator, and the score aggregation model. The student performance evaluation model is responsible for computing each student's knowledge level by analyzing the student's behavior when watching course videos and performing the exercises after the videos. The outcome of this model provides the prior knowledge level of students, which the peer grading process uses to determine the arrangement of peer grading groups. The peer grading process includes two steps: grading task allocation and score aggregation. In this paper, we introduce the MLPT algorithm to perform task allocation so as to minimize the difference among the students in each grading group. Based on the task assignment plan, students evaluate other peers' submissions and give scores using their own judgement. As each submission in general receives multiple scores from its graders, the score aggregation model needs to infer an accurate and unbiased score for the submission from the scores recommended by the peer graders.
At the beginning of this section, we describe the problem of assignment allocation in peer grading. We then present the LPT algorithm used in the field of task scheduling, show how we extend the idea of LPT for the purpose of peer grading, and introduce the new MLPT algorithm. At the end of the section, we briefly discuss our research work on the student performance evaluation model.
In the process of peer grading, each student plays two roles: grader and submitter. Suppose that there are N students participating in peer grading, and there are also N submissions from these students. For convenience, we suppose that each student needs to grade M submissions and each submission is graded by M graders; typically the value of M is between 3 and 6. The description of assignment allocation in peer grading is shown in Figure 2. Specifically, the students are arranged into N groups: each group contains M students, each student appears in M groups, and each group corresponds to one submission [1, 8].
The problem of allocating submissions can be expressed as follows: define the student collection V = {v1, v2, ..., vn} and the submission collection U = {u1, u2, ..., un}.
Fig. 2. Assignment allocation in peer grading: the N students v1, v2, ..., vn are arranged into N grading groups of M graders each (e.g. {v1, v8, v6, v4}), and each group corresponds to one of the N submissions u1, u2, ..., un.
The LPT (Longest Processing Time) algorithm is a basic method for solving the task scheduling problem in parallel computing systems. The task scheduling problem in a parallel system can be described as follows: given a set of tasks with known processing times and a set of identical machines, assign the tasks to machines so that the maximum machine load (the makespan) is minimized. LPT sorts the tasks by processing time in descending order and assigns each task in turn to the machine with the smallest current load.
Table 1. The process of the LPT algorithm.
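A minimal sketch of the standard LPT heuristic (independent of the paper's own implementation) follows; a min-heap tracks the least-loaded machine:

```python
import heapq

def lpt_schedule(times, p):
    """Greedy LPT: assign tasks (given by processing time) to p machines.

    Returns (loads, assignment): per-machine total load and task lists.
    """
    heap = [(0.0, i) for i in range(p)]      # (current load, machine index)
    heapq.heapify(heap)
    assignment = [[] for _ in range(p)]
    loads = [0.0] * p
    for t in sorted(times, reverse=True):    # longest task first
        load, i = heapq.heappop(heap)        # least-loaded machine
        assignment[i].append(t)
        loads[i] = load + t
        heapq.heappush(heap, (loads[i], i))
    return loads, assignment

loads, assignment = lpt_schedule([7, 5, 4, 3, 2], 2)
# makespan = max(loads) → 11 (e.g. machine loads 10 and 11)
```

This greedy rule is a classic approximation for makespan minimization and is the scheduling idea that MLPT adapts below.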
There are many similarities between assigning submissions and task scheduling in parallel systems. We introduce the MLPT algorithm by redefining the process of LPT to meet the needs of grading task assignment. For the problem of assigning submissions, assigning submissions to graders can equivalently be viewed as assigning graders to submissions. Thus the student collection V = {v1, v2, ..., vn} is treated as the task collection J = {j1, j2, ..., jn}, the knowledge level w(vi) of each student plays the role of the processing time t(ji), and the submission collection U = {u1, u2, ..., un} is treated as the machine collection M = {m1, m2, ..., mp}. Under these assumptions, assigning graders to submissions in peer grading is equivalent to allocating tasks to machines in the scheduling process, and the MLPT algorithm can ensure that the sum of the knowledge levels of the graders in each group is close to that of every other group.
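A minimal sketch of this balanced grouping step is shown below. It is our illustrative simplification, not the paper's exact procedure: each grader is placed in a single group of fixed size (the real setting, where each grader appears in M groups and cannot grade their own submission, adds constraints omitted here):

```python
import heapq

def mlpt_groups(levels, m):
    """Split graders into len(levels) // m groups of size m with
    near-equal total knowledge level (LPT-style greedy).

    levels: knowledge level per student (len divisible by m).
    Returns a list of groups, each a list of student indices.
    """
    g = len(levels) // m
    heap = [(0.0, i) for i in range(g)]          # (group sum, group id)
    heapq.heapify(heap)
    groups = [[] for _ in range(g)]
    # Highest knowledge level first, into the open group with the
    # smallest current sum; full groups are dropped from the heap.
    for s in sorted(range(len(levels)), key=lambda i: -levels[i]):
        total, gi = heapq.heappop(heap)
        while len(groups[gi]) >= m:
            total, gi = heapq.heappop(heap)
        groups[gi].append(s)
        heapq.heappush(heap, (total + levels[s], gi))
    return groups

groups = mlpt_groups([0.9, 0.8, 0.7, 0.6, 0.5, 0.4], m=3)
```

With these six hypothetical levels, the two resulting group sums differ by about 0.1, illustrating the near-equal totals that MLPT aims for.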
4 Datasets
The dataset used in our paper is collected from the course Computer Network Experiment (abbreviated as CNE) on our MOOC platform, Beihang Xuetang (http://mooc.buaa.edu.cn). The MOOC course CNE includes three assignments for sophomore undergraduates who had not studied this course before, and we only use Assignment 1 and Assignment 2. The course required students to upload their submissions before the end of each week.
Specifically, this session of the course was open in 2015 and ran for an 11-week semester. The course contains 10 chapters, each with several video units, and a final examination. After each video unit, there are multiple-choice problems as a quiz for students; the number of problems per unit ranges from 5 to 13. The course also contains 4 open-ended design problems that require students to design and implement experimental programs independently. The scores of these open-ended problems are generated through the peer grading mechanism in our MOOC platform. There have been approximately 600 active students each semester since the course CNE went online, with roughly 500 students taking part in the final exams. The system has collected about 2 million records, logging every user interaction with the courseware, including scanning and clicking, video interaction, and posting questions and comments in the course forum.
After removing incomplete items in the original dataset, we chose 210 students with 210 items; in our setting, each student grades 4 submissions and each submission must be graded by 4 graders. To demonstrate the advantage of the MLPT algorithm, we also produced two simulation datasets imitating the real data. In the simulation datasets, the number of items is 200, and we set the redundancy (the peer grading number) to 3, 4, and 5 in our experiments. The major difference between the real data and the simulation data is that the prior is generated from a normal distribution and a uniform distribution, respectively.
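A minimal sketch of how such simulated priors could be drawn (the mean, spread, and clipping below are our assumptions, not values from the paper):

```python
import random

random.seed(0)
N = 200  # number of simulated students/items

# Normally distributed prior (imitating the real data), clipped to [0, 1].
normal_prior = [min(1.0, max(0.0, random.gauss(0.6, 0.15))) for _ in range(N)]

# Uniformly distributed prior on [0, 1] (the "evenly distributed" setting).
uniform_prior = [random.random() for _ in range(N)]
```

Each list then serves as the per-student knowledge-level prior fed to the allocation algorithms under test.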
5 Experimental Results
Fig. 3. The degree of decline of group variance from the MLPT algorithm relative to the random algorithm (y-axis: decline of variance, 88%-100%; x-axis: number of students, 100-600; series: m = 3, 4, 5).
This section verifies the MLPT algorithm on both the real dataset and the simulation datasets. For each dataset, we compute the variance across all groups, treating the sum of each group as one item, and we run each experiment 10 times to compute an average value. In both Table 3 and Table 4, simulation dataset 1 is generated from a uniform distribution and simulation dataset 2 from a normal distribution.
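The evaluation metric, as described above, is the variance of the per-group sums of knowledge levels; a short sketch with hypothetical levels:

```python
def group_sum_variance(groups, levels):
    """Population variance of the total knowledge level across groups.

    groups: list of lists of student indices.
    levels: knowledge level per student.
    """
    sums = [sum(levels[i] for i in g) for g in groups]
    mean = sum(sums) / len(sums)
    return sum((s - mean) ** 2 for s in sums) / len(sums)

levels = [0.9, 0.1, 0.8, 0.2]
balanced = [[0, 1], [2, 3]]   # group sums 1.0 and 1.0 → variance ≈ 0
skewed = [[0, 2], [1, 3]]     # group sums 1.7 and 0.3 → variance ≈ 0.49
```

A lower variance means the grading groups are closer in total grader capability, which is what MLPT optimizes.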
Table 3. The comparison between random assignments and MLPT based on the simulation
datasets.
Table 4. The comparison between random assignments and MLPT based on the real datasets.
Table 3 shows the results on the simulation datasets, dataset 1 and dataset 2, which are generated from the uniform distribution and the normal distribution, respectively. The MLPT algorithm reduces the variance of the knowledge mastery level among the groups from 0.381 to 0.009 on simulation dataset 1 and from 0.266 to 0.017 on simulation dataset 2.
Similarly, Table 4 shows the results on the real datasets. The variance is 0.053 after using MLPT on Assignment 1, an optimization of 85.0%. On Assignment 2 the optimization percentage is 87.7%, with the variance decreased from 0.376 to 0.046. Note that the MLPT optimization results on the simulation datasets seem better than those on the real dataset. The reason may be that the prior inferred from quiz performance is not completely associated with students' actual knowledge level on the open-ended problems.
Through simulation, we can explore different settings, such as various numbers of students and different values of the parameter m, to compare the performance of the MLPT algorithm against the random algorithm. In the simulations, we verified a series of datasets with the number of students varying from 100 to 600 and the value of the parameter m ranging from 3 to 5. In each simulation, we compute the decline in group variance achieved by the MLPT algorithm over the Random algorithm. Figure 3 plots the results generated from the Random and MLPT algorithms. It demonstrates that the MLPT algorithm can reduce the variance across all groups significantly in comparison with the Random algorithm, especially when there are not many students and the parameter m is relatively small. Figure 4 shows the comparison results generated from the MLPT and Random algorithms on the real datasets in the process of peer grading. The vertical axis indicates the Root Mean Square Error (RMSE), which is used to measure the accuracy of the grading scores. In the research of Fei Mi et al. [11], the authors proposed new score aggregation methods based on probabilistic graphical models to infer accurate scores for student submissions; these methods outperform the mean aggregation strategy. It is therefore worth investigating whether MLPT has an impact under different aggregation methods. Figure 4(a) shows the difference in RMSE between the MLPT algorithm and the Random algorithm based on the mean aggregation strategy, and Figure 4(b) shows the results based on the probabilistic graphical model. No matter which aggregation strategy is adopted, the MLPT algorithm can effectively improve the overall accuracy of peer grading scores.
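The RMSE on this axis is, as we read it, computed between the aggregated peer scores and reference scores; a generic sketch with hypothetical values on a 0-100 scale:

```python
import math

def rmse(predicted, truth):
    """Root Mean Square Error between aggregated peer scores and references."""
    return math.sqrt(
        sum((p - t) ** 2 for p, t in zip(predicted, truth)) / len(truth)
    )

# Hypothetical aggregated peer scores vs. reference scores.
peer = [88.0, 72.0, 95.0]
ref = [90.0, 70.0, 95.0]
error = rmse(peer, ref)  # sqrt((4 + 4 + 0) / 3)
```

A lower RMSE means the aggregated peer scores track the reference scores more closely, regardless of which aggregation strategy produced them.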
6 Conclusion
Peer grading in MOOC platforms needs an effective framework for grading task allocation and score aggregation in order to deliver accurate homework feedback to online students. Our paper presents a complete peer grading framework and introduces a new task scheduling algorithm, MLPT, for the task assignment problem of peer grading. This framework can infer the knowledge mastery level of every student using a two-stage data analysis pipeline with a behavior-based engagement model and a knowledge tracing-based student performance evaluation model. Our task allocation algorithm regards the assessment outcomes of each student on the quizzes of the MOOC courses as priors and uses them to build peer grading groups. It aims at minimizing the capability difference among grading groups and generating a fair and accurate evaluation for each student submission. The experimental results on both simulation datasets and real course datasets demonstrate that the MLPT algorithm outperforms the conventional random strategy.
There are a number of issues to be addressed in future work. In this paper, we assume that the redundancy number of peer grading tasks for every submission remains constant. However, in practice, students may fail to complete their grading tasks in time, resulting in fewer redundant grading results for some submissions. It would be interesting to investigate whether this problem affects the performance of the scheduling algorithm. Another issue is whether errors in the student evaluation model have any impact on the MLPT algorithm. Given the probabilistic nature of the knowledge tracing model, it may give biased estimations of the knowledge levels of some students when there is noise in the user interaction and question answering data. One feasible solution is to have staff cross-examine some peer grading results and correct the potential errors.
7 Acknowledgments
This work was supported in part by a grant from the State Key Laboratory of Software Development Environment (Funding No. SKLSDE-2015ZX-03) and by NSFC (Grant No. 61532004).
References