THINK ALOUD
Concurrent think aloud (CTA) is the most valuable usability method according to Nielsen
(1993). In this method, participants are asked to verbalize their thoughts while performing
predefined tasks using the target product (Krahmer and Ummelen, 2004; Peute et al., 2015).
The method can give insights into the participant’s cognitive process and a general
understanding of behavioral patterns, which can be used to discover flaws in a product
(Krahmer and Ummelen, 2004).
Participants often find it unnatural to think aloud; therefore, it is good to have warm-up
tasks to get participants comfortable (Krahmer and Ummelen, 2004). During warm-up, the
moderator can coach participants on thinking aloud without interrupting their flow and
performance on the real tasks. Sometimes think-aloud sessions are not moderated enough,
which can result in participants just pointing out what they are doing. But there is a
difference between just verbalizing what you are doing and actually verbalizing your
thoughts, for example “now I’m pressing here on this button…” versus “since the icon on
this button symbolizes start, I press on it…”.
CTA has some downsides: it can affect efficiency and effectiveness, and participants might
reflect on objects differently and be extra conscious of their behavior while thinking aloud.
Participants can also become aware of their behavior if the moderator asks questions
about their thoughts, making them rethink or reflect on something differently
(Krahmer and Ummelen, 2004).
A variation of CTA is retrospective think aloud (RTA), where the participant performs
the task or tasks in silence. After each task or after all tasks, the participant watches a
recording of themselves performing the tasks and is asked to verbalize their thoughts; this
is done together with the moderator. Using RTA, the participant's behavior during the tasks
is not influenced. A problem with this approach is that participants might not remember what
they were thinking in all situations; another risk is that participants rationalize their
thought process in retrospect. Also, RTA is at least twice as time-consuming as CTA (if
performed on all tasks), since participants first perform the tasks and afterward view their
behavior (Holmqvist et al., 2011; Peute et al., 2015).
When comparing the two methods in usability evaluations of a healthcare application, Peute
et al. (2015) recommend the concurrent think-aloud method over the retrospective one.
The difference in results between the two methods is not overwhelming, but there are some
differences in the types of issues discovered. CTA found more precise usability issues, for
example in navigation and graphics/symbols, while RTA yielded more findings about general
improvements and terminology (Peute et al., 2015).
RUBIN CHISNELL
Still another reason for a less formal approach concerns sample size.
To achieve generalizable results for a given target population, one’s
sample size is dependent on knowledge of certain information about
that population, which is often lacking (and sometimes the precise reason
for the test). Lacking such information, one may need to test 10 to
12 participants per condition to be on the safe side, a factor that might
require one to test 40 or more participants to ensure statistically significant
results.
https://navelmangelep.wordpress.com/2012/02/27/metode-penelitan-eksperimen/
Broadly, the characteristics of experimental research can be summarized as follows:
1. It uses a control group as a baseline for comparison with the group that receives the
experimental treatment.
2. It uses at least two groups.
3. It must take internal validity into account.
4. It must take external validity into account.
CONCURRENT VS RETROSPECTIVE
During concurrent thinking aloud the most frequent
verbalization categories were description of action and
evaluation of the results of action. During retrospective
thinking aloud the most frequent verbalization
categories were indications of problems with the system
and of the user experience caused by the system.
AGE RANGE
Older people have relatively slower perceptual learning than younger ones (Gilbert 1996). This could be
factored into the design of materials for audiences of varying ages. Mungania (2003), in describing
e-learning users, asserted that middle-aged people account for the greater part of the audience for this
educational approach, with 80% of the polled respondents belonging to the under-45 age bracket.
TIME FOR USABILITY TESTING
The typical user test is 60–90 minutes. After that, users get tired, and it's difficult to run
usability sessions that last more than two hours. (It's possible to run multi-day studies, but
doing so requires a different technique and happens rarely in my experience.)
The half-hour rule is just something I learned at school. You need at least half an hour to
conduct a test, but users' concentration decreases rapidly after those 30 minutes.
Sessions: You will want to describe the sessions and their length (typically one
hour to 90 minutes). When scheduling participants, remember to leave time, usually 30
minutes, between sessions to reset the environment, to briefly review the session with
observer(s), and to allow a cushion for sessions that might end a little late or participants
who might arrive a little late.
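The scheduling advice above can be sketched in a few lines of Python. This is only an illustration, assuming 90-minute sessions and a 30-minute buffer; the function name `schedule_sessions`, the date, and the 09:00 start are hypothetical and do not come from the quoted sources.

```python
from datetime import datetime, timedelta

def schedule_sessions(first_start, count, session_min=90, buffer_min=30):
    """Return (start, end) pairs with a buffer between consecutive sessions."""
    slots = []
    start = first_start
    for _ in range(count):
        end = start + timedelta(minutes=session_min)
        slots.append((start, end))
        # Buffer: reset the environment, debrief observers, absorb late finishes.
        start = end + timedelta(minutes=buffer_min)
    return slots

# Hypothetical test day: three participants, first session at 09:00.
for start, end in schedule_sessions(datetime(2024, 1, 8, 9, 0), 3):
    print(start.strftime("%H:%M"), "-", end.strftime("%H:%M"))
```

With these assumed values each participant occupies a two-hour slot, so the number of sessions that fit in a day follows directly from the working hours available.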
According to Prof. Dr. Sugiyono in his 2010 book “Metode Penelitian Pendidikan”, experimental
research designs fall into 3 forms: pre-experimental design, true experimental design,
and quasi experimental design.
1. Pre-experimental design
This design is called pre-experimental because it is not yet a true experiment: extraneous
variables still influence the dependent variable. The design is useful for obtaining
preliminary information about the research questions. Pre-experimental designs take
several forms, including:
a. One-Shot Case Study
In this design a group is given a treatment and the outcome is then observed (the treatment
is the independent variable and the outcome is the dependent variable). Subjects are
presented with some type of treatment and the result is then measured.
b. One-Group Pretest-Posttest Design
Whereas design “a” has no pretest, this design includes a pretest before the treatment is given.
The effect of the treatment can therefore be determined more accurately, because it can be
compared with the state before the treatment.
c. Intact-Group Comparison
In this design one group is used for the study but is split in two: half serves as the
experimental group (which receives the treatment) and half as the control group
(which does not).
2. True Experimental Design
It is called true experimental because in this design the researcher can control all extraneous
variables that affect the course of the experiment. Internal validity (the quality of the
execution of the research design) can therefore be high. The main characteristic of a true
experimental design is that the samples used for both the experimental group and the control
group are drawn at random from a given population. Its hallmarks are thus a control group
and randomly selected samples. True experimental designs are divided into:
a. Posttest-Only Control Design
In this design there are two groups, each selected at random (R). The first group receives
the treatment (X) and the other does not. The group that receives the treatment is called the
experimental group and the group that does not is called the control group.
b. Pretest-Posttest Control Group Design
In this design there are two randomly selected groups, which are then given a pretest to
establish the initial state and whether the experimental and control groups differ.
c. The Solomon Four-Group Design
In this design four groups are selected at random. Two groups are given a pretest and two
are not. One of the pretest groups and one of the non-pretest groups then receive the
experimental treatment, after which all four groups are given a posttest.
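The Solomon four-group layout can be written down as a small table plus a random assignment step. The sketch below is an illustration under stated assumptions: the group labels G1 to G4, the participant IDs, and the function name `assign_solomon` are hypothetical, and participants are dealt round-robin after shuffling so that group sizes stay balanced.

```python
import random

# Which groups get a pretest and a treatment; every group gets a posttest.
SOLOMON_GROUPS = {
    "G1": {"pretest": True, "treatment": True},
    "G2": {"pretest": True, "treatment": False},
    "G3": {"pretest": False, "treatment": True},
    "G4": {"pretest": False, "treatment": False},
}

def assign_solomon(participants, seed=None):
    """Shuffle participants and deal them evenly into the four groups."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    names = list(SOLOMON_GROUPS)
    assignment = {name: [] for name in names}
    for i, person in enumerate(shuffled):
        assignment[names[i % len(names)]].append(person)
    return assignment

# Hypothetical participant IDs P01..P20.
groups = assign_solomon([f"P{i:02d}" for i in range(1, 21)], seed=7)
for name, members in groups.items():
    print(name, SOLOMON_GROUPS[name], members)
```

Dropping G3 and G4 from the table gives the pretest-posttest control group design, and keeping only the treatment flag with no pretest gives the posttest-only control design.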
3. Quasi Experimental Design
This design is a development of the true experimental design, which is difficult to carry
out. It has a control group, but the control group cannot fully control the extraneous
variables that affect the experiment. Even so, this design is better than a pre-experimental
design. Quasi experimental design is used because in practice it is difficult to obtain a
control group for research.
In administrative or management work, for example, it is often impossible to use some
employees for the experiment and not others, with some using a new work procedure and the
rest not. Quasi experimental designs were therefore developed to overcome the difficulty of
determining a control group in research. Designs of this type include the following:
a. Time Series Design
In this design the group used for the study cannot be selected at random. Before the
treatment, the group is given a pretest up to four times in order to establish the stability
and clarity of the group's state before treatment. If the four pretest results differ from
one another, the group's state is unstable, erratic, and inconsistent. Only once the
stability of the group's state is clearly known is the treatment given. This design uses
only one group, so it does not require a control group.
b. Nonequivalent Control Group Design
This design is almost the same as the pretest-posttest control group design, except that
neither the experimental group nor the control group is selected at random. The experimental
and control groups are compared even though they were selected and placed without
randomization. Both groups are given a pretest, then the treatment is given, and finally a
posttest.
c. Counterbalanced Design
In this design all groups receive all treatments, only in different orders, and the orders
are assigned at random.
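One common way to build the treatment orders for a counterbalanced design is a cyclic Latin square, in which every treatment appears once in every position. The sketch below assumes that technique (the source does not name one); the treatment labels and the function name `counterbalanced_orders` are hypothetical.

```python
import random

def counterbalanced_orders(treatments, seed=None):
    """Build a cyclic Latin square of orders, then shuffle which group gets which row."""
    n = len(treatments)
    rows = [[treatments[(start + k) % n] for k in range(n)] for start in range(n)]
    random.Random(seed).shuffle(rows)
    return rows

# Hypothetical study with three treatments and three groups.
orders = counterbalanced_orders(["A", "B", "C"], seed=1)
for group, order in enumerate(orders, start=1):
    print(f"Group {group}: {' -> '.join(order)}")
```

A cyclic square balances position but not which treatment precedes which, so studies where carry-over effects matter may need a different construction.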
4. Factorial Design
A factorial design always involves two or more independent variables (at least one of which
is manipulated). In essence, a factorial design yields the rigor of a true experimental
design and allows two or more variables to be investigated, individually and in interaction
with one another. The purpose of this design is to determine whether the effect of an
experimental variable can be generalized across all levels of a control variable or whether
the effect is specific to particular levels of the control variable. It can also be used to
show relationships that a single-variable experimental design cannot.
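The cells of a factorial design are simply every combination of the factor levels. As a minimal sketch, assuming a hypothetical 2 x 3 design (the factor names and levels below are invented for illustration), they can be enumerated like this:

```python
from itertools import product

def factorial_cells(factors):
    """Return one dict per cell, covering every combination of factor levels."""
    names = list(factors)
    return [dict(zip(names, levels)) for levels in product(*factors.values())]

# Hypothetical factors: think-aloud method (manipulated) and age group (control variable).
factors = {
    "method": ["CTA", "RTA"],
    "age_group": ["younger", "middle", "older"],
}
cells = factorial_cells(factors)
for cell in cells:
    print(cell)
print(len(cells), "cells")
```

Comparing the effect of `method` within each `age_group` level is what shows whether the effect generalizes across levels or is specific to some of them.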
REFERENCES
Danim, S. 2002. Menjadi Peneliti Kualitatif. Bandung: Pustaka Setia.
Faisal, S. 1982. Metodologi Penelitian Pendidikan. Surabaya: Usaha Nasional.
Fuchan, A. 2004. Pengantar Penelitian dalam Pendidikan. Yogyakarta: Pustaka Pelajar.
Solso, R. L., MacLin, M. K., & MacLin, O. H. 2005. Cognitive Psychology. New York: Pearson.
Sugiyono. 2010. Metode Penelitian Kuantitatif Kualitatif dan R&D. Alfabeta.
Sukardi. 2003. Metodologi Penelitian Pendidikan. Jakarta: Bumi Aksara.
SIZE SAMPLE PILOT TESTING
According to Connelly (2008), extant literature suggests that a pilot study sample should be 10% of
the sample projected for the larger parent study. However, Hertzog (2008) cautions that this is not a
simple or straightforward issue to resolve because these types of studies are influenced by many
factors. Nevertheless, Isaac and Michael (1995) suggested 10 to 30 participants; Hill (1998)
suggested 10 to 30 participants for pilots in survey research; Julious (2005), in the medical field, and
van Belle (2002) suggested 12; Treece and Treece (1982) suggested 10% of the project sample
size. I would say that 10 would be a minimum, and 30 might be considered if your project sample
size is expected to be 300.
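The rule of thumb in the last sentence (10% of the projected sample, with 10 as a floor) can be sketched as follows. The function name `pilot_sample_size` and its parameters are hypothetical, and rounding up is an assumption; the cited authors do not prescribe a formula.

```python
import math

def pilot_sample_size(parent_n, percent=10, minimum=10):
    """Take a percentage of the projected sample, never going below the minimum."""
    return max(minimum, math.ceil(parent_n * percent / 100))

for parent_n in (50, 100, 300):
    print(parent_n, "->", pilot_sample_size(parent_n))
```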
Refs.
Connelly, L. M. (2008). Pilot studies. Medsurg Nursing, 17(6), 411-2.
Hertzog, M. A. (2008). Considerations in determining sample size for pilot studies. Research in
Nursing & Health, 31, 180-191.
Hill, R. (1998). What sample size is “enough” in internet survey research? Interpersonal Computing
and Technology: An Electronic Journal for the 21st Century, 6(3-4).
Isaac, S., & Michael, W. B. (1995). Handbook in research and evaluation. San Diego, CA:
Educational and Industrial Testing Services.
Julious, S. A. (2005). Sample size of 12 per group rule of thumb for a pilot study. Pharmaceutical
Statistics, 4, 287-291.
Treece, E. W., & Treece, J. W. (1982). Elements of research in nursing (3rd ed.). St. Louis,
MO: Mosby.
van Belle, G. (2002). Statistical rules of thumb. New York: John Wiley.
ABOUT ELAN
ELAN is free, open source software which allows you to add text notes (annotations) to
video or audio recordings. We use ELAN as a tool which helps us to describe what is
happening in a recording.
When we create quality audio and video recordings, we get a great record of a
language being used - but we don’t have a way to easily access the language for
people who don’t speak it. ELAN gives us a chance to write down what is happening in
the recording, through the annotations, and can make these recordings more
accessible.
The annotations can be in different forms, depending on your goals. You can write down
sentences, individual words, or glosses (breaking the words up into smaller parts). You
can translate what is being said into a different language, write comments on what is
happening in the video, or describe other non-verbal things like gestures or body
language. What you choose to write is up to you: in the end, it will add more information
to your recording, which makes it a richer and more useful resource, for either yourself,
or for people who work on the language in the future.
You can then use these annotations for different purposes: you can search through
them for specific words; export the text to be used in different programs or as subtitles;
print the annotations; or simply review them in ELAN. The annotations are also time-
aligned, which means that each annotation will have a record of where it occurs in the
recording, so you can then listen to the relevant part of the recording.
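To make the idea of time-aligned annotations concrete, here is a minimal sketch of searching annotations and exporting them as SubRip (.srt) subtitles. It does not read ELAN's own file format or use any ELAN API: the `Annotation` class, the sample data, and the function names are hypothetical stand-ins.

```python
from dataclasses import dataclass

@dataclass
class Annotation:
    start_ms: int  # where the annotation begins in the recording
    end_ms: int    # where it ends
    text: str

def search(annotations, word):
    """Return annotations whose text contains the word (case-insensitive)."""
    return [a for a in annotations if word.lower() in a.text.lower()]

def to_srt_time(ms):
    """Format milliseconds as an SRT timestamp, HH:MM:SS,mmm."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"

def to_srt(annotations):
    """Render annotations as numbered subtitle blocks in time order."""
    ordered = sorted(annotations, key=lambda a: a.start_ms)
    blocks = [
        f"{i}\n{to_srt_time(a.start_ms)} --> {to_srt_time(a.end_ms)}\n{a.text}\n"
        for i, a in enumerate(ordered, start=1)
    ]
    return "\n".join(blocks)

# Hypothetical annotations for a short recording.
notes = [
    Annotation(1000, 4000, "Hello, how are you?"),
    Annotation(4500, 7000, "[points at the river]"),
]
print(search(notes, "river"))
print(to_srt(notes))
```

Because every annotation carries its start and end time, a search hit can be played back directly and the same data can drive subtitles.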
This video series leads the viewer through ELAN - from downloading and installing the
software (onto a PC), putting your recordings into the right file formats (.wav files allow
you to visualise the sound recording), to creating your own transcription and then
exporting it for different uses. If you have a request for a video, please email
suggestions to RUIL-contact@unimelb.edu.au.
ELAN is produced by Max Planck Institute for Psycholinguistics, The Language Archive,
Nijmegen, The Netherlands. For more information on ELAN and for further help on how
to use it, as well as the links for downloading, please visit The Language
Archive ELAN web page.
SAMPLE SIZE FOR PILOT TESTING
Isaac and Michael (1995) suggested that “samples with N’s between 10 and 30 have many
practical advantages” (p. 101), including simplicity, easy calculation, and the ability to test
hypotheses, yet “overlook weak treatment effects.”
Treece and Treece (1982), referring to piloting an instrument, noted that for a project with
“100 people as the sample, a pilot study participation of 10 subjects should be a reasonable
number” (p. 176) but were not clear whether this meant 10 cases or 10% of the project sample
size.
In their two seminal papers (Shapiro & Wilk, 1965; Shapiro, Wilk, & Chen, 1968), the authors only
simulated data with a maximum N of 50, so they seem to have concentrated on improving test power
for small sample sizes. In turn, this means that the K-S test seemed to work quite well for large
sample sizes. And this is what is recommended in textbooks: for small sample sizes use S-W, for
large sample sizes use K-S (with the Lilliefors adjustment).
But your question could also be extended: why use K-S or S-W at all? In their 1968 paper, they
tested 9 different approaches to testing normality. I think it is cumbersome to argue for or against
each of them. K-S and S-W are somewhat the standard, but as Juan Carlos and I explained, with large N
they will probably be significant almost every time, even though the distribution is quite normal. Do
not rely only on these test results; have a look at the data itself and its distribution.
As Howell, for example, argued, the K-S test is of no use at all!
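The point about large N can be illustrated with the K-S statistic itself. The sketch below is a simplified illustration, not the Lilliefors-adjusted test mentioned above: it assumes the normal distribution's mean and standard deviation are specified in advance and uses the usual large-sample approximation for the critical value. The function names are hypothetical.

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def ks_statistic(sample, mu=0.0, sigma=1.0):
    """Largest gap between the empirical CDF and the normal CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = normal_cdf(x, mu, sigma)
        # Check the gap just after and just before each step of the empirical CDF.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

def ks_critical(n, alpha=0.05):
    """Large-sample approximation of the critical value (no Lilliefors adjustment)."""
    return math.sqrt(-0.5 * math.log(alpha / 2.0)) / math.sqrt(n)

for n in (50, 500, 5000):
    print(n, round(ks_critical(n), 4))
```

The critical value shrinks in proportion to 1/sqrt(N), so with thousands of observations even a gap of a couple of percentage points between the empirical and the normal curve is declared significant, which is why looking at the data itself remains necessary.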
Howell (2013). Statistical Methods for Psychology. Wadsworth.
Shapiro, S. S., & Wilk, M. B. (1965). An analysis of variance test for normality
(complete samples). Biometrika, 52(3/4), 591–611.
Shapiro, S. S., Wilk, M. B., & Chen, H. J. (1968). A comparative study of various tests for normality.
Journal of the American Statistical Association, 63(324), 1343–1372.
Internet use is still dominated by residents of Java, at about 57.70%, followed by
Sumatra with internet adoption of 47.29%. The region with the lowest internet
penetration is Maluku-Papua, at 41.98%. Meanwhile, internet penetration in
urban areas has reached 72.41%, while in urban-rural areas (second-tier
areas) it is close to half the population, at 49.49%. In rural areas it is
still lower, at 48.25%.
The stages of the interactive learning model according to Faire and Cosgrove in Harlen
(1996: 28) consist of preparation, prior knowledge, exploration activities, student
questions, investigation, final knowledge, and reflection.
Respondent disabilities
Some of the respondents who took part in this study had a limitation, namely that they wore
glasses. This was an obstacle to collecting data with eye tracking. Reflections
on the glasses made the results inaccurate.
TA KEKE
QUIS items are analyzed using the mean because they are on an interval scale (Chin et al., 1988).
Therefore, the number of errors that occurred is divided by the total number of error
opportunities to obtain the error percentage for each error type (Tullis & Albert,
2013).
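The error percentage described above is a single division per error type. A minimal sketch, with invented counts and hypothetical names (`error_rate`, the error types), might look like this:

```python
def error_rate(errors, opportunities):
    """Errors made as a percentage of the total error opportunities."""
    if opportunities <= 0:
        raise ValueError("opportunities must be positive")
    return 100.0 * errors / opportunities

# Hypothetical counts per error type: (errors observed, total error opportunities).
counts = {"navigation": (6, 40), "terminology": (3, 40)}
for kind, (errors, opportunities) in counts.items():
    print(f"{kind}: {error_rate(errors, opportunities):.1f}%")
```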
According to Barnum (2011): I also like this definition because it focuses on the critical measures of
usability:
● effectiveness
● efficiency
● satisfaction
What these researchers gave us is evidence that small studies can be highly effective.
Tullis&Albert
For example, the center of a skewed distribution, like income, can be better measured by the median
where 50% are above the median and 50% are below. If you add a few billionaires to a sample, the
mathematical mean increases greatly even though the income for the typical person doesn’t change.
When your distribution is skewed enough, the mean is strongly affected by changes far out in the
distribution’s tail whereas the median continues to more closely reflect the center of the distribution.
For these two distributions, a random sample of 100 from each distribution produces means that are
significantly different, but medians that are not significantly different.
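The billionaire example can be reproduced with Python's `statistics` module. The income figures below are invented purely for illustration.

```python
from statistics import mean, median

# Hypothetical annual incomes for a small sample.
incomes = [28_000, 35_000, 41_000, 52_000, 60_000, 75_000, 90_000]
# The same sample after adding two billionaires.
with_billionaires = incomes + [1_000_000_000, 2_000_000_000]

print("mean before:  ", round(mean(incomes)))
print("mean after:   ", round(mean(with_billionaires)))
print("median before:", median(incomes))
print("median after: ", median(with_billionaires))
```

The mean jumps by several orders of magnitude, while the median only moves to the next value in the sorted list, which is the sense in which it keeps reflecting the typical person.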
If you don’t meet the sample size guidelines for the parametric tests and you are not confident that
you have normally distributed data, you should use a nonparametric test. When you have a really
small sample, you might not even be able to ascertain the distribution of your data because the
distribution tests will lack sufficient power to provide meaningful results.
In this scenario, you’re in a tough spot with no valid alternative. Nonparametric tests have less power
to begin with and it’s a double whammy when you add a small sample size on top of that!
Reason 3: You have ordinal data, ranked data, or outliers that you can’t remove
Typical parametric tests can only assess continuous data and the results can be significantly
affected by outliers. Conversely, some nonparametric tests can handle ordinal data, ranked data,
and not be seriously affected by outliers. Be sure to check the assumptions for the nonparametric
test because each one has its own data requirements.
If you have Likert data and want to compare two groups, read my post Best Way to Analyze Likert
Item Data: Two Sample T-Test versus Mann-Whitney.
MENTAL EFFORT
The intensity of mental effort can be considered as an index of mental workload (Paas, 1992a, 199b). Mental effort may be
defined as the total amount of controlled cognitive processing in which a subject is engaged.
NO LOGOUT
https://www.maketecheasier.com/logout-from-website-with-no-logout-button/
Most sites put their login/logout buttons/links in the top right corner, either
standalone or in a menu, such as in account, settings, profile, or something
similar. If you don’t see it directly, try browsing the menus or hover.
http://webdesign-review.blogspot.com/2014/04/were-has-to-be-log-out-button.html
Then I looked for “log out” at the end of the submenu list, because unconsciously I realized that
logging out is the end, so it has to be at the end of the list.
If we don’t want to show “log out” directly, we have to put it at the end of the list, as on the
couchsurfing.org website.
Examples: Gmail, Facebook, Youtube, Yahoo, Ruangguru (similar applications)
4. When the tasks were first drafted, the plan was to include logout, but logout already
worked fine and was not much of a problem. If there was any suggestion at all, it was to
move logout out of the menu. But when asked, their side said they would not move logout out;
they want to keep it inside the profile menu so that it stays organized and the menu bar
does not get crowded.