AUDIO CHARACTERISTICS
Take note that, unlike other TCS projects, all tasks already have their respective pre-cuts.
We are still free to adjust a cut if necessary. However, we are only allowed to adjust the cut inside
the default pre-cut.
Based on the test that we took, the audios have different background sounds while the speaker is
talking:
a. Silence or pure dead air (you can only hear the speech)
b. Regular environmental sounds (normal, plain and not noisy)
c. Melodies (pure music only, no singing or humming)
These sounds are sometimes heard throughout the audio, or only in some parts of it. Keep in
mind that we simply ignore these sounds and treat them as plain background sound; we don’t
need to consider them when judging how to transcribe the audio.
Some audios have unusual background sounds while the speaker is talking:
a. Another person or persons talking
b. Noise (environmental sound at high volume)
c. Someone singing or humming in the background
These background sounds affect our judgment on how we are going to transcribe the audio.
We will discuss this further in the “Overlapping Rules” part of this sheet.
AUDIO CLASSES
Non-discard - tag a task as “Non-discard” if you can get a valid audio from it
a. The whole audio is considered valid based on the General rules below
b. You can get a part of the audio that is considered valid
Take note*** (you will get Valid-time from these kinds of tasks)
Discard - tag a task as “Discard” if there is no valid audio that you can get from it
a. The entire audio is considered invalid.
Take note*** (you will not get Valid-time from these kinds of tasks)
GENERAL RULES:
1. Unclear parts or words which are not in the English vocabulary have to be cut out.
2. You are free to apply cuts without worrying about the idea of a sentence.
3. If there is overlapping speech (two speakers talking at the same time), apply the
following:
a. Discard – if two speakers are talking at the same time but with different words
all throughout the audio.
b. Cut – if only a part of the audio has two speakers talking at the same time with
different words or ideas, cut this part out. Intercept the clear solo speech only.
OVERLAPPING RULES:
1. If two or more speakers clearly speak the same words at the same time, these may
be transcribed as one.
2. If two or more speakers are not talking at the same time (non-overlapping), you
may regard this as regular speech (as if people are having a conversation with one
another). Transcribe every word normally for both speakers.
3. If two or more speakers are talking at the same time (overlapping), and there is one
clear voice that stands out while the other voices are low or fuzzy, transcribe the
main voice only. You should disregard the low/fuzzy voice as long as it does not
affect the clarity of the main voice.
4. Do not transcribe songs, animal sounds, or any non-human voice. Three examples:
a. If the “ENTIRE” audio is singing or a non-human voice (e.g. singing, humming
melodies, animal sounds), please discard.
(Note: if a speaker says “the cat says meow” – cut out “meow”)
b. If there is a part where the audio is only “Singing”, cut this out and save only
the clear human speech.
Exception: if there’s someone singing in the background but is too low and
does not affect the clarity of the human speech, you can transcribe the
speech and ignore the background sound.
c. If the background sound is just pure music while human speech is present,
you can transcribe the speech. (Ex: Speech on meditation)
5. If the audio has a loud noise which disrupts the clarity of the main voice, remove this
part. Three situations to consider:
a. DISCARD - If the whole duration of the audio is full of noise
b. CUT – if the noise affects only a part of the audio, only cut this part out.
c. NO CUT(S) & TRANSCRIBE – if the noise all throughout the audio does not
affect the clarity of the human speech, you can transcribe the speech and
disregard the noise as a whole.
6. When the audio is a conversation and there are pauses or noises in the middle, there is
no need to cut them out. You can keep them.
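The discard / cut / transcribe decision in rule 5 (and the similar one for different-word overlapping speech) can be pictured as simple interval arithmetic. The sketch below is only an illustration, not project tooling: `classify_task`, the `(start, end)` spans in seconds, and the returned labels are hypothetical names. Deciding which spans actually disrupt the main voice remains a listening judgment; background sound that does not affect clarity is simply not listed.

```python
def classify_task(duration, disrupted):
    """Return (label, clean_segments) for an audio of `duration` seconds.

    `disrupted` is a list of (start, end) spans, in seconds, where loud noise
    or different-word overlapping speech hides the main voice.
    """
    # Clamp each span to the audio, drop empty spans, and sort by start time.
    spans = sorted(
        (max(0.0, s), min(duration, e))
        for s, e in disrupted
        if min(duration, e) > max(0.0, s)
    )

    clean = []
    cursor = 0.0  # end of the disrupted region seen so far
    for start, end in spans:
        if start > cursor:
            clean.append((cursor, start))  # gap before this span is clean
        cursor = max(cursor, end)
    if cursor < duration:
        clean.append((cursor, duration))   # clean tail after the last span

    if not spans:
        return "TRANSCRIBE", clean  # nothing disrupts the speech: no cuts
    if not clean:
        return "DISCARD", clean     # disrupted from start to end
    return "CUT", clean             # keep only the clean segments
```

For example, `classify_task(10, [(2, 4), (3, 5)])` merges the two overlapping spans and reports the clean parts before second 2 and after second 5.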
TEXT RULES:
Example: 1 2 3 11 21 xyln@yahoo.com
Transcribe as: one two three eleven twenty-one X Y L N at Yahoo dot com
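The example above can be reproduced with a small sketch. It rests on assumptions drawn only from that single example: numbers are written out as words, the part of an e-mail address before “@” is spelled letter by letter in capitals, and the domain name is written as a capitalized word followed by “dot”. The function names are hypothetical, and only whole numbers from 0 to 99 are handled.

```python
ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def number_to_words(n):
    """Spell out an integer from 0 to 99 (larger numbers are out of scope)."""
    if not 0 <= n <= 99:
        raise ValueError("this sketch only covers 0-99")
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]}-{ONES[ones]}"


def email_to_words(address):
    """Assumption: spell the local part, read the domain name as a word."""
    local, domain = address.split("@", 1)
    name, *rest = domain.split(".")
    spelled = " ".join(local.upper())          # "xyln" -> "X Y L N"
    parts = [spelled, "at", name.capitalize()]
    for piece in rest:                         # "com" -> "dot com"
        parts += ["dot", piece]
    return " ".join(parts)


def to_spoken_form(text):
    """Convert digit tokens and e-mail tokens; leave other words untouched."""
    out = []
    for token in text.split():
        if token.isascii() and token.isdigit():
            out.append(number_to_words(int(token)))
        elif "@" in token:
            out.append(email_to_words(token))
        else:
            out.append(token)
    return " ".join(out)
```

Calling `to_spoken_form("1 2 3 11 21 xyln@yahoo.com")` yields the transcription shown in the example.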
a. If the half-pronounced word at the start or end is not complete, cut this out.
Example: Actual speech is “I am an American”
Interception says: “I am an Ame” – cut out the incomplete word.
Transcribe as: “I am an”
b. If the half-pronounced word at the start or end is not complete, but forms a
different word, you can consider this.
Example: Actual speech is “I went to the supermarket”
Interception says: “I went to the super” – do not apply any cuts.
Transcribe as: “I went to the super”
c. If the half-pronounced word is in the middle part, whether or not it is a separate
word, do not cut it out; handle it as follows:
i. If the half-pronounced word is not a separate word, you can disregard it.
Example: do you st still love me
Transcribe as: do you still love me
Note: Modals and laughter are different categories and should be treated
differently.
A. Modals-
Located at Start/End = if it CAN BE COUNTED, we can only write “UP TO 2” modals.
Example: um um hello there ah uh
uh huh yes you're right hmm hmm
B. Laughter-
if it CAN BE COUNTED = write "ha" and put spaces in between each repetition
Example: 2x = ha ha
Example: 6x = ha ha ha ha ha ha
Modal particles:
ah, ahah, bah, dah, eww, yeah, oh, hey, hi, ho, oops, shh, wow, uh-huh, wah, yah, yep, nope
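The modal and laughter rules above can be sketched as two small helpers. This is a rough illustration under stated assumptions: the `MODALS` set combines the list above with “um”, “uh”, “huh” and “hmm” from the examples, the function names are hypothetical, and when more than two modals occur in a row at an edge the sketch keeps the first two of that run.

```python
# Assumption: the sheet's modal list plus the words used in its examples.
MODALS = {
    "ah", "ahah", "bah", "dah", "eww", "yeah", "oh", "hey", "hi", "ho",
    "oops", "shh", "wow", "uh-huh", "wah", "yah", "yep", "nope",
    "um", "uh", "huh", "hmm",
}


def laughter(count):
    """Write countable laughter as 'ha' repeated with single spaces."""
    return " ".join(["ha"] * count)


def limit_edge_modals(text, limit=2):
    """Keep at most `limit` modals at the start and at the end of `text`."""
    words = text.split()

    # Length of the run of modals at the start.
    lead = 0
    while lead < len(words) and words[lead].lower() in MODALS:
        lead += 1

    # Length of the run of modals at the end (never overlapping the start run).
    trail = 0
    while trail < len(words) - lead and words[-1 - trail].lower() in MODALS:
        trail += 1

    head = words[:lead][:limit]
    middle = words[lead:len(words) - trail]
    tail = words[len(words) - trail:][:limit]
    return " ".join(head + middle + tail)
```

Both examples from the sheet pass through `limit_edge_modals` unchanged, since each already has two modals at the start and two at the end.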
6. 2-word rule: an audio should have at least 2 words in it. Otherwise, discard!
(Expect this rule to be updated soon. We are waiting for the client’s confirmation on whether
modals or spoken letters will also add to the word count.)
7. Transcribe English words only. If there are non-English words involved, cut them out.
However, there are cases where a word may sound foreign, but you may consider
it a proper noun. You may write it and apply capitalization.
8. When there are homophones (words that sound the same but have different meanings), and you
are unsure which of the words to use, apply the following rules:
i. Listen only to the default pre-cut part. Apply the word with proper
context or idea. No need to use the text outside the pre-cut part as
reference.
ii. When undecided between words such as “hole” and other words that
sound like /həʊl/, and the interception says “the whole town loves him”,
you can use “whole”, which fits the context of the sentence.
iii. If there are multiple homophones that can fit the context, you can use
any of them. Example: “where is my dear/deer” You can transcribe as
below:
Transcription 1: where is my dear
Transcription 2: where is my deer
Both fit the context, so you can choose either of them.
9. You can use common informal terms, such as “wanna”, “tryna”, “gonna”, etc.
Want to = want to
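The 2-word rule (rule 6 above) amounts to a simple word count. The sketch below is hedged accordingly: `has_enough_words` and its `excluded` parameter are hypothetical, and `excluded` exists only because the sheet says it is still unconfirmed whether modals or spoken letters count toward the total.

```python
def has_enough_words(transcription, minimum=2, excluded=()):
    """Return True if the transcription reaches the minimum word count.

    `excluded` is an optional collection of lowercase words that should not
    be counted (for example modals, if the client decides they do not count).
    """
    words = [w for w in transcription.split() if w.lower() not in excluded]
    return len(words) >= minimum
```

With the default settings every word counts, so `has_enough_words("um hello")` is true; passing `excluded={"um"}` shows how the outcome would change if modals were ruled out.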
FAQS:
For words that are spelled out, apply capitalization and put a space between the letters.
Example: A P P L E, T A B L E, P L A N E T
Q: Can we use punctuation?
A: We only apply hyphens and apostrophes as commonly used. No other punctuation
is allowed.
Q: Are we allowed to have spaces at the beginning or end? How about double spaces in
between words?
A: Both can be tolerated.
Q: If there’s a small part in the middle with overlapping speech where both voices have the
same volume, how do we handle this?
A: No need to cut if both speeches say the same thing. Cut if they say different words.
Q: We can only hear modal words in the intercepted audio. Should we transcribe?
A: No, we should discard.
Q: There’s a long silence, plain environmental sound or pure music heard at the
beginning, middle or at the end. Should we cut?
A: No need to cut. There are no restrictions on the duration.
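The punctuation answer above (hyphens and apostrophes only, extra spaces tolerated) can be checked mechanically. The following is a minimal sketch with a hypothetical function name. It assumes plain ASCII letters, so digits are flagged too (the text rules show numbers written out as words), and an accented proper noun would be reported even though the sheet does not rule it out.

```python
import re

# Letters, whitespace, hyphens and apostrophes (straight or curly) are accepted.
_DISALLOWED = re.compile(r"[^A-Za-z\s'’-]")


def disallowed_characters(transcription):
    """Return the sorted, de-duplicated characters the rule would reject."""
    return sorted(set(_DISALLOWED.findall(transcription)))
```

An empty list means the transcription uses only accepted characters; leading, trailing and double spaces are never reported.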