
TikTok Project Rules

AUDIO CHARACTERISTICS

Note that, unlike other TCS projects, every task already comes with its own pre-cut.
We are still free to adjust it if necessary. However, we are only allowed to adjust the cut inside
the default pre-cut.

Based on the test that we took, audios have different background sounds while the speaker is
talking:
a. Silence or pure dead air (you can only hear the speech)
b. Regular environmental sounds (normal, plain and not noisy)
c. Melodies (pure music only, no singing or humming)

These sounds are sometimes heard throughout the audio, or only in some parts of it. We
ignore them and treat them as plain background sound; they do not affect our judgment on
how to transcribe the audio.

Some audios have unusual background sounds while the speaker is talking:
a. Another person or other people talking
b. Noise (environmental sound at high volume)
c. Someone singing or humming in the background
These background sounds affect our judgment on how to transcribe the audio.
We will discuss this further in the “Overlapping Rules” part of this sheet.

AUDIO CLASSES

Non-discard - tag a task as “Non-discard” if you can get a valid audio from it
a. The whole audio is considered valid based on the General rules below
b. You can get a part of the audio that is considered valid

Take note*** (you will get Valid-time from these kinds of tasks)

Discard - tag a task as “Discard” if there is no valid audio that you can get from it
a. The entire audio is considered invalid.

Take note*** (you will not get Valid-time from these kinds of tasks)

GENERAL RULES:

Apply cuts according to the rules below:

1. Unclear parts, or words which are not in the English vocabulary, have to be cut out.

2. You are free to apply cuts without worrying about the idea of a sentence.

3. If there is overlapping speech (two speakers talking at the same time), apply the
following:
a. Discard – if two speakers are talking at the same time but with different words
all throughout the audio.
b. Cut – if two speakers are talking at the same time with different words or ideas
in only a part of the audio, cut this part out. Intercept the clear solo speech only.

OVERLAPPING RULES:

1. If two or more speakers clearly speak the same words at the same time, these may
be transcribed as one.

2. If two or more speakers are not talking at the same time (non-overlapping), you
may regard this as regular speech (as if people are having a conversation with one
another). Transcribe every word normally for all speakers.

3. If two or more speakers are talking at the same time (overlapping), and there is one
clear voice that stands out while the other voices are low or fuzzy, transcribe the
main voice only. You should disregard the low/fuzzy voice as long as it does not
affect the clarity of the main voice.

4. Do not transcribe songs, animal sounds, or any non-human voice. Three examples:
a. If the “ENTIRE” audio is singing or a non-human voice (e.g. singing, humming
melodies, sounds of animals), please discard.
(Note: if a speaker says “the cat says meow” – cut out “meow”)
b. If there is a part where the audio is only “Singing”, cut this out and save only
the clear human speech.
Exception: if there’s someone singing in the background but is too low and
does not affect the clarity of the human speech, you can transcribe the
speech and ignore the background sound.
c. If the background sound is just pure music while human speech is present,
you can transcribe the speech. (Ex: Speech on meditation)

5. If the audio has a loud noise which disrupts the clarity of the main voice, remove this
part. Three situations to consider:
a. DISCARD - If the whole duration of the audio is full of noise
b. CUT – if the noise affects only a part of the audio, only cut this part out.
c. NO CUT(S) & TRANSCRIBE – if there is noise throughout the audio but it does not
affect the clarity of the human speech, you can transcribe the speech and
disregard the noise as a whole.
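
The three noise situations above can be sketched as a small decision helper. This is only an illustration: the function name `classify_noise` and its input (one True/False flag per segment of the audio, True when loud noise disrupts the main voice) are assumptions made for this example. The judgment itself is still made by ear.

```python
def classify_noise(segments_disrupted):
    """Return 'DISCARD', 'CUT' or 'TRANSCRIBE' for one audio.

    segments_disrupted is a hypothetical input: a list of booleans, one per
    segment of the audio, True when loud noise disrupts the clarity of the
    main voice in that segment.
    """
    if not segments_disrupted:
        raise ValueError("expected at least one segment")
    if all(segments_disrupted):
        # Noise disrupts the speech for the whole duration.
        return "DISCARD"
    if any(segments_disrupted):
        # Only some parts are disrupted: cut those parts out.
        return "CUT"
    # Noise, if any, never affects the clarity of the speech.
    return "TRANSCRIBE"


print(classify_noise([False, True, False]))
```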

6. When the audio is a conversation and there are pauses or noises in the middle, there is
no need to cut them out. You can keep them.

Example: Speech 1 + pause/noise + Speech 2

Transcribe as: Speech 1 and Speech 2

TEXT RULES:

1. Put a space between words.

2. No punctuation except apostrophes and hyphens.

3. Numbers and symbols should be written in words.

Example: 1 2 3 11 21 xyln@yahoo.com
Transcribe as: one two three eleven twenty-one X Y L N at Yahoo dot com
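
As a rough illustration of rule 3, the sketch below spells out whole numbers from 0 to 99 in a space-separated text. The names `number_to_words` and `spell_numbers` are made up for this example, and symbols such as “@” are not handled. It is not part of any project tool.

```python
ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_to_words(n):
    """Spell a whole number from 0 to 99, using a hyphen for 21, 22, ..."""
    if not 0 <= n <= 99:
        raise ValueError("this sketch only covers 0 to 99")
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]}-{ONES[ones]}"


def spell_numbers(text):
    """Replace every all-digit token in a space-separated text with words."""
    words = []
    for token in text.split():
        if token.isascii() and token.isdigit():
            words.append(number_to_words(int(token)))
        else:
            words.append(token)
    return " ".join(words)


print(spell_numbers("1 2 3 11 21"))  # one two three eleven twenty-one
```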

4. If there are half-pronounced words, apply the following rules:

a. If the half-pronounced word at the start or end is not complete, cut this out.
Example: Actual speech is “I am an American”
Interception says: “I am an Ame” – cut out the incomplete word.
Transcribe as: “I am an”

b. If the half-pronounced word at the start or end is not complete, but forms a
different word, you can consider this.
Example: Actual speech is “I went to the supermarket”
Interception says: “I went to the super” – do not apply any cuts.
Transcribe as: “I went to the super”
c. If the half-pronounced word is in the middle part, whether or not it forms a
separate word, do not cut it out. Handle it according to the following situations:
i. If the half-pronounced word is not a separate word, you can disregard it.
Example: do you st still love me
Transcribe as: do you still love me

ii. If the half-pronounced word is caused by a stuttering sound but forms a
different word, you can transcribe it.
Example: I am look look looking for my book
Transcribe as: I am look look looking for my book

5. Modals and laughter

Note: Modals and laughter are different categories and are treated
differently.

A. Modals-
Located at Start/End = if it CAN BE COUNTED, we can only write “UP TO 2” modals.
Example: um um hello there ah uh
uh huh yes you're right hmm hmm

Located at Start/End = if it CANNOT BE COUNTED, CUT IT OUT

Located at the Middle = always write, do not ignore
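
A minimal sketch of the “UP TO 2” limit for countable modals at the start and end. The name `cap_edge_modals` and the small `MODALS` set are assumptions for this example (the set is only a subset of the project list). Keeping the modals closest to the speech is also an assumption, since the sheet does not say which ones to keep. Uncountable modals and modal-only audios are not handled here.

```python
# Subset of the project's modal list, for illustration only.
MODALS = {"ah", "er", "hmm", "huh", "oh", "uh", "um"}


def cap_edge_modals(words, limit=2):
    """Keep at most `limit` modals at the start and at the end of a word list.

    Modals in the middle are always kept, as the rule above requires.
    """
    start = 0
    while start < len(words) and words[start].lower() in MODALS:
        start += 1
    end = len(words)
    while end > start and words[end - 1].lower() in MODALS:
        end -= 1
    # Assumption: keep the modals nearest to the speech, drop the outer ones.
    leading = words[max(0, start - limit):start]
    trailing = words[end:end + limit]
    return leading + words[start:end] + trailing


print(" ".join(cap_edge_modals("um um um hello there ah uh uh".split())))
# um um hello there ah uh
```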

B. Laughter-

Located at Start/End = (same rules as modals)
Can be counted = UP TO 2 “ha ha”
Cannot be counted = cut it out

Located at the Middle = (different approach compared to other modals)

if it CAN BE COUNTED = write "ha" and put spaces in between each repetition
Example: 2x = ha ha
Example: 6x = ha ha ha ha ha ha

if it CANNOT BE COUNTED = we uniformly write "hahaha" (no space in between)
Example: 3x = that’s funny hahaha tell me more
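
The laughter rules above can be summarized in a short sketch. The function name `write_laughter` and its parameters are hypothetical: `position` is either "edge" (start/end) or "middle", and `count` is the number of laughs heard, or None when the laughter cannot be counted.

```python
def write_laughter(position, count=None):
    """Return the text to write for a laugh, or "" when it should be cut out."""
    if position not in ("edge", "middle"):
        raise ValueError("position must be 'edge' or 'middle'")
    if count is None:
        # Uncountable: "hahaha" in the middle, cut out at the start/end.
        return "hahaha" if position == "middle" else ""
    if count < 1:
        raise ValueError("count must be at least 1")
    if position == "edge":
        # Countable at the start/end: write up to 2 only.
        count = min(count, 2)
    return " ".join(["ha"] * count)


print(write_laughter("middle", 6))     # ha ha ha ha ha ha
print(write_laughter("middle", None))  # hahaha
```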

LIST OF MODALS:
We have to use these modals uniformly. Using a different spelling for a sound that
corresponds to one of the following is not allowed. If you encounter a modal sound
that doesn’t match any modal on the list, confirm it with the client first and wait
for feedback on how to write it. Expect this list to be updated from time to time.

Modal particles: ah, ahah, bah, dah, eww, yeah, oh, hey, hi, ho, oops, shh, wow, uh-huh, wah, yah, yep, nope

Filler modals: er, hmm, huh, uh, um

6. 2-word rule: an audio should have at least 2 words in it. Otherwise, discard!

(Expect this rule to be updated soon. We are waiting for the client’s confirmation on
whether modals or spoken letters also add to the word count.)
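
A minimal sketch of the 2-word check, assuming every space-separated token counts as one word. Whether modals or spoken letters count is still pending with the client, so this sketch does not try to exclude them. The name `passes_two_word_rule` is made up for this example.

```python
def passes_two_word_rule(transcript):
    """True when the transcript has at least 2 space-separated words.

    Assumption: every token counts, including modals and spoken letters.
    """
    return len(transcript.split()) >= 2


print(passes_two_word_rule("hello"))        # False
print(passes_two_word_rule("hello there"))  # True
```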

7. Transcribe English words only. If there are non-English words involved, cut them out.

Example: “if there are non-english words it yīnggāi jiǎn diào”

Transcribe as: “if there are non-english words it”
(cut out the remaining part)

However, there are cases where a word may sound foreign but can be considered
a proper noun. You may write it and apply capitalization.

Example: I studied at jawaharlal state university

Transcribe as: “I studied at Jawaharlal State University”

8. When there are homophones (words that sound the same but have different meanings), and you
are unsure which of the words to use, apply the following rules:
i. Listen only to the default pre-cut part. Apply the word that fits the
context or idea. No need to use the text outside the pre-cut part as
reference.
ii. When undecided between words such as “hole” and other words that
sound like /həʊl/, and the interception says “the whole town loves him”,
you can use “whole”, which fits the context of the sentence.
iii. If there are multiple homophones that can fit the context, you can use
any of them. Example: “where is my dear/deer”. You can transcribe as
below:
Transcription 1: where is my dear
Transcription 2: where is my deer
Both fit the context, so you can choose either of them.

9. You can use common informal terms, such as ‘wanna, tryna, gonna, etc.’
Want to = want to

10. Verbs ending in -ing

- Since this is a speech project, do not replace the final letter g with an
apostrophe, as we did in the Lyric project. We apply this rule
regardless of the sound.
Proper writing: eating kicking sitting
Incorrect: eatin’ kickin’ sittin’
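
One possible way to catch the apostrophe form is a simple pattern check, sketched below. The name `restore_ing` and the pattern are assumptions for this example. The pattern will also touch any other word written with a final “in’”, so the result still needs a human check.

```python
import re

# A word ending in "in" followed by an apostrophe (straight or curly)
# that is not followed by another letter, e.g. eatin' or kickin’.
DROPPED_G = re.compile(r"\b(\w+in)['’](?!\w)")


def restore_ing(text):
    """Rewrite eatin' style endings as eating."""
    return DROPPED_G.sub(r"\1g", text)


print(restore_ing("eatin’ kickin’ sittin’"))  # eating kicking sitting
```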

FAQS:

Q: How do we handle abbreviations, or spoken letters as they call them?
How do we handle spelled-out words?
A: For abbreviations or spoken letters, we have 2 approaches:
1. If it is a popular Abbreviation, apply Capitalization and do not add space in
between each letter
Example: USA , NBA , VIP
2. If it is not a popular Abbreviation, or just a random Letter sequence that you
are unfamiliar with, apply Capitalization and add space in between each
letter
Example: X R P , S S D , C I H

For words that are spelled out, apply Capitalization and space in between
Example: A P P L E , T A B L E , P L A N E T
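
The two approaches above can be sketched as follows. The `POPULAR` set only contains the examples from this sheet. Deciding what counts as a popular abbreviation is a judgment call, and the names `POPULAR` and `write_spoken_letters` are hypothetical.

```python
# Only the examples given in this sheet; not a complete list.
POPULAR = {"USA", "NBA", "VIP"}


def write_spoken_letters(letters, spelled_word=False):
    """Format spoken letters: 'USA' for popular abbreviations, 'X R P' otherwise.

    Set spelled_word=True when the speaker is spelling out a word,
    which is always written with spaces between the letters.
    """
    joined = "".join(letters).upper()
    if not spelled_word and joined in POPULAR:
        return joined
    return " ".join(joined)


print(write_spoken_letters("usa"))                       # USA
print(write_spoken_letters("xrp"))                       # X R P
print(write_spoken_letters("apple", spelled_word=True))  # A P P L E
```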

Q: Can we use punctuation?
A: We only apply hyphens and apostrophes as commonly used. No other punctuation
is allowed.
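
A quick self-check for this rule could look like the sketch below. It accepts only letters, spaces, apostrophes and hyphens, so it also flags digits and symbols that should have been written in words. The name `has_only_allowed_characters` is an assumption for this example, and the check covers plain English letters only.

```python
import re

# Letters, straight or curly apostrophes, hyphens and spaces only.
ALLOWED = re.compile(r"[A-Za-z'’\- ]*")


def has_only_allowed_characters(text):
    """True when the text has no punctuation other than apostrophes and hyphens."""
    return ALLOWED.fullmatch(text) is not None


print(has_only_allowed_characters("don't stop twenty-one"))  # True
print(has_only_allowed_characters("hello, world"))           # False
```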

Q: Are we allowed to have spaces at the beginning or end? How about double spaces in
between words?
A: Both can be tolerated.

Q: If there’s a small part in the middle with overlapping speech where both voices have the
same volume, how do we handle this?
A: No need to cut if both speakers say the same thing. Cut if they say different words.

Q: Can we transcribe the speaker when he/she is singing?
A: No, do not transcribe.

Q: There’s a long duration of uncountable laughter in the middle part. How do we
handle it?
A: We uniformly write “hahaha” if it is uncountable in the middle part.

Q: Do we have to capitalize letters?
A: Only apply capitalization for spoken letters and proper nouns.

Q: The speaker is imitating an animal sound. Should this be transcribed?
A: No, animal sounds imitated by a human speaker should NOT be transcribed.

Q: We can only hear modal words in the intercepted audio. Should we transcribe?
A: No. We should discard.

Q: There’s a long silence, plain environmental sound or pure music heard at the
beginning, middle or at the end. Should we cut?
A: No need to cut. There are no restrictions on the duration.
