1. The vad line is the dividing line. (Dividing lines are added to remove
unnecessary and invalid content and to mark the valid vocal parts.)
The white line in the interface is the vad line, which can be moved, added and
deleted. You need to adjust each vad line to a reasonable position and modify the
text in the content layer. The audio within the boundaries must closely match the
text.
a. Add a vad line: Click to place the cursor at the appropriate position; when the
red dotted line appears, click [+vad].
b. Add two vad lines at the same time: If vad lines are needed at both A and B,
drag the cursor to select the part between them, then click [+vad].
c. Delete a vad line: Drag the cursor to select the part that needs to be
deleted, then click [-vad].
2. Labeling Principles
a. You need to modify the text strictly, adjust and add vad lines, and add the
corresponding labels according to the audio content and this guideline, so that the
words and sounds in each vad segment are consistent. The original transcription
shown by the machine is for reference only and needs your modification.
b. A vad segment can contain only one person's speech. Content spoken by
different people needs to be divided.
Dividing Principles:
You should try to divide at punctuation to preserve the relative integrity of the
sentence.
a. Character limit (a maximum of 120 characters)
b. Vad duration (a single vad must be within 10 seconds)
c. Different speakers
d. Parts that cannot be marked
Invalid duration means a part that cannot be marked. You need to add vad lines
to determine the range of the area and choose the corresponding label for it.
When there is more than 1 s of noise, silence, inaudible speech, multi-person talk
and so on, it also needs to be cut out and labeled accordingly.
The same goes for applause, laughter, non-Malay language, advertisements, songs with
voices that are clearly repeated in films or television, pure music without voices,
etc.
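The dividing limits above can be checked mechanically. The following is a minimal Python sketch, not part of the labeling platform: the `VadSegment` type, its fields and both function names are hypothetical and exist only for illustration. It assumes the 120-character limit counts every character of the text, including spaces and punctuation.

```python
from __future__ import annotations

from dataclasses import dataclass

MAX_CHARS = 120             # character limit per vad segment
MAX_SECONDS = 10.0          # duration limit per vad segment
MIN_INVALID_SECONDS = 1.0   # non-speech longer than this gets its own segment


@dataclass
class VadSegment:           # hypothetical container, not a platform type
    start: float            # segment start, in seconds
    end: float              # segment end, in seconds
    text: str               # transcription in the content layer
    speakers: int = 1       # number of distinct speakers heard


def check_segment(seg: VadSegment) -> list[str]:
    """Return the dividing rules that this segment breaks (empty if none)."""
    problems = []
    if len(seg.text) > MAX_CHARS:
        problems.append("text exceeds 120 characters")
    if seg.end - seg.start > MAX_SECONDS:
        problems.append("duration exceeds 10 seconds")
    if seg.speakers > 1:
        problems.append("more than one speaker")
    return problems


def needs_own_segment(invalid_seconds: float) -> bool:
    """Noise, silence, etc. lasting more than 1 s is cut out and labeled."""
    return invalid_seconds > MIN_INVALID_SECONDS
```

A segment that returns a non-empty list from `check_segment` would need at least one more vad line, ideally placed at punctuation.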
Capitalize the first letter of a sentence, proper noun, person's name, place name,
etc.
b. Punctuation:
You can only modify punctuation according to the meaning of the sentence, not
according to the paragraph. (For example, if a sentence is too long and is divided
into two lines, do not capitalize the first letter of the second line, and either
write the punctuation at the end of the first line or use no punctuation at all.)
number, identity card number, year, month, etc.). For example, 911 should be
b. Except for / and %, most special symbols can be transcribed in the original
language. For example, 3/5 should be written as tiga perlima and 90% should be
written as sembilan puluh peratus or sembilan puluh persen according to the actual
4. Pronunciation Repeat
a. You need to transcribe the entire word as many times as it is repeated,
according to the actual pronunciation.
b. If the sound is not the whole word but a single syllable of the word, you don’t
5. Modal Particle
a. You need to transcribe complete and existing modal words, interjections and
6. URL
According to the pronunciation, you transcribe Malay if the audio is in Malay and
actual content of the audio. For example, ni does not need to be changed to ini,
8. Bad Data
If the whole audio contains only music, quiet noise, pure noise, laughter,
very quiet inaudible sound, foreign language dubbing, non-Malay language, etc.,
then the whole data is invalid and you need to directly click the button "mark as
bad data".
9. Voice Overlap
voices should not be transcribed in one line. If one voice comes before the
other, you need to divide them into two lines according to the order of speech.
b. You need to ignore the secondary speaker and transcribe the main speaker when
the voices of the main speaker and the secondary speaker (less speech, little
impact) overlap and cannot be segmented. Besides, you need to mark <overlap> when
the voices of the main speaker and the secondary speaker (much speech, much
1. Invalid Data
If more than 80% of the whole audio cannot be marked normally (for example sil,
noise, non-Malay, or human voice that you cannot understand), then the whole data
is invalid, and you can directly click the button "mark as Invalid Data".
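The 80% rule is a ratio of durations, which the sketch below makes explicit. It is an illustration only: the function name and the `(duration, markable)` pair representation are assumptions, and deciding whether a stretch of audio is markable remains a human judgment.

```python
from __future__ import annotations

INVALID_THRESHOLD = 0.8  # "more than 80% of the whole audio"


def is_invalid_data(parts: list[tuple[float, bool]]) -> bool:
    """Decide whether a whole audio file counts as invalid data.

    Each item in `parts` is (duration_in_seconds, markable), where
    `markable` is False for sil, noise, non-Malay speech, or speech
    that cannot be understood.
    """
    total = sum(duration for duration, _ in parts)
    if total <= 0:
        raise ValueError("audio has no duration")
    unmarkable = sum(duration for duration, ok in parts if not ok)
    # Strictly greater than: exactly 80% does not qualify.
    return unmarkable / total > INVALID_THRESHOLD
```

For example, a 10-second file with 8.5 seconds of noise would be invalid, while one with exactly 8 seconds of noise would still need normal segmentation and labeling.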
Count labels for invalid duration: the parts that cannot be transcribed need to be
segmented and labeled; the text layer then automatically jumps to the label, and
the text is not transcribed:
a:
1) If there is background noise, music, etc. in the section and it does not affect
2) Try to keep the content of each VAD paragraph relatively complete when you
divide.
3) For the first 7 tags above, when the tag is finished, it means that this
paragraph has no valid Malay voice, so it is invalid, and there is no need to make
b:
successively, the whole part of the audio can be divided and marked as <DEAF>
If noise, sil, etc. occur at the same time, the whole part can be divided and
marked as <NOISE>;
If noise, deaf, sil and so on appear at the same time, the whole part can be
divided and marked as <DEAF>.
2) <BGM>:
When there is background music with lyrics but you can still hear the speaker
clearly, you need to transcribe the speaker's speech and add the tag <BGM> to
this section.
Transcription method: transcribe the speaker's speech as normal and select the
<BGM> label for this section.
3) <CONTINUE>:
If more than one second of noise splits a full sentence into segments, you
should add a <CONTINUE> tag to the next section until the sentence attached to the
<CONTINUE> tag is complete.
Transcription method: If a complete utterance is interrupted by noise for more
than 1 second, mark the noise part with the corresponding <NOISE> label, and mark
the next section with the <CONTINUE> label. The two sections before and after the
noise part together form one complete utterance.
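The <CONTINUE> rule can be read as a small state machine over consecutive segments. The sketch below is one possible interpretation, not the platform's behavior: the `Segment` type and the `ends_sentence` flag are hypothetical, and whether a segment finishes its sentence is assumed to be decided by the annotator.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Segment:                      # hypothetical container for illustration
    duration: float                 # length in seconds
    text: str = ""                  # transcription (empty for noise)
    labels: list[str] = field(default_factory=list)
    ends_sentence: bool = True      # assumed flag: speech completes its sentence


def tag_continue(segments: list[Segment]) -> list[Segment]:
    """Add CONTINUE to speech that resumes a sentence after > 1 s of noise."""
    open_sentence = False   # the last speech segment left its sentence unfinished
    interrupted = False     # that unfinished sentence was cut by long noise
    for seg in segments:
        if "NOISE" in seg.labels:
            if open_sentence and seg.duration > 1.0:
                interrupted = True
            continue
        if interrupted:
            seg.labels.append("CONTINUE")
        if seg.ends_sentence:
            # The sentence is complete, so later segments start fresh.
            open_sentence = False
            interrupted = False
        else:
            open_sentence = True
    return segments
```

Noise of one second or less does not trigger the tag in this sketch, since such a short gap would not be cut out as its own segment.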
Here you should combine 22/23/24/25 into one paragraph and tag it with <NOISE>.
6. Labels should be selected in role layer 1; please do not write labels manually.
The wrong sample:
The right sample:
Ⅵ. Good Sample
The following figure provides examples of specific segmentation, labeling and content layer
labeling for your reference:
When hovering the mouse, 4 types are displayed: all, good data, bad data and
all data unmarked. Click on a type, and the corresponding marked data will be
displayed on the page.
labeled as bad data: Click [Mark as Bad Data] to mark this piece of data as bad
data.
Up/down: Click [Previous] and [Next] to enter the previous and next marked files.
recycle time: The recovery time is 120h, and the recovery time of the return
mission is 120h.
merge vad segment: Select the areas on the left and right sides of the boundary
line, then right-click to merge the segments, or click "-vad", or use the keyboard
shortcut C.
add vad segment: Place the indicator line where you need to add a segment and
click to add a vad segment (a vad segment will be added after the indicator line),
or use the keyboard shortcut S.
font size adjustment(A): After clicking, you can select the appropriate font size.
play in a loop/continuously:
① By default, the platform will pause after playing a vad segment;
② Loop playback can be set to loop once, twice or three times, and the playback
will stop after the current vad loop is completed; the selected loop times apply
to the entire audio;
③ You can also choose to play continuously (-). After one vad is played, the next
vad segment will be played automatically until the full length of the audio is
played.
content layer: Annotate according to the audio and modify the original text in
the content layer; line breaks are not supported.
audioscale: Click the plus and minus signs to zoom the audio field of view; the
maximum ratio is 50 times.
role layer: Role attributes, gender attributes, or both are used to add relevant
tags, and differ between tasks.