
CONFIDENTIAL – YouTube Video Translation and Commentary of Offensive Content

Task Description and Tools


Main goal: Translate YouTube videos from your native language into English. Also, please comment on any sections whose content (including slang) might be
hateful, abusive, racist, or inciting violence.

The translation will not be displayed on YouTube. The purpose of this work is to give the Google Analyst an English translation, so they can understand the video
and make decisions at Google about the potential policy violations.

The client would like to understand metaphors, slurs, or slang that might be regarded as hateful, abusive, racist, or violent. As a result, when translating, take into
consideration your understanding of the culture and context. It is also important for the client to understand any acronyms used. Please explain what acronyms
mean and offer the equivalent in English if it exists.

Tools: We offer you a choice between (1) a speech-to-text tool, which means dictating your translation, and (2) a manual process of typing your translation
in a .docx file.

Time for this task


Please aim to complete the work as quickly as you can within the TAT specified in the HO, with the best level of quality you can provide, and report back the time
spent.

Task instructions:
(1) Speech to Text workflow

Please watch a short instructional video available on Drive – the video lasts only 2.5 minutes.
Open the link for the YouTube video in your browser. We recommend using a speech-to-text tool such as https://speechnotes.co/.
However, feel free to work with your preferred tool as long as it allows you to save the script as .txt.

Play the video on YouTube, pausing every few sentences to record your translation in your STT tool using your computer microphone, so that the text is captured
by the tool.

Please say the phrase “Timestamp”, say the minute and second within the video, and proceed to translate.

Timestamps are required for each translation as defined at a minimum below:


- every 10 seconds for music videos if there are actual words spoken/sung
- every minute for monologues/dialogues if there are actual words spoken/sung
- no timestamps when nothing is being spoken
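The minimum timestamp frequency above can be self-checked with a short script. The following is only a sketch: the function name, the content-type labels, and the assumption that timestamps are already available as seconds from the start of the video are illustrative and not part of the official tooling. A reported gap is not necessarily an error, since no timestamps are needed while nothing is being spoken.

```python
# Hypothetical helper: flag gaps between consecutive timestamps that exceed
# the minimum frequency (10 s for music videos, 60 s for monologues/dialogues).
MAX_GAP_SECONDS = {"music": 10, "dialogue": 60}


def find_long_gaps(stamps, content_type):
    """Return (start, end) pairs of consecutive timestamps that are too far apart.

    stamps: timestamps in seconds from the start of the video, in ascending order.
    content_type: "music" or "dialogue" (labels assumed for this sketch).
    """
    limit = MAX_GAP_SECONDS[content_type]
    return [(a, b) for a, b in zip(stamps, stamps[1:]) if b - a > limit]


if __name__ == "__main__":
    # A music video with timestamps at 0:00, 0:10, 0:25 and 0:30.
    print(find_long_gaps([0, 10, 25, 30], "music"))
```

Each reported pair should be reviewed against the video: if words are spoken or sung in that interval, an extra timestamp is needed.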

If you identify vulgar language, metaphors, slurs, or slang that might be regarded as hateful, abusive, racist, or violent, please comment orally on the source
segment that is problematic and share an explanation of the reasons why the content might be offensive. For such purposes, please say the phrase “Interpreter
annotation” and share your input.

If any segment is inaudible in the video, please say the phrase “Timestamp”, say the time within the video, and then the word “inaudible”.

If 2 or 3 people are speaking at the same time, please translate what they all say, using one timestamp, e.g.
Timestamp X minutes X seconds - Speaker 1 - Speaker 2 - Speaker 3
You can use an em dash to indicate a speaker change. The order doesn’t really matter – the client needs to know what is said in the video.
Additionally, please insert an annotation that all speakers are speaking exactly at the same time.

Where no spoken dialogue is present but there is onscreen text, please translate the onscreen text. Please add a prefix [OST], and follow the standard
procedure.
Where both spoken dialogue and onscreen text are present, you should receive specific guidance on what to translate. If that is not provided, please flag it to
the PM team.

We are only expected to provide linguistic and cultural facts, no personal opinions.

Example 1:
Translator: Say “Timestamp one minute twenty seconds” [Translation here].

Example 2:
Translator: Say “Timestamp two minutes four seconds” [Translation here]. Say “Interpreter Annotation” [commentary here if acronym, hateful, abusive, racist, or
violent.]

What your output should sound like – example:


Timestamp zero minutes ten seconds. That girl looks so ghetto. Interpreter Annotation The word ghetto is slang for lower class, not having money or nice things.
Timestamp zero minutes twelve seconds. I don’t like her outfit.

Format:
- The term “Timestamp” should always be one word. We will use it as a placeholder for technical conversion into the final delivery format requested by the
customer.

- The format of the Timestamp should always read as minutes, seconds (in plural), as in “One minutes four seconds”, “Three minutes thirteen seconds” or “One
hour fifty three minutes ten seconds”, for instance.
- Correct formats of the timestamps are:
Timestamp two minutes four seconds – without any punctuation mark
Timestamp two minutes four seconds. – with a period
Timestamp two minutes four seconds: – with a colon
Timestamp two minutes and four seconds. – with “AND”
Timestamp 2 minutes 4 seconds. – with digits, not text
- The maximum chunk of the video between timestamps should be up to 1 minute.
- Your comments should always be preceded by the phrase “Interpreter Annotation”. We will use it as a placeholder for technical conversion into the final delivery
format requested by the customer.
- The text “Interpreter annotation” doesn't need to be followed by any punctuation mark.
- Each interpreter annotation should be followed by a new timestamp to make it clear where the annotation ends and the interpretation begins.
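To illustrate why the placeholders matter, here is a minimal sketch of how a dictated script could be split into segments using the “Timestamp” and “Interpreter Annotation” placeholders. It is not the actual conversion tool, whose behavior is not described here; the function names, the output fields, and the number-word coverage (values up to 59) are assumptions made for this example. It accepts the timestamp variants listed above: words or digits, an optional “and”, and an optional trailing period or colon.

```python
import re

# Number words needed for values 0-59 (coverage is an assumption of this sketch).
UNITS = {w: i for i, w in enumerate(
    "zero one two three four five six seven eight nine ten eleven twelve "
    "thirteen fourteen fifteen sixteen seventeen eighteen nineteen".split())}
TENS = {"twenty": 20, "thirty": 30, "forty": 40, "fifty": 50}


def words_to_int(text):
    """Convert '4', 'four', 'fifty three' or 'fifty-three' to an integer."""
    text = text.strip().lower().replace("-", " ")
    if text.isdigit():
        return int(text)
    total = 0
    for word in text.split():
        if word in UNITS:
            total += UNITS[word]
        elif word in TENS:
            total += TENS[word]
        else:
            raise ValueError(f"unrecognized number word: {word!r}")
    return total


NUM = r"(\d+|[a-z]+(?:[ -][a-z]+)?)"
STAMP = re.compile(
    r"Timestamp\s+(?:" + NUM + r"\s+hours?\s+)?"
    + NUM + r"\s+minutes?\s+(?:and\s+)?"
    + NUM + r"\s+seconds?[.:]?",
    re.IGNORECASE)
ANNOTATION = re.compile(r"Interpreter\s+Annotation[.:]?", re.IGNORECASE)


def parse_transcript(text):
    """Split a dictated script into segments, one per timestamp."""
    matches = list(STAMP.finditer(text))
    segments = []
    for i, m in enumerate(matches):
        # The segment body runs until the next timestamp (or the end of the text).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        body = text[m.end():end].strip()
        parts = ANNOTATION.split(body, maxsplit=1)
        hours = words_to_int(m.group(1)) if m.group(1) else 0
        seconds = hours * 3600 + words_to_int(m.group(2)) * 60 + words_to_int(m.group(3))
        segments.append({
            "seconds": seconds,
            "translation": parts[0].strip(),
            "annotation": parts[1].strip() if len(parts) > 1 else "",
        })
    return segments


if __name__ == "__main__":
    sample = ("Timestamp zero minutes ten seconds. That girl looks so ghetto. "
              "Interpreter Annotation The word ghetto is slang for lower class, "
              "not having money or nice things. "
              "Timestamp zero minutes twelve seconds. I don't like her outfit.")
    for segment in parse_transcript(sample):
        print(segment)
```

In this sketch a misspelled placeholder is simply not recognized, so its text would be merged into the preceding segment. This is the kind of conversion error the format rules are meant to prevent.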

(Important) Please review your text output before delivering your file
Please review/QA the output of your scripted translation and make sure that:
- The term “Timestamp” is always one word and the format (minutes, seconds) is correct.
- Your comments are always preceded by the phrase “Interpreter Annotation”. If you happen to miss saying that phrase during your translation, please make
sure you fix it in the text file by adding it.
- There are no spelling errors in placeholders. Note that any misspelling in the phrase "Timestamp" and "Interpreter annotation" is a critical issue that will lead
to incorrect conversion of the script.
- The final text is readable and free from major linguistic errors.
- You have commented on all instances of vulgar / abusive / hateful / racist language / slang / acronyms.
- The slower and clearer you speak into the tool, the better text conversion output you will receive.
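Part of this review can be assisted by a script. The sketch below looks for likely misspellings of the two placeholders in the saved .txt file, using Python's standard difflib module for approximate matching. The function name and the 0.85 similarity cutoff are assumptions chosen for illustration, and the check can produce false positives (for example, the plural “timestamps” or the verb “interpret” in the translated text), so every hit still needs a manual look.

```python
import difflib
import re

# Words that make up the two placeholders, in lowercase.
TARGETS = ("timestamp", "interpreter", "annotation")


def find_suspect_placeholders(text, cutoff=0.85):
    """Return (position, text) pairs that look like misspelled placeholders."""
    issues = []
    # "Timestamp" written as two words.
    for m in re.finditer(r"\btime\s+stamp\b", text, re.IGNORECASE):
        issues.append((m.start(), m.group(0)))
    # Words that are very similar to a placeholder word but not identical.
    for m in re.finditer(r"[A-Za-z]+", text):
        word = m.group(0).lower()
        for target in TARGETS:
            if word == target:
                continue
            if difflib.SequenceMatcher(None, word, target).ratio() >= cutoff:
                issues.append((m.start(), m.group(0)))
    return sorted(issues)


if __name__ == "__main__":
    script = ("Time stamp one minutes four seconds. Hello. "
              "Interpreter anotation slang. "
              "Timestamp 2 minutes 4 seconds. Bye.")
    for position, found in find_suspect_placeholders(script):
        print(position, found)
```

Any reported item should be corrected in the text file to the exact placeholder spelling before delivery.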

Deliverables
✔ Recorded complete session (*.txt file) using any Speech to Text tool (https://speechnotes.co/ or other of your preference).
Please use the video title as the .txt filename.
✔ A checklist in Excel format filled with your name (or agency name), language and “Signoff” column.

(2) Manual process
Please download the file NameofVideo_SourceLanguage.docx from Drive.

Open the link for the YouTube video in your browser. Play the video on YouTube, pausing every few sentences to type your translation in the .docx file.

Timestamps are required for each translation as defined at a minimum below:


- every 10 seconds for music videos if there are actual words spoken/sung
- every minute for monologues/dialogues if there are actual words spoken/sung
- no timestamps when nothing is being spoken

If you identify vulgar language, metaphors, slurs, or slang that might be regarded as hateful, abusive, racist, or violent, please comment on the source segment
that is problematic and explain why the content might be offensive. Please enter your input in the NOTES column of the DOCX file. Make sure to
insert only one annotation per line.

If any segment is inaudible in the video, please insert the relevant timestamp (minutes and seconds) and then the word “inaudible” in the TRANSLATION column.

If 2 or 3 people are speaking at the same time, please translate what they all say using one timestamp, e.g.
Timestamp X minutes X seconds - Speaker 1 - Speaker 2 - Speaker 3
You can use an em dash to indicate a speaker change. The order doesn’t really matter – the client needs to know what is said in the video.
Additionally, please insert an annotation that all speakers are speaking exactly at the same time.

Where no spoken dialogue is present but there is onscreen text, please translate the onscreen text. Please add a prefix [OST], and follow the standard
procedure.
Where both spoken dialogue and onscreen text are present, you should receive specific guidance on what to translate. If that is not provided, please flag it to
the PM team.

We are only expected to provide linguistic and cultural facts, no personal opinions.

What your output should look like – example:

TIMESTAMP (00:00 mm:ss)   TRANSLATION                  NOTES
0:05–0:11                 That girl looks so ghetto.   The word ghetto is slang for lower class, not having money or nice things
0:11–0:15                 I don’t like her outfit.

(Important) Please review your text output before delivering your file
Please review/QA the output of your scripted translation and make sure that:
- The final text is readable and free from major linguistic errors.
- You have commented on all instances of vulgar / abusive / hateful / racist language / slang / acronyms.

Deliverables
✔ DOCX file with your translation and annotations.
Please include the video title and the target language in the DOCX filename.
✔ A checklist in Excel format filled with your name (or agency name), language and “Signoff” column.

References
✔ Slang: Slang is vocabulary that is used between people who belong to the same social group and who know each other well. Slang is very informal
language. It can offend people if it is used about other people or outside a group of people who know each other well.

✔ Slurs: Words can have multiple definitions, which means that some words can be either derogatory or not derogatory, depending on who is using
them or when. Examples: monkey, ghetto, chink, etc.

✔ General information about YouTube requirements can be found in the following client links: YouTube Policies & Guidelines - How YouTube Works/YouTube
Community Guidelines & Policies - How YouTube Works

Thank you for your work on this special task!
