EXAMEN-NUEVAS-TECNOLOGIAS.pdf

Doryyyto

Nuevas Tecnologías Aplicadas a la Lengua Inglesa

3º Grado en Estudios Ingleses

Facultad de Filología
Universidad de Sevilla

All rights reserved. Economic exploitation or transformation of this work is not permitted. Printing it in its entirety is permitted.
NEW TECHNOLOGIES EXAM, JOSE GABRIEL AMORES CARREDANO

QUESTION 1
A corpus is a machine-readable collection of spoken or written texts to be used for ...
- Linguistic analysis
- Descriptive analysis
- Scientific analysis

QUESTION 2
ASCII is
- It is a special software that allows users to introduce different linguistic tags in
corpora texts.
- The acronym of an important corpus of spoken American language.
- An international standard data-transmission code that allows us to use the
different keyboard symbols in computers and other devices.

QUESTION 3
According to the BNCweb Simple Query Syntax, the symbol + stands for:
- Zero or more characters.
- A single arbitrary character.
- All of them are incorrect.
- One or more characters.

QUESTION 4
All the individual, different words in the corpus are
- A token
- A type
- A tag

QUESTION 5
Balance in corpus refers to...
- The range of types of text in the corpus.
- The number of texts in the corpus.
- The range of languages used in the texts in the corpus.

QUESTION 6
Choose the correct option.
- If Mutual Information is approximately zero, then the two words tend to occur
dependently.
- If Mutual Information is positive and reasonably high (usually 2 or higher),
then the two words are strongly collocated.
- If Mutual Information is negative and reasonably high (usually -2 or higher),
then the two words are strongly collocated.
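
To make the behaviour of a Mutual Information score concrete, here is a minimal sketch that computes pointwise MI as the base-2 logarithm of the observed co-occurrence probability over the probability expected if the two words were independent. The helper name `mutual_information` and all the counts are invented for illustration, and corpus tools may use a different variant of the formula.

```python
import math

def mutual_information(pair_count, count_x, count_y, corpus_size):
    """Pointwise MI: log2( P(x,y) / (P(x) * P(y)) ). Hypothetical helper."""
    # Algebraically the same ratio, written with integer counts only.
    ratio = (pair_count * corpus_size) / (count_x * count_y)
    return math.log2(ratio)

# Invented counts for an imaginary corpus of 1,000,000 tokens.
independent = mutual_information(1, 1000, 1000, 1_000_000)
collocated = mutual_information(64, 1000, 1000, 1_000_000)

print(independent)  # about 0: the pair occurs as often as chance predicts
print(collocated)   # about 6: the pair occurs far more often than chance predicts
```

With these made-up numbers, a score near zero corresponds to words that co-occur about as often as independence would predict, while a clearly positive score signals an association.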

QUESTION 7
Choose the correct option.
- We know that language is regular (non-random) and rule-based.
- We know that language is irregular (random) and non rule-based.
- We know that language is regular (non-random) but not rule-based.

QUESTION 8
Collocations ...
- Help to discover patterns of variation across registers
- All the options are correct
- Help to distinguish between near-synonyms

QUESTION 9
Considering the following expression: [aeiou], you would be looking for...
- Any vowel, in lower or capital case.
- All the characters in sequence, surrounded by square brackets.
- Any character within the square brackets.
- Any character except for those within the square brackets.
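
A character class such as [aeiou] can be tried out in any regular-expression engine. The sketch below uses Python's `re` module, which is an assumption since the exam does not name an engine, and the sample text is invented.

```python
import re

text = "A corpus is Often Used"

# [aeiou] matches exactly one character out of the listed set.
lowercase_vowels = re.findall(r"[aeiou]", text)

# Capital vowels are only found when case is explicitly ignored.
any_case_vowels = re.findall(r"[aeiou]", text, flags=re.IGNORECASE)

print(lowercase_vowels)  # ['o', 'u', 'i', 'e', 'e']
print(any_case_vowels)   # ['A', 'o', 'u', 'i', 'O', 'e', 'U', 'e']
```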

QUESTION 10
Corpus annotation expresses
- Relatively objective factual information
- Absolutely objective factual information
- More subjective, interpretative information

QUESTION 11
Do parsers use fewer categories than taggers?
- Yes, parsers use 12-20 categories compared to 60-100 by taggers.
- No, parsers use 60-100 categories compared to 12-20 by taggers.
- Both of them use the same number of categories.

QUESTION 12
For which sequence of words can you not run a query in the BNC?
- An idiomatic expression.
- A grammatical construction.
- You can do a query for both.

QUESTION 13
HTML stands for...
- Hypertext Marking Language
- Hypertag Markup Language
- Hypertext Markup Language

QUESTION 14
How many bits is 1 megabyte?
- 10000 bits
- 1000 bits
- 100 bits
- 1000000 bits

QUESTION 15
How many tokens does this sentence have? “I want to go the cinema to watch the new
movie.”
- 10 tokens
- 12 tokens
- 9 tokens
- 11 tokens
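
Token counting can be checked mechanically. The sketch below counts the sentence exactly as it is printed in the question and treats every run of word characters as one token, which is a simplifying assumption; real tokenisers may treat punctuation and clitics differently.

```python
import re

# The sentence is copied exactly as printed in the question.
sentence = "I want to go the cinema to watch the new movie."

# Tokens: every running word, repeats included (punctuation is ignored here).
tokens = re.findall(r"\w+", sentence.lower())

# Types: the distinct word forms among those tokens.
types = set(tokens)

print(len(tokens), len(types))  # 11 9
```

The gap between the two numbers comes from "to" and "the", which each occur twice but count as a single type.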

QUESTION 16
How many versions does Unicode have?
- UTF
- UTF-4, UTF-32, UTF-8, and UTF-16
- UTF-32, UTF-16, and UTF-8
- UTF-32, UTF-16
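
The difference between Unicode encoding forms shows up in the number of bytes used for the same character. The sketch below is only an illustration with an arbitrarily chosen character; the little-endian codec names are used so that Python does not prepend a byte-order mark.

```python
char = "ñ"  # U+00F1, chosen arbitrarily for the example

sizes = {}
for encoding in ("utf-8", "utf-16-le", "utf-32-le"):
    encoded = char.encode(encoding)
    sizes[encoding] = len(encoded)
    # Show the codec, the byte count and the raw bytes in hexadecimal.
    print(encoding, len(encoded), encoded.hex())
```

For this character the three encodings use 2, 2 and 4 bytes respectively; other characters can give different sizes in UTF-8 and UTF-16.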

QUESTION 17
How many words does the British National Corpus (BNC) contain?
- Around 80 million words.
- Around 10 million words.
- Around 100 million words.
- Around 250 million words.

QUESTION 18
If a phrase is a collocation, we cannot replace a word in the phrase with a near-synonym
and still have the same overall meaning.
- Only happens with certain collocations.
- False.
- True.

QUESTION 19
If one wants to find all but the -ing forms of the verb want, which one of these options
would be a successful regular expression?
- \want(?ing)\w*\b
- \bwant(?!ing)\w*\b
- \bwant(ing)\w*\b
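
Excluding one ending while keeping the others is usually done with a negative lookahead, written (?!...). The sketch below tries this in Python's `re` module on an invented sample string; the engine and the sample are assumptions.

```python
import re

# (?!ing) is a negative lookahead: the match fails if "ing" comes right after "want".
pattern = re.compile(r"\bwant(?!ing)\w*\b")

sample = "want wants wanted wanting unwanted"
matches = pattern.findall(sample)

print(matches)  # ['want', 'wants', 'wanted']
```

"wanting" is rejected by the lookahead, and "unwanted" is skipped because \b requires a word boundary before "want".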

QUESTION 20
If you use the following regular expression: s.ng, which of the following words could be
found?
- Assign
- Song
- Sting
- seeing
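
The dot stands for exactly one arbitrary character, so s.ng describes a four-character sequence. The sketch below tests the four listed words in Python's `re` module, once with case sensitivity and once without, since the question does not say whether the search tool ignores case.

```python
import re

words = ["Assign", "Song", "Sting", "seeing"]

# Default matching is case-sensitive: a capital "S" is not the same as "s".
case_sensitive = [w for w in words if re.search(r"s.ng", w)]

# Many corpus tools ignore case by default; that is assumed in this second run.
case_insensitive = [w for w in words if re.search(r"s.ng", w, flags=re.IGNORECASE)]

print(case_sensitive)    # []
print(case_insensitive)  # ['Song']
```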

QUESTION 21
In WebQuests,
- Internet searches are pre-selected by the instructor
- Students do their own search over the Internet
- Internet searches are selected by both the instructor and the students

QUESTION 22
In a regular expression, what does the "?" suffix mean?
- The preceding character is obligatory.
- The preceding character may appear with arbitrary repetition.
- The preceding character (or group of characters) is optional.
- The preceding character is optional
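
The effect of the "?" suffix can be seen on a single character and on a parenthesised group. This is a small sketch in Python's `re` module with invented sample strings.

```python
import re

# "u?" makes the single preceding character optional.
single = re.compile(r"\bcolou?r\b")
found = single.findall("color colour colouur")

# "(ing)?" makes the whole preceding group optional.
group = re.compile(r"\bwalk(ing)?\b")
group_found = [m.group(0) for m in group.finditer("walk walking walked")]

print(found)        # ['color', 'colour']
print(group_found)  # ['walk', 'walking']
```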

QUESTION 23
In binary theory, a single 0 or 1 is called a...
- Digit
- Bit
- Byte

QUESTION 24
In subtitles, it is OK to split determiners from nouns.
- True
- False

QUESTION 25
In subtitles, it is OK to split a preposition from the rest of the noun phrase.
- True
- False

QUESTION 26
In subtitles, you are only allowed a maximum of ___ characters
- 36 to 40
- 50
- 25
- 45

QUESTION 27
In the BNC, the sources for reception were ...
- All of them
- None of them.
- Library lending statistics and bestseller lists and prizewinners.
- Lists of books in print and library lending statistics.

QUESTION 28
It is OK to use Machine Translation in subtitling.
- True
- False

QUESTION 29
Mark the correct statement.
- Some theories are compatible with corpus-based research, such as Brunner's Theory
- No theories of language are compatible with corpus-based research.
- None of the above
- Any theory of language is compatible with corpus-based research.

QUESTION 30
Mark the statement which is correct.
- A diachronic corpus may contain old-fashioned or unfamiliar words and
spellings
- A diachronic corpus cannot contain characters that no longer exist.
- A synchronic corpus is used for historical and comparative language
studies.
- All the options are correct.

QUESTION 31
Morpho-syntactic annotation refers to
- Named entities, phrasal chunking, full syntactic analysis
- Part-of-speech (POS) tagging
- Inflection, derivation, compounding

QUESTION 32
Most corpus analysis programmes present the results as:
- A concordance, a list of instances, matching the expression you search for.
- Only 50 examples matching the expression you search for.
- A browse with a list of random words.

QUESTION 33
Most corpus analysis programs present the results as:
- A concordance (KWIC)
- Raw Frequency order
- A graphic
- Normalized frequency

QUESTION 34
Nowadays, a corpus is ...
- Always printed.
- Printed on demand.
- Never printed.
- Very rarely printed.

QUESTION 35
Regarding character encoding in corpora,
- A corpus cannot be stored in plain text format
- There is no problem in encoding texts with Unicode.
- Old corpora stored in plain text are not readable any more after Unicode has
been established.
- Some languages may require a different character set

QUESTION 36
Regarding ranks and frequencies, choose the correct statement
- Among bottom ranks, frequency drops dramatically
- Among top ranks, frequency drops gradually
- Among top ranks, frequency drops very dramatically

QUESTION 37
Regular expressions are a popular formalism for:
- Text transformation (search & replace)
- String validation (e.g. user input)
- Text search (finding strings in long text)
- All the answers are correct

QUESTION 38
Semantic prosody
- Can be collocational or non-collocational
- Can be non-collocational
- Can only be collocational

QUESTION 39
Semantic prosody expresses the speaker's or writer's attitude or evaluation. It is ...
- Neutral with respect to its positive or negative meaning.
- Typically negative, with relatively few of them bearing an affective positive
meaning.
- Typically positive, with relatively few of them bearing an affective negative
meaning.

QUESTION 40
Subtitles should be semantically self-contained.
- True
- False

QUESTION 41
The British National Corpus (BNC) contains 100 million words of English,
- 50% written, 50% spoken
- 90% written, 10% spoken
- 75% written, 25% spoken

QUESTION 42
The British National Corpus contains ...
- Only written texts
- Only spoken texts
- 90% written, 10% spoken
- 90% spoken, 10% written

QUESTION 43
The most common concordance format is the...
- KIWC concordance.
- KWIC concordance.
- KWCI concordance
- WICK concordance.

QUESTION 44
The number of tokens in the corpus is an estimate of overall
- Corpus size
- Frequency
- Vocabulary size

QUESTION 45
What are hapax legomena in a corpus?
- Words that appear only once in the corpus.
- The most frequent words in a corpus.
- The least frequent words in a corpus.
- Greek words in a corpus.

QUESTION 46
What are word clusters also called?
- Lexical bundle
- Both answers are correct
- N-gram

QUESTION 47
What does ICE stand for?
- International Corpus Exercise
- International Corpus of English
- Irish Corpus of English
- International Corpus of Europe

QUESTION 48
What does KWIC stand for?
- Key Word In Context
- Key Words In the Corpus
- Key Words In Corpora
- Kernel Words In Context

QUESTION 49
What does POS tagging aim to identify?
- Word-category information somewhat independently of sentence structure
- Word-category information
- Word-category information depending on sentence structure

QUESTION 50
What does SGML mean?
- Standard and General Markup Language
- Standard Generalized Markup Language
- Standard Generalized Marked Language

QUESTION 51
What does XML stand for?
- Extensible Marking Language
- Extreme Mechanic Language
- Expandible Markup Language
- Extensible Markup Language

QUESTION 52
What does the lowercase abbreviation "b" stand for?
- Bits
- Bytes
- Both

QUESTION 53
What does the suffix "?" mean in Regular Expressions?
- It allows arbitrary repetition
- It matches any character in the alphabet
- It marks the preceding character as optional

QUESTION 54
What is Corpus Linguistics?
- A methodology to study language in all its aspects.
- A separate branch of linguistics.
- A new theory of language.
- All of them are true.

QUESTION 55
What is Unicode?
- A worldwide word-encoding standard
- A worldwide character-encoding standard
- The universal code used by all computers

QUESTION 56
What is a tagger?
- Software which allows you to tag input text
- The result of running a tagging software
- Both are correct

QUESTION 57
What is a token?
- Any word in the corpus counting words that appear more than once
- Any word in the corpus
- All the individual, different words in the corpus

QUESTION 58
What is an example of a modern mega corpus?
- Helsinki Corpus
- BNC (British National Corpus)
- Corpus of Historical American English (COHA)

QUESTION 59
What is mutual information?
- The relation between subjects and complements.
- A way of measuring collocational strength
- A partial solution to the problem of retrieving constructions of variable form

QUESTION 60
What is semantic prosody?
- A form of meaning which is established through the proximity of a consistent
series of collocates
- All of them are correct
- The spreading of connotational colouring beyond single word boundaries

QUESTION 61
What is the TTR?
- The number of times a collocation comes up
- The number of types divided by the number of tokens
- The number of tokens divided by the number of types.
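
The type-token ratio is conventionally computed as the number of distinct word forms (types) divided by the number of running words (tokens). A minimal sketch follows; the helper name `type_token_ratio` and the sample sentence are invented, and splitting on whitespace is a simplifying assumption.

```python
def type_token_ratio(tokens):
    """Number of distinct types divided by the number of tokens."""
    if not tokens:
        raise ValueError("TTR is undefined for an empty token list")
    return len(set(tokens)) / len(tokens)

# Invented sample, split on whitespace for simplicity.
tokens = "the cat sat on the mat and the dog sat too".split()
ttr = type_token_ratio(tokens)

print(len(tokens), len(set(tokens)))  # 11 8
print(round(ttr, 3))                  # 0.727
```

Because repeated words add tokens but not types, the ratio tends to fall as a text gets longer, which is why TTR values are normally compared only across samples of similar size.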

QUESTION 62
What is the aim of the Distribution feature in BNCweb?
- Discover correlations between various categories of annotation
- Obtain the normalized frequency of the search item.
- Obtain the relative frequency of appearance of the search item.

QUESTION 63
What is the difference between a type and a token?
- Token is how many times a word appears vs. type which is the kind of
specific word that appears
- Token is which kind or specific word that appears vs. type which is how many
times that word appears
- Type is where in context, or sentence, a word appears vs. token which is how
many times that specific word appears in a context or sentence

QUESTION 64
What is the reason for using POS (parts of speech)?
- To distinguish homonyms
- To enable more general word searches
- Both are correct

QUESTION 65
What regular expression would you use to search for -ing verbs?
- ing\b
- .ing
- \w+ing
- Any of those would work.
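
The three candidate patterns can be compared by running them over the same text. The sketch below uses Python's `re` module and an invented sentence; it only shows what each pattern returns and makes no claim about how a particular corpus tool tokenises its input.

```python
import re

text = "She was singing while the king kept bringing things"

patterns = (r"ing\b", r".ing", r"\w+ing")
# Map each pattern to the list of strings it returns for the sample text.
results = {p: re.findall(p, text) for p in patterns}

for pattern, hits in results.items():
    print(pattern, hits)
```

On this sample, `ing\b` returns only the bare ending, `.ing` returns four-character fragments such as "sing" and "hing", and `\w+ing` returns whole words up to the ending. Purely orthographic patterns also pick up non-verbs such as "king", so a part-of-speech tag is needed to restrict the search to verbs.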

QUESTION 66
What results do you expect for a query like po*l ?
- Any 3 character long word starting with po and ending with l
- Any word 4 characters long starting with po and ending with l
- Any word of any length that starts with po and ends with l
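
Wildcard queries of this kind can be approximated with Python's `fnmatch` module, where `*` matches any sequence of characters, including an empty one. This is only a stand-in for a corpus query interface, not the interface itself, and the word list is invented.

```python
from fnmatch import fnmatchcase

words = ["pol", "pool", "poll", "political", "pole", "spool"]

# fnmatchcase compares the whole word against the shell-style pattern.
matched = [w for w in words if fnmatchcase(w, "po*l")]

print(matched)  # ['pol', 'pool', 'poll', 'political']
```

The length of the word plays no role: only the fixed beginning and the fixed ending are constrained.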

QUESTION 67
When subtitling, you should translate as literally as possible.
- True
- False

QUESTION 68
Where is the node word placed in a concordance view?
- On the right
- Central position
- On the left

QUESTION 69
Which of the following is not a part of a WebQuest
- Process
- Sources
- Conclusion
- Goal

QUESTION 70
Which of the following is not a type of subtitle with respect to language?
- Intralingual subtitling
- Interlingual subtitling
- Bilingual Subtitling
- Sign language subtitling

QUESTION 71
Which of the following is not part of Leech's seven maxims of annotation
- It should not be possible to remove the annotation from an annotated
corpus in order to revert to the raw corpus.
- The end user should be made aware that the corpus annotation is not infallible,
but simply a potentially useful tool.
- Annotation schemes should be based as far as possible on widely agreed and
theory-neutral principles.

QUESTION 72
Which of the following is not part of a Rubric?
- Indicators
- Performance Criteria
- Rating scale
- Scores

QUESTION 73
Which of the following observations about word frequencies is correct?
- There are many more words with extremely high frequency
- There are few words with extremely low frequency
- There are a few words with extremely high frequency

QUESTION 74
Which of the following options contain open-class categories only?
- Pronouns, conjunctions, verbs and adverbs
- Nouns, verbs, prepositions and determiners
- Nouns, verbs, adjectives and adverbs

QUESTION 75
Which of the following statements is true:
- Sample is the target group of interest to the study.
- Population is the group with interest in carrying out the research.
- Population is smaller than the Sample.
- Sample is a small, representative group of the population.

QUESTION 76
Which type of word corresponds to an open-class category
- Adverb
- Adjective
- Both are correct
- Both are incorrect

QUESTION 77
Which symbols would you use to indicate the beginning and the end of a line?
- * and &
- ^ and #
- + and ?
- ^ and $
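
Line anchors can be tried out on a short multi-line string. The sketch below uses Python's `re` module, where the MULTILINE flag is needed for the anchors to apply to every line rather than only to the whole string; other tools may apply them per line by default.

```python
import re

text = "first line\nsecond line\nthird"

# With MULTILINE, ^ anchors at the start of each line and $ at the end of each line.
line_starts = re.findall(r"^\w+", text, flags=re.MULTILINE)
line_ends = re.findall(r"\w+$", text, flags=re.MULTILINE)

print(line_starts)  # ['first', 'second', 'third']
print(line_ends)    # ['line', 'line', 'third']
```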

QUESTION 78
Which statement about HTML is false
- Not used by web browsers
- Developed by the W3C Consortium
- Based on the SGML tagging principle

QUESTION 79
Which query to find the spelling variants icecream, ice cream, ice-cream is correct?
- (ice[-,]cream|ice cream)
- (icecream|ice cream|ice-cream)
- Both are correct.
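
Both candidate queries can be tested against the three spellings directly. The sketch below treats them as regular expressions in Python's `re` module, which is an assumption about the intended query language, and uses `fullmatch` so that each spelling must be covered in full.

```python
import re

variants = ["icecream", "ice cream", "ice-cream"]

# [-,] is a character class matching either a hyphen or a comma.
pattern_a = re.compile(r"(ice[-,]cream|ice cream)")
# Plain alternation listing each spelling explicitly.
pattern_b = re.compile(r"(icecream|ice cream|ice-cream)")

matched_a = [v for v in variants if pattern_a.fullmatch(v)]
matched_b = [v for v in variants if pattern_b.fullmatch(v)]

print(matched_a)  # ['ice cream', 'ice-cream']
print(matched_b)  # ['icecream', 'ice cream', 'ice-cream']
```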

QUESTION 80
Which one of these is NOT a property of collocations?
- Textual proximity
- Modifiability
- Category restrictions
- Limited compositionality

QUESTION 81
Which of these statements is true?
- A good sample should capture the variability in a population
- Any sample of the language will be biased, including some things but not others.
- Both are correct.
- Neither is correct.

QUESTION 82
Which of these statements is false?
- Fiction often contains physical descriptions.
- Fiction usually contains proportions, amounts, and quantities.
- Academic prose is more often concerned with proportions, amounts, and
quantities.

QUESTION 83
Which of these is an extreme example of non-compositionality?
- White wine
- Good practice guidelines
- Kick the bucket

QUESTION 84
Which of these has the most up to date list of English corpora?
- SKETCH English
- HKCSE (Hong Kong Corpus of Spoken English)
- SCONE (Seville Corpus of Northern English)
- VARIENG (University of Helsinki Research Unit for Variation, Contacts
and change in English)

QUESTION 85
Which of these corpora is balanced between spoken and written language?
- Monitor corpora
- Specific corpora
- Mixed corpora
- Synchronic corpora

QUESTION 86
Which of the following character encoding standards is capable of encoding
the largest number of different characters?
- UNICODE
- Extended ASCII
- ASCII

QUESTION 87
Which of the following statements is true?
- All aspects of language can be studied using a corpus.
- Only a few aspects of language can be studied using a corpus.
- Most aspects of language can be studied using a corpus

QUESTION 88
Which of the following statements is true?
- Mixed corpora consist of the balance between spoken and written language
in order to be more representative of language in general.
- Written language is more difficult to process than spoken language (no fillers,
hesitations, false starts, ungrammatical constructs, ...)
- Synchronic corpora appear in historical or comparative language studies.
