
A dialect is a variety of a language, spoken within a small community, that can identify an
individual's social background or occupation. In the northern Philippines, in the Cordillera
Administrative Region, Kankanaey is a widely used dialect that has evolved and grown but also tends
toward disappearing. Between the 1990 census and 2003, there were approximately 240,000 speakers.
One study claims that up to 3,045 languages around the world are in danger of dying out due to a
lack of speakers. Languages are slowly vanishing because no one is teaching them to new speakers.
When individuals begin to speak and teach a more dominant language, it becomes difficult to maintain
public records of information about their own languages, and those languages are eventually forgotten.

Natural Language Processing (NLP) is one strategy for addressing the problem. NLP is the use of
artificial intelligence (AI) and machine learning to process human language. For machine translation,
it uses syntax and rule-based algorithms to take a set of strings in one language and translate them
into another. NLP provides the advantage of machine translation, which translates a source language
into a target language automatically. This makes it a strong option for an automated translation
process that is efficient and precise, and using it will allow individuals to understand one another
better. Step 1, word tokenization, breaks each sentence of the language down into words. One study
compiled approximately 3,412 words and 400 Kankanaey sentences in order to establish the language's
syntactic rules. Step 2, stemming, reduces normalized Tagalog and Kankanaey words to their base or
root form. Step 3, lemmatization, produces a root word that has meaning according to the rules of the
Kankanaey and Tagalog languages. Step 5, stop-word identification, finds the words that are filtered
out before or after processing natural language data because they carry little meaning. Step 6,
part-of-speech (POS) tagging, identifies the verbs and nouns among roughly 30,000 Tagalog root words
and 3,412 Kankanaey words.
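
The preprocessing steps described above can be sketched in Python. This is only a minimal illustration under stated assumptions, not the study's implementation: the stop-word list, the affix rules, and the small part-of-speech lexicon below are hypothetical Tagalog samples, and a real system would need curated resources for both Tagalog and Kankanaey.

```python
import re

# Hypothetical, deliberately tiny resources used for illustration only.
STOP_WORDS = {"ang", "ng", "sa", "mga", "at", "si"}
PREFIXES = ("nag", "mag", "ma")  # checked in this order
POS_LEXICON = {"kain": "VERB", "luto": "VERB", "aso": "NOUN", "bata": "NOUN"}


def tokenize(sentence):
    """Word tokenization: lowercase the sentence and split it into words."""
    return re.findall(r"[a-z]+", sentence.lower())


def remove_stop_words(tokens):
    """Stop-word filtering: drop words listed in STOP_WORDS."""
    return [t for t in tokens if t not in STOP_WORDS]


def stem(word):
    """Naive stemming: strip one known prefix or the '-um-' infix."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= 3:
            return word[len(prefix):]
    # '-um-' infix placed after the first letter, as in k-um-ain
    if len(word) > 4 and word[1:3] == "um":
        return word[0] + word[3:]
    return word


def lemmatize(word):
    """Lemmatization: accept the stem only if it is a known root word."""
    candidate = stem(word)
    return candidate if candidate in POS_LEXICON else word


def preprocess(sentence):
    """Run all steps and return (token, lemma, POS tag) triples."""
    tokens = remove_stop_words(tokenize(sentence))
    result = []
    for token in tokens:
        lemma = lemmatize(token)
        result.append((token, lemma, POS_LEXICON.get(lemma, "UNK")))
    return result


print(preprocess("Kumain ang aso."))
# [('kumain', 'kain', 'VERB'), ('aso', 'aso', 'NOUN')]
```

Checking the stem against a lexicon is what separates the lemmatization step from plain stemming here: a word such as "mangga" would be wrongly cut by the prefix rule, but the lookup rejects the invalid stem and keeps the word intact.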

The study focuses on designing an application that translates Kankanaey to Tagalog and Tagalog to
Kankanaey. Tagalog is the common language of the Philippines, and Kankanaey is a common dialect in
the northern regions of the Cordilleras.

REFERENCES
https://en.wikipedia.org/wiki/Kankanaey_language

https://www.ethnologue.com/guides/how-many-languages-endangered

https://aws.amazon.com/what-is/machine-translation/

https://insights.daffodilsw.com/blog/why-machine-translation-in-nlp-is-essential-for-international-business#:~:text=Natural%20Language%20Processing%20(NLP)%20offers,translation%20process%20without%20human%20intervention.

https://iopscience.iop.org/article/10.1088/1757-899X/803/1/012023/pdf#:~:text=This%20paper%20presents%20a%20corpus,to%20establish%20its%20syntactic%20rules.

https://thecelebtimes.com/how-many-words-are-there-in-tagalog

The Natural Language Processing (NLP) algorithm

1. Lexical and Morphological Analysis


The first phase of NLP is lexical analysis. This phase scans the input text as a stream of
characters and converts it into meaningful lexemes. It divides the whole text into paragraphs,
sentences, and words.
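
The division into paragraphs, sentences, and words can be sketched as follows. This is a simplified illustration: the function names are hypothetical, and the splitting rules (blank lines separate paragraphs, sentences end with ".", "!", or "?") are assumptions that would not handle abbreviations or other edge cases.

```python
import re


def split_paragraphs(text):
    """Paragraphs are assumed to be separated by one or more blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def split_sentences(paragraph):
    """Sentences are assumed to end with '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", paragraph) if s.strip()]


def split_words(sentence):
    """Words are runs of letters, optionally joined by a hyphen or apostrophe."""
    return re.findall(r"[^\W\d_]+(?:[-'][^\W\d_]+)*", sentence)


def lexical_analysis(text):
    """Return a nested list: paragraphs -> sentences -> words."""
    return [
        [split_words(sentence) for sentence in split_sentences(paragraph)]
        for paragraph in split_paragraphs(text)
    ]


sample = "Kumain ang aso. Natulog ang bata.\n\nMaganda ang bahay."
for paragraph in lexical_analysis(sample):
    print(paragraph)
```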

2. Syntactic Analysis (Parsing)

Syntactic analysis is used to check grammar and word arrangement, and to show the relationships
among the words.

Example: Agra goes to the Poonam

In the real world, the sentence "Agra goes to the Poonam" does not make any sense, so it is rejected
by the syntactic analyzer.

3. Semantic Analysis

Semantic analysis is concerned with meaning representation. It mainly focuses on the literal
meaning of words, phrases, and sentences.

4. Discourse Integration

In discourse integration, the meaning of a sentence depends on the sentences that precede it and
also informs the meaning of the sentences that follow it.

5. Pragmatic Analysis

Pragmatic analysis is the fifth and last phase of NLP. It helps you discover the intended effect by
applying a set of rules that characterize cooperative dialogues.

Rule-based machine translation


Rule-based machine translation relies on built-in linguistic rules and bilingual dictionaries for
specific industries or topics. It uses these dictionaries to translate specific content accurately.
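
A rule-based translator of this kind can be sketched with a bilingual dictionary and a single reordering rule. The lexicon and the rule below are hypothetical toy examples, and Tagalog-to-English entries are used only for illustration; a Kankanaey-Tagalog system would substitute its own dictionary and linguistic rules.

```python
# Hypothetical toy lexicon: Tagalog word -> (English word, part of speech).
LEXICON = {
    "kumain": ("ate", "VERB"),
    "natulog": ("slept", "VERB"),
    "ang": ("the", "DET"),
    "aso": ("dog", "NOUN"),
    "bata": ("child", "NOUN"),
}


def translate(sentence):
    """Translate word by word, then apply one word-order rule."""
    words = sentence.lower().rstrip(".!?").split()
    # Dictionary step: unknown words pass through unchanged, tagged UNK.
    tagged = [LEXICON.get(word, (word, "UNK")) for word in words]
    # Reordering rule (assumption): a sentence-initial verb moves to the end,
    # because these Tagalog examples are verb-first and English is not.
    if tagged and tagged[0][1] == "VERB":
        tagged = tagged[1:] + tagged[:1]
    return " ".join(word for word, _ in tagged)


print(translate("Kumain ang aso."))   # the dog ate
print(translate("Natulog ang bata"))  # the child slept
```

Words missing from the dictionary are passed through untranslated, which makes gaps in the lexicon easy to spot in the output.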

Statistical machine translation


Statistical machine translation uses machine learning to translate text. Its algorithms analyze
large amounts of existing human translations and look for statistical patterns.
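
One classic way to find such statistical patterns is IBM Model 1, which estimates word translation probabilities from sentence pairs using expectation-maximization. The sketch below is an illustration only: the text does not say which statistical model is intended, the three-sentence Tagalog-English corpus is a made-up toy, and the function names are hypothetical.

```python
from collections import defaultdict


def train_ibm_model1(pairs, iterations=10):
    """Estimate t(target word | source word) from tokenized sentence pairs."""
    # Start with equal weight for every word pair that co-occurs in a sentence pair.
    t = {(w_t, w_s): 1.0 for src, tgt in pairs for w_s in src for w_t in tgt}
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        # E-step: share each target word among the source words of its sentence.
        for src, tgt in pairs:
            for w_t in tgt:
                norm = sum(t[(w_t, w_s)] for w_s in src)
                for w_s in src:
                    fraction = t[(w_t, w_s)] / norm
                    count[(w_t, w_s)] += fraction
                    total[w_s] += fraction
        # M-step: renormalize the expected counts into probabilities.
        t = {(w_t, w_s): c / total[w_s] for (w_t, w_s), c in count.items()}
    return t


def best_translation(t, source_word):
    """Return the most probable target word for a source word, or None."""
    candidates = {w_t: p for (w_t, w_s), p in t.items() if w_s == source_word}
    return max(candidates, key=candidates.get) if candidates else None


# Toy parallel corpus (assumption): Tagalog source, English target.
corpus = [
    ("ang aso".split(), "the dog".split()),
    ("ang bahay".split(), "the house".split()),
    ("isang bahay".split(), "a house".split()),
]
model = train_ibm_model1(corpus)
for word in ("ang", "aso", "bahay", "isang"):
    print(word, "->", best_translation(model, word))
```

No dictionary is supplied here: the pairing of "aso" with "dog" emerges only because "ang" is better explained by "the" across the other sentence pairs.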

Syntax-based machine translation


Syntax-based machine translation is a sub-category of statistical machine translation. It uses
grammatical rules to translate syntactic units.
