
Multilingual Facilitation

Book · March 2021


DOI: 10.31885/9789515150257



Multilingual Facilitation

Honoring the career of Jack Rueter

Mika Hämäläinen, Niko Partanen and Khalid Alnajjar (eds.)

This book has been authored for Jack Rueter in honor of his 60th
birthday.


All papers accepted to appear in this book have undergone a rigorous
peer review to ensure high scientific quality. The call for papers has
been open to anyone interested. We have accepted submissions in any
language that Jack Rueter speaks.

Hämäläinen, M., Partanen, N., & Alnajjar, K. (eds.) (2021). Multilingual
Facilitation. University of Helsinki Library.

ISBN (print) 979-871-33-6227-0 (Independently published)
ISBN (electronic) 978-951-51-5025-7 (University of Helsinki Library)

The contents of this book have been published under the CC BY 4.0
license (https://creativecommons.org/licenses/by/4.0/).

Tabula Gratulatoria

Jack Rueter has been an important figure in our academic lives
and we would like to congratulate him on his 60th birthday.

Mika Hämäläinen, University of Helsinki
Niko Partanen, University of Helsinki
Khalid Alnajjar, University of Helsinki
Alexandra Kellner, Valtioneuvoston kanslia
Anssi Yli-Jyrä, University of Helsinki
Cornelius Hasselblatt
Elena Skribnik, LMU München
Eric & Joel Rueter
Heidi Jauhiainen, University of Helsinki
Helene Sterr
Henry Ivan Rueter
Irma Reijonen, Kansalliskirjasto
Janne Saarikivi, Helsingin yliopisto
Jeremy Bradley, University of Vienna
Jörg Tiedemann, University of Helsinki
Joshua Wilbur, Tartu Ülikool
Juha Kuokkala, Helsingin yliopisto
Jukka Mettovaara, Oulun yliopisto
Jussi-Pekka Hakkarainen, Kansalliskirjasto
Jussi Ylikoski, University of Oulu
Kaisla Kaheinen, Helsingin yliopisto
Karina Lukin, University of Helsinki
Larry Rueter
LI Līvõd institūt
Lotta Jalava, Kotimaisten kielten keskus
Mans Hulden, University of Colorado
Marcus & Jackie James
Mari Siiroinen, Helsingin yliopisto
Marja Lappalainen, M. A. Castrénin seura
Markus Juutinen, Oulun yliopisto
Mary Elizabeth Rueter
Merja Salo, University of Helsinki
Michael Rießler, University of Eastern Finland
Miikka Silfverberg, University of British Columbia
Nina Sokolova
Olga Erina
Olga Potapova
Päivi Rainò (née Pimiä), Humak University of Applied Sciences
Pasi Lankinen ja Vuokko Pajala, Mordvan matkaajat
Paul & Jenny Rueter
Paula Kokkonen
Riho Grünthal, University of Helsinki
Rogier Blokland, Uppsala University
Sampsa Holopainen, University of Helsinki
Santeri Junttila, Gripswooldin yliopisto ja Helsingin yliopisto
Sidney da Silva Facundes, Universidade Federal do Pará
Sirkka Saarinen, Turun yliopisto
Sjur Nørstebø Moshagen, UiT The Arctic University of Norway
Suomalais-Ugrilainen Seura
Susanna Virtanen, Helsingin yliopisto
Sven-Erik Soosaar, Eesti Keele Instituut
Tanja Säily, University of Helsinki
Tapio Mäkeläinen ja Renate Blumberga
Timo Rantakaulio, Helsingin yliopisto
Tom & Toshiko Rueter
Tommi A Pirinen, UiT Norgga árktalaš universitehta
Tommi Jantunen, Sign Language Centre, University of Jyväskylä
Tommi Jauhiainen, University of Helsinki
Trond Trosterud, Giellatekno, Institutt for språk og kultur, UiT Noregs arktiske universitet
William Yoro Rueter
Галина Пунегова, Коми туялан шӧринысь Кыв, литература да история институт, Сыктывкар
Галина Рябова
Дмитрий Цыганкин
Иван Рябов
Йӧлгинь Цыпанов, Коми туялан шӧринысь Кыв, литература да важвылӧм институт, Сыктывкар
Марина Федина
Нина Агафонова

Each and every one of us has contributed financially to printing this
book and has received a personal printed copy. The money left over
after covering the printing costs has been given to Jack Rueter as a
birthday gift.

A Letter from the Editors
Jack Rueter’s lifelong journey of describing Uralic languages and
developing computational tools for their analysis is a unique one.
Over the course of his career, he has made incredibly important
contributions to both Finno-Ugric studies and computational
linguistics. Although he has devoted most of his attention to the
Erzya, Moksha, Komi and Skolt Sami languages, he has also been
involved in many ways in the work on other Uralic languages. His work
covers, at least to some degree, almost all languages in the Uralic
family. In recent years, he has expanded his scope even further, to
the Arawakan language Apurinã and other languages of the Amazon
region. For many, this could seem like a sudden shift in focus, but in
the context of Dr. Rueter’s career, we can see it as a continuation of
his long-term interest in the languages of the Americas, which also
connects to his earlier work on the Lushootseed language spoken in the
regions close to his place of birth.

Dr. Rueter is one of the most altruistic and selfless people we have
ever met. If one looks only at the list of his academic publications,
one will reach an utterly inaccurate conclusion about the importance
of his work. He has never been interested in writing academic
publications for egotistical reasons to advance his own career.
Instead, he has dedicated his career to building resources that are
far more valuable than any purely academic research. His extensive
work is the reason that several endangered languages have extensive
high-quality digital dictionaries and morphological tools. These
resources have a direct impact on people’s lives; without Dr. Rueter,
tools that we speakers of majority languages take for granted, such as
spell checkers and keyboards with predictive text, would be a distant
dream for several endangered Uralic languages. The idea of doing
research for its own sake does not fit well into Dr. Rueter’s agenda.
He does research in order to understand, in order to model and, most
importantly, in order to share the results with the community. In the
field of linguistics and natural language processing, he truly is a
pioneer in open-source science.

Dr. Rueter arrived in Finland for the first time in July of 1975, when
he was 14 years old. Soon after, at 17, he returned to live for one
year in Järvenpää as an exchange student. In 1982, he started his
academic studies at the University of Helsinki, and he has stayed on
this path ever since. Dr. Rueter defended his dissertation at the
University of Helsinki in 2010. Some important parts of his career
have also included teaching in Mordovia from 1997 to 2004. In
addition, he has always maintained close relations with researchers
and institutions in other areas where Uralic languages are spoken,
including Estonia, Latvia, Norway and the Komi Republic. Over the
years he has also forged new academic relationships, most recently in
Brazil. It has been a delight for us to see so many regions of the
world represented in this book.

Dr. Rueter has made significant contributions to several areas of
linguistic research. The best known, and the ones most closely
associated with Dr. Rueter, are Finite-State Transducers. Their
importance for research on Uralic languages is also visible in many
contributions in this book, where they play a significant role. The
transducers, known as FSTs, essentially form a grammatical description
of a language in a programmatic format. The quality and coverage of
the descriptions Dr. Rueter has created are on par with any modern
grammar book and should be equally highly valued.

The FSTs are connected to many other endeavors in Dr. Rueter’s career,
one of the most important being lexicography. Dr. Rueter played a
seminal role in the creation of Mordwinisches Wörterbuch, a monumental
series of lexica in the Erzya and Moksha languages. In this work, Dr.
Rueter’s contribution was of a more technical nature, but in the
mid-90s, his independent work as a lexicographer started to bear fruit
when his Komi dictionary, Ӧшка-мӧшка ичӧт кыввор
комиа-англискӧя-финскӧя, ‘Rainbow vocabulary Komi-English-Finnish’,
saw the light of day and began to circulate. Though never officially
published, it has been a very important tool for many students and
researchers of the Komi language. Later, Dr. Rueter’s dictionary work
continued and evolved into online lexica in the GiellaLT
infrastructure, ultimately rendering them openly available. In recent
years, this work has developed into a larger online infrastructure
with particular emphasis on how to empower language communities and
nurture their participation in the dictionary editing process.

One of the most recent new directions Dr. Rueter’s research has taken
comes from the Universal Dependencies project. This project includes
systematically annotated materials in different languages of the
world. Dr. Rueter is among the most prolific contributors in the
project, at least when it comes to the number of treebanks he has
contributed to or initiated from scratch. Dr. Rueter’s nuanced
understanding of morphosyntax and different grammatical processes has
allowed him to curate numerous accurate and substantial materials
that will without a doubt stand the test of time.

Dr. Rueter is not one to make a show of himself, and instead of
boasting about his accomplishments, he prefers to focus on the task at
hand. This task of description, documentation and refinement, it
seems, is never-ending, and it is not unusual to see Dr. Rueter
contributing to collaborative projects at any moment of the day or
night. What is this task, exactly, a bystander might wonder?

We who participate in scientific research with Dr. Rueter have come
to understand that what drives his work is not any specific task, but
a vision. It is evident from his endeavors that he sees the
interconnectedness of different levels of linguistic research, and,
above all, understands how a computational infrastructure can be
created to bring these levels together. This infrastructure is not yet
complete, and neither is Dr. Rueter’s work, but if it is ever
realized, it will be in many ways because of his lifelong
contribution, which we have come together here to celebrate.

To the editors of this work, Jack has always been an insightful
mentor, but also a colleague and a close friend.

Jack Rueter’s Selected Work

Rueter, J., & Hämäläinen, M. (2020). FST Morphology for the Endangered
Skolt Sami Language. In Proceedings of the 1st Joint SLTU and CCURL
Workshop (SLTU-CCURL 2020) (pp. 250–257).

Rueter, J., Partanen, N., & Ponomareva, L. (2020). On the questions in
developing computational infrastructure for Komi-Permyak. In
Proceedings of the Sixth International Workshop on Computational
Linguistics of Uralic Languages (pp. 15–25).

Rueter, J., Hämäläinen, M., & Partanen, N. (2020). Open-Source
Morphology for Endangered Mordvinic Languages. In Proceedings of Second
Workshop for NLP Open Source Software (NLP-OSS) (pp. 94–100).

Rueter, J., & Hämäläinen, M. (2020). Prerequisites for Shallow-Transfer
Machine Translation of Mordvin Languages: Language Documentation With A
Purpose. In Материалы Международного образовательного салона
(pp. 18–29).

Rueter, J., & Hämäläinen, M. (2020). Skolt Sami, the makings of a
pluricentric language, where does it stand? In European Pluricentric
Languages in Contact and Conflict (pp. 201–208). Peter Lang.

Rueter, J. (2020). Корпус национальных мордовских языков: принципы
разработки и перспективы функционирования/действия. In Финно-Угорские
Народы в Контексте Формирования Общероссийской Гражданской
Идентичности и Меняющейся Окружающей Среды (pp. 118–127).

Rueter, J., & Partanen, N. (2019). On New Text Corpora for Minority
Languages on the Helsinki korp.csc.fi Server. In Электронная
письменность народов Российской Федерации: опыт, проблемы и
перспективы (pp. 32–36).

Rueter, J., & Hämäläinen, M. (2019). On XML-MediaWiki Resources,
Endangered Languages and TEI Compatibility, Multilingual Dictionaries
For Endangered Languages. In AsiaLex 2019: Proceedings of the 13th
Conference of the Asian Association for Lexicography.

Rueter, J., & Partanen, N. (2019). Survey of Uralic Universal
Dependencies development. In Third Workshop on Universal Dependencies
(UDW, SyntaxFest 2019).

Rueter, J. M., & Tyers, F. M. (2018). Towards an open-source
universal-dependency treebank for Erzya. In International Workshop for
Computational Linguistics of Uralic Languages.

Rueter, J., & Hämäläinen, M. (2017). Synchronized Mediawiki based
analyzer dictionary development. In 3rd International Workshop for
Computational Linguistics of Uralic Languages (IWCLUL 2017) (pp. 1–7).

Rueter, J. M. (2016). Towards a systematic characterization of dialect
variation in the Erzya-speaking world: Isoglosses and their reflexes
attested in and around the Dubyonki Raion. In Mordvin languages in the
field (pp. 109–148).

Rueter, J. M. (2015). On Linguistic Distance between Erzya and Moksha,
Morphology. In 90-летию профессора Д. В. Цыганкина.

Rueter, J. (2014). The Livonian-Estonian-Latvian Dictionary as a
threshold to the era of language technological applications. Journal of
Estonian and Finno-Ugric Linguistics, 5(1), 251–259.

Rueter, J. (2013). Quantification in Erzya. In Typology of
Quantification: On quantification in Finnish and languages spoken in
the Volga–Kama Region (pp. 99–122).

Rueter, J. (2013). The Erzya Language, Where is it spoken? Études
finno-ougriennes, 45.

Rueter, J. (2011). The status of the non-finite OmstO morpheme in Erzya.
Linguistica Uralica, XLVII(1), 41–55.

Rueter, J. (2010). Adnominal Person in the Morphological System of
Erzya. Mémoires de la Société Finno-Ougrienne 261.

Рютер, Джек М. (2000). Хельсинкиса университетын кыв туялысь Ижкарын
перымса симпозиум вылын лыддьöмтор. Пермистика. 6, Проблемы синхронии и
диахронии пермских языков и диалектов: сборник статей, 154–158.

Rueter, J. (1994). Ӧшка-мӧшка ичӧт кыввор комиа-англискӧя-финскӧя.
(Manuscript with printed version available).

Table of Contents

Endangered Languages are not Low-Resourced!
    Mika Hämäläinen

The many writing systems of Mansi: challenges in transcription and transliteration
    Jeremy Bradley & Elena Skribnik

Corpona – The Pythonic Way of Processing Corpora
    Khalid Alnajjar & Mika Hämäläinen

Number Expression in Apurinã (Arawák)
    Sidney Da Silva Facundes & Marília Fernanda Pereira de Freitas & Bruna Fernanda Soares de Lima-Padovani

Серпаса коми гижöдъясын шуанног да сiйöc петкöдланног
    Галина Пунегова

Building language technology infrastructures to support a collaborative approach to language resource building
    Tommi A Pirinen & Francis M. Tyers

Is There Any Hope for Developing Automated Translation Technology for Sign Languages?
    Tommi Jantunen & Rebekah Rousi & Päivi Rainò & Markku Turunen & Mohammad Moeen Valipoor & Narciso García

Towards Extracting Formulaic Expressions from Japanese Scholarly Papers
    Kenichi Iwatsuki

The Principal Parts of Finnish Nominals
    Mans Hulden & Miikka Silfverberg

Питирим Сорокинлысь «A long Journey» небöг комиöдöмын шыбöльяс
    Йöлгинь Цыпанов

Saamelaiskielten indefiniittipronominien jäljillä
    Markus Juutinen & Jukka Mettovaara

Электронный языковой корпус как фактор сохранения мордовских (мокшанского и эрзянского) языков
    Мария З. Левина

Lexd: A Finite-State Lexicon Compiler for Non-Suffixational Morphologies
    Daniel Swanson & Nick Howell

Muutaman runon ja sävellyksen jäljillä – tekijöitä etsien
    Paula Kokkonen

From Plenipotentiary to Puddingless: Users and Uses of New Words in Early English Letters
    Tanja Säily & Eetu Mäkelä & Mika Hämäläinen

Time’s arrow reversed? The (a)symmetry of language change
    Terttu Nevalainen

Nykysuomen sanakirjan digitaalinen editio
    Niko Partanen & Lotta Jalava

Kantasaamen *-(e̮)hče̮- frekventatiivijohtimen edustuksesta nykyisissä saamelaiskielissä
    Eino Koponen & Juha Kuokkala

Soft on errors? The correcting mechanism of a Skolt Sami speller
    Trond Trosterud & Sjur Moshagen

This is thy brother’s voice – Documentary and metadocumentary linguistic work with a folklore recording from the Nenets-Komi contact area
    Rogier Blokland & Niko Partanen & Michael Rießler

Suomalais-ugrilaiset kielet ja internet -projekti 2013-2019
    Tommi Jauhiainen & Heidi Jauhiainen & Krister Lindén

The Development of a Comprehensive Data Set for Systematic Studies of Machine Translation
    Jörg Tiedemann

Ульяновской Областень Новомалыклинской Райононь Эрзянь Велень Кортавкстнэсэ Азорксчинь Невтиця Суффикстнэнь Башка Ёнксост
    Нина А. Агафонова & Иван Н. Рябов

When Word Embeddings Become Endangered
    Khalid Alnajjar

Onko uralilaisen etnohistorian tutkimuksessa tapahtunut käänne? Monitieteisyys ja uudet teoriat itämerensuomalaisten kielten synnystä Valter Langin Homo Fennicuksen valossa
    Janne Saarikivi
