6 Special Problems Related to Online Dictionaries
Since the year 2000, the Internet has increasingly consolidated itself as the domi-
nant platform for electronic dictionaries. This development was initiated several
years earlier with the introduction of computer and information technologies into
the field of lexicography, a process which De Schryver (2003) dates back to the late
1960s. In these more than four decades lexicography has undergone a true revolu-
tion in most of its core expressions, from the selection process, through the produc-
tion phase, to the publication and presentation of the final product to the users on
some kind of electronic platform. But in a certain sense this revolution has stopped
halfway. When a deep-rooted, millennia-old cultural practice like lexicography passes
from one technology to another, one would expect such a gigantic step to be more
than a mere change of production methods and medium; it would also be logical for
it to involve improvements in quality, expressed in the more accurate and
personalized satisfaction of user needs that the new technologies have made possible.
As could be expected, the many variants of electronic dictionary have been accompanied
by a considerable body of academic reflection. The many studies published
during the past few years include Haß & Schmitz
(2010), Granger & Paquot (2010, 2012), Kosem & Kosem (2011), and Fuertes-Olivera &
Bergenholtz (2011a). This literature undoubtedly contains many relevant ideas, yet it
is disappointing to see how most authors focus on practical problems related to
natural language processing and, to a lesser degree, on use of the available tech-
niques to speed up access (Faster Horses), create fancy products (Stray Bullets), or
conduct new types of user research (log files, eye tracking, online questionnaires,
etc.). The new possibilities of combining higher lexicographical quality with lower
production time and costs in terms of human, financial and technological resources
allocated have been dealt with only peripherally. The same holds true for the more
accurate and personalised user service, which has unfortunately been discussed by
only a minority of authors, although these aspects are no less important to lexicog-
raphy than, for example, language processing.
As explained in section 5.4, advanced electronic dictionaries in general, and
online dictionaries in particular, should avoid information overload and guarantee
the quickest and easiest possible consultation process. This, among other things,
could be achieved by offering the users only the minimum amount of data required
to cover the needs related to each function (Model T Fords) or each individual consultation
(Rolls Royces). There is no single model of either of these two dictionary
types, but both of them are based on a series of principles (cf. section 2.3). These can
only be applied in practice if the lexicographers – or their programmers – make use
of special techniques, which mainly, but not exclusively, have been developed in
the framework of information science and subsequently incorporated into lexicography,
cf. Bothma (2011).
In this chapter we will examine some of the relevant techniques. Before doing
so, however, it is important to stress that these techniques can merely hold the promise of
more advanced lexicographical solutions; for this promise to become reality, they
should be embedded in a theory capable of providing the necessary guidance and
vision. Technology should never be allowed to take command. From this perspective,
the available techniques may be subdivided into two main groups. The first
consists of the techniques that can be used to reduce the amount of data offered to
the user in each consultation, while the second allows the user to access supple-
mentary and additional data when needed. Although they may seem mutually ex-
clusive and opposed to each other, these two types of technique are actually com-
plementary. Both are needed in order to take the necessary steps towards a higher
degree of user satisfaction.
6.1 Data filtering
The first type of technique to be discussed here is data filtering. With this technique,
only part of the data contained in the database is shown on the screen in
a concrete consultation. The purpose is to adapt the data more precisely
to the specific needs of the target user and avoid superfluous data (information
overload). Data filtering can be based on various criteria and may be more or less
advanced in terms of the individual user’s needs. In this respect, one relatively sim-
ple technique (from the point of view of the user) is to offer options for monofunc-
tional data access. In section 5.5, one way of doing this was illustrated with exam-
ples from the Diccionarios de Contabilidad, i.e. using buttons which allowed for
almost instantaneous access to exactly the types of data needed in relation to text
reception, production, translation, etc.
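Monofunctional data access of this kind can be illustrated with a minimal sketch. It assumes that every article is stored with one field per data type; the field names, the mapping FUNCTION_FIELDS and the placeholder article are hypothetical and are not taken from the Diccionarios de Contabilidad.

```python
# Hypothetical mapping from lexicographical function to the data types displayed.
FUNCTION_FIELDS = {
    "reception": ["definition"],
    "production": ["definition", "grammar", "collocations"],
    "translation": ["equivalent", "grammar", "collocations"],
}


def filter_article(article, function):
    """Return only the fields of `article` that serve the chosen function."""
    wanted = FUNCTION_FIELDS[function]
    return {field: article[field] for field in wanted if field in article}


# Placeholder article; the field contents are invented for illustration.
article = {
    "lemma": "receivable",
    "definition": "<definition text>",
    "equivalent": "<translation equivalent>",
    "grammar": "<grammatical data>",
    "collocations": ["<collocation 1>", "<collocation 2>"],
    "examples": ["<example sentence>"],
}

print(filter_article(article, "reception"))
```

Each button in the interface would then correspond to one key of the mapping, so that a single click selects the function and thereby the data shown.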
However, a lexicographical function is not only defined by the situation in
which the user needs may appear, but also by the relevant user characteristics (cf.
section 5.4). If users are to be offered data tailored to their specific needs, this de-
mands that, in addition to data on the specific situation, the system also receives
personal data characterising the user, such as data on mother tongue, foreign-
language proficiency level, and subject-field knowledge. In this respect, the South
African information scientist Theo J.D. Bothma writes:
User profiling can be accomplished through the user supplying the system with specific data,
by the system tracking user behaviour and thereby automatically constructing a profile of the
user or a combination of the two. User profiles can be transitory (e.g., applied to a single search
only) or persistent (e.g. be stored on the system and be used for future retrieval actions).
(Bothma 2011: 84)

The easiest way of constructing a user profile is by means of interactive fill-in op-
tions where the users are guided step by step when they supply the relevant data to
the system. In accordance with the resulting user profile – and the specific user
situation – the pre-programmed system will then automatically calculate the types
and amount of lexicographical data required to fulfil the needs occurring for this
specific user type in the situation in question. These data will be the only ones pre-
sented on the screen in the concrete consultation. This might relate to definitions
and meaning discrimination in the user’s mother tongue, grammatical data in a
foreign language (e.g. inflection of Spanish verbs), or explications adapted to the
user’s level of subject-field knowledge, as already discussed and exemplified by
Bergenholtz & Kaufmann (1997).
While the user profile, as a rule, can be made once and for all and only needs to
be refined when the user’s relevant characteristics change (for instance, in the case
of students), the description of the situation has to be supplied to the system when
starting each new task. Although it should always be possible to “re-saddle” in the
middle of the process, in this way it is not necessary to go through all the time-
consuming steps for each consultation. Once a relevant characterization of the spe-
cific user has been submitted together with an indication of the user situation in
question, the system will then automatically select, filter and present the specific
data needed by the user.
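The division of labour between a persistent user profile and a situation supplied for each task might be sketched as follows. The profile attributes follow the characteristics named above (mother tongue, foreign-language proficiency, subject-field knowledge); the attribute values, the data-type names and the selection rules are purely illustrative assumptions, not rules proposed in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserProfile:
    """Persistent profile, supplied once and refined only when the user changes."""
    mother_tongue: str
    foreign_language_level: str   # assumed values: "basic" or "advanced"
    subject_knowledge: str        # assumed values: "layperson" or "expert"


def select_data_types(profile, situation):
    """Return the data types to display for this profile in this situation."""
    if situation == "reception":
        types = ["definition_in_mother_tongue"]
    elif situation == "production":
        types = ["collocations"]
        if profile.foreign_language_level == "basic":
            types.append("inflection")
    elif situation == "translation":
        types = ["equivalent", "collocations"]
    else:
        raise ValueError(f"unknown situation: {situation}")
    if profile.subject_knowledge == "layperson":
        types.append("explanation_for_laypeople")
    return types


profile = UserProfile("Danish", "basic", "layperson")   # stored once
for situation in ("reception", "production"):           # supplied per task
    print(situation, select_data_types(profile, situation))
```

Changing the situation in the middle of a task ("re-saddling") only requires a new call with a different situation; the stored profile is left untouched.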
It goes without saying that this “requires a very sophisticated system both in
terms of the variables that have to be made provision for and also in terms of the
database structure and the way in which data is characterised” (Bothma 2011: 84).
One of the requirements is that at a very early stage of a dictionary project the lexicographers
decide which specific data fields should be included in the lexicographical
database, and provide a detailed description of these to the programmer. These
fields should be planned and designed in such a way that each specific data type is
attached to its own field in the database.
The techniques or methods mentioned above are common to both the Model T
Fords and the Rolls Royces. If further steps have to be taken towards individualising
the lexicographical tool, then various techniques are possible. A relatively simple
one has already been applied in the Accounting Dictionaries and the Diccionarios de
Contabilidad, where the possibility of searching for and gaining data access exclu-
sively to collocations and their translation equivalents is an undeniable step to-
wards a future lexicographical Rolls Royce (cf. section 5.5). Such data-type-
orientated forms of access might also be provided for other relevant types of data in
specific dictionaries.
Another more time-demanding technique (from the point of view of the user) is
article modelling. In this case, each individual user of an online dictionary will be
given the option to design his or her own master article in terms of the types of data
wanted and their arrangement on the screen. One way of doing this is by allowing
the user to include or exclude data types from a general list in order to meet his or
her personal needs in relation to a specific task, for instance, translation of special-
ised texts. Such options do already exist, for instance using a pop-up window, as
can be found in the Swedish Lexin, which is a general learner’s dictionary for immi-
grants living in Sweden (see Illustration 2). This option would be much more rele-
vant to some users of specialised dictionaries, for example, professional translators,
who in most cases are very conscious of their own individual needs relating to their
profession.
Illustration 2: Pop-up window from Lexin showing options allowing the users to design individual
master articles by including and excluding data types.
As with the other techniques discussed above, the design of a master article may be
accomplished when the user enters the online dictionary for the first time, but it can
also be done when they begin a specific activity, start a specific consultation, and
even when they are in the middle of a specific consultation. In this way, it is possi-
ble to re-saddle whenever necessary in order to individualize the final data presen-
tation.
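Article modelling, as described above, amounts to letting the user pick data types from a general list and fix their order on the screen. The following is a minimal sketch under that reading; the list AVAILABLE_TYPES and both function names are hypothetical and do not describe how Lexin is implemented.

```python
# Hypothetical general list of data types offered by the dictionary.
AVAILABLE_TYPES = [
    "definition", "equivalent", "grammar", "collocations", "examples", "synonyms",
]


def build_master_article(chosen, available=AVAILABLE_TYPES):
    """Validate the user's choice and keep it in the order the user gave."""
    unknown = [t for t in chosen if t not in available]
    if unknown:
        raise ValueError(f"unknown data types: {unknown}")
    return list(dict.fromkeys(chosen))  # drops repeats, preserves order


def render(article, master_article):
    """Return (data type, content) pairs in the user's preferred order."""
    return [(t, article[t]) for t in master_article if t in article]


# A translator who wants equivalents first, then collocations, and nothing else.
master = build_master_article(["equivalent", "collocations", "equivalent"])
article = {
    "definition": "<definition text>",
    "equivalent": "<translation equivalent>",
    "collocations": ["<collocation>"],
}
print(render(article, master))
```

Because the master article is just a short list, it can be rebuilt at any point, which corresponds to re-saddling in the middle of a consultation.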
Various scholars, partially based upon the experience with their own online dic-
tionaries, have commented that users cannot be expected to engage in the necessary
interactive process in advanced lexicographical tools (see, e.g. Trap-Jensen 2010).
Such comments probably reflect the situation in many cases. However, among the
users of specialised dictionaries there are at least some groups who are highly dependent
on these dictionaries in their daily work. For these users, higher quality
combined with reduced consultation time would be an offer they cannot refuse.
Translators of specialised texts undoubtedly belong to this group. They would find it
very much in their interests if they could save just a few seconds in each consultation.
That translators tend to dedicate more time to their dictionaries
also seems to have been confirmed in an empirical study by Müller-Spitzer,
Koplenig & Töpel (2011).
In section 5.5, we saw how an experienced translator (Tomaszczyk 1989) made a
total of 691 consultations during the translation of a specialised book on diamonds.
If he could save, say, 30 seconds in each consultation after going through the inter-
active process, this would mean a total of 345 minutes, i.e. almost six hours during
the whole translation process. Many professional translators would not hesitate to
take the time to communicate the necessary data to the interactive system if they
were afforded this possibility. Moreover, other frequent users of specialised diction-
aries would probably do the same once they had become acquainted with the time-
saving options offered them by the dictionary.
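The arithmetic behind this estimate can be checked in a few lines. The 691 consultations come from the study cited above, while the 30 seconds saved per consultation is the hypothetical figure used in the text.

```python
consultations = 691        # consultations reported for the translation task
seconds_saved_each = 30    # hypothetical saving per consultation

total_seconds = consultations * seconds_saved_each
total_minutes = total_seconds / 60
total_hours = total_minutes / 60

print(total_seconds)           # 20730
print(total_minutes)           # 345.5
print(round(total_hours, 2))   # 5.76
```

The exact result is 345.5 minutes, or about 5.76 hours, which the text rounds to 345 minutes and "almost six hours".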
Finally, the provision of “hidden data” which the users can access when needed
(cf. section 6.3) should also be mentioned here. The corresponding techniques allow
the presentation of less data on the screen “at first sight”. This is especially relevant
when users are working with small screens like tablets, mobile phones or other
hand-held devices.
6.2 The type and the individual
Both the lexicographical Model T Fords and the more advanced Rolls Royces have
left behind the static articles inherited from printed lexicography. These advanced
information tools provide dynamic solutions designed either to meet the types of
needs which a specific type of user may have in a specific type of situation or activity
(Model T Fords), or the individual needs of an individual user in each consultation
(Rolls Royces). This development raises the question of the relation between the
type and the individual.
The gradual introduction of the lexicographical Rolls Royces will eventually
reconcile lexicography with the fact that no type of user has ever made a type of
lexicographical consultation in order to access a type of data that may meet a type of
information need occurring in a type of social situation. The only thing happening
every day, hour and minute, is that an individual user with individual information
needs occurring in an individual situation decides to make an individual lexico-
graphical consultation in order to access the concrete and individual data that may
satisfy his or her individual needs. Although each user, user situation, user need,
data and consultation may be assigned to specific types, they are not in themselves
types but individual and concrete phenomena.

This evident fact poses some new and complex problems in the realm of theory,
since no theory in the true sense of the word can be built directly upon individual
phenomena that may differ from one another in many aspects. One example will be
sufficient to illustrate the complexity of this problem. Let us consider a hypothetical
and relatively simple online dictionary, containing 10,000 lemmata with 10 data
items attached to each lemma. If such a dictionary was designed for fully individual-
ized access and the user was given the possibility to choose the data types they
wanted to visualize, then 1,023 ($2^{10}-1$) possible data combinations could be displayed
on the screen for each lemma; this is a total of 10,230,000 combinations for the
whole dictionary, if we take into account all the possible individual needs of a hypothetical
group of users in relation to the data contained in the dictionary. Moreover,
the total number of possible combinations in the individual articles and the dictionary
as such would rise to the astronomical figures of 4,037,913
($1!+2!+3!+4!+5!+6!+7!+8!+9!+10!$) and 40,379,130,000 (ca. 40 billion), respectively, were the users
also allowed to arrange the data individually in the order required by each one.
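The figures in this example can be reproduced with a short calculation. The first two quantities follow the formulas given in the text. The third is not in the text: it is an alternative count added here under the assumption that a user first selects k of the 10 data types and then puts those k in order, which gives a different but equally unmanageable figure.

```python
from math import factorial, perm

data_types = 10
lemmata = 10_000

# Non-empty selections of data types per lemma: 2^10 - 1.
selections = 2**data_types - 1
print(selections, selections * lemmata)        # 1023 10230000

# Sum of factorials as given in the text: 1! + 2! + ... + 10!.
factorial_sum = sum(factorial(k) for k in range(1, data_types + 1))
print(factorial_sum, factorial_sum * lemmata)  # 4037913 40379130000

# Assumption-based alternative: ordered arrangements of k out of 10 types,
# summed over k = 1..10.
ordered_selections = sum(perm(data_types, k) for k in range(1, data_types + 1))
print(ordered_selections)                      # 9864100
```

Whichever way the arrangements are counted, the number of possible presentations per lemma runs into the millions, which is the point the example is meant to make.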
It goes without saying that it is impossible to develop any theory, let alone write
any lexicographical instructions, based on such an enormous and unsystematic set
of data. Scientific and theoretical work presupposes an abstraction from some of the
less important characteristics of the individual phenomena, and the creation of
concepts, categories and types which include phenomena with certain common
characteristics deemed essential and relevant for the research field in question. At
the level of theory it is, therefore, a sine qua non to continue the typologization of
the phenomena observed and to work with the corresponding types of user, user
situation, user need, data and consultation. This conclusion does not only hold true
at the level of theory, but also at the level of dictionary design and production. No
lexicographer, however skilled, is capable of dealing with each and every one of the
infinite number of individual needs that an infinite number of individual users may
have in an infinite number of situations. This is completely unthinkable and cannot
be the vision for future lexicography.
All this means that, just as theory is a precondition for a scientific practice,
theory-based typologization is also a precondition for a well-conceived individuali-
zation of dictionaries. Lexicographers – in order to do both their practical and theo-
retical job well – will therefore still have to work with types of users, situations,
needs, data, etc. When planning a new online dictionary, they still have to specify
the exact data types and corresponding fields to be included in a database and at
the same time decide which techniques are to be used in order to guarantee quality and
the required individualisation. What is needed in this respect is full use of the available
information science techniques that permit individualized access to the data
prepared by the lexicographers and stored in a well-structured database, or made
available on the Internet by means of links, as well as access to data recreated on
the basis of already existing data.

6.3 Access to supplementary data
Advanced specialised online dictionaries should, above all, be characterised by
flexibility. Concrete user needs are not rigid categories, and lexicographical e-tools
should reflect this reality. The principles of quick and easy access, on the one hand,
and default presentation of as little data as possible, on the other, will inevitably
lead occasionally to situations where supplementary and additional data is required
to meet a specific user’s needs. There may be various reasons for this problem and,
hence, various types of possible solution, such as:
– The lexicographer has, on purpose, chosen to present less data than required in
order to avoid an overloaded screen. The best solution here is to make allow-
ance for data expansion by means of adaptive presentation or, alternatively,
linking to internal or external sources.
– The user has provided wrong information about his or her personal characteris-
tics, for instance, exaggerating his or her knowledge and skills, and less data
than required have therefore been presented. The best solution here is to re-
saddle in terms of user characteristics or, alternatively, to offer the same options
as above.
– The amount and types of data are the right ones but the user needs a more per-
sonalised approach. The best solution here is to use annotations.
– The user simply discovers in the middle of the consultation process that he or
she has supplementary or additional needs, for instance the need for supple-
mentary text examples or for specialised background knowledge when translat-
ing specialised texts. In such cases the best solution would often be to link to in-
ternal or external sources or even to reuse the data from these sources.
We will now briefly discuss some of the techniques or methods mentioned above:
– Adaptive presentation. There are various techniques in terms of adaptive hypermedia,
but here we will mention only pop-up windows activated by either
moving the mouse or clicking, cf. Prinsloo et al. (2012). Very often the lexicographer
decides that not all data needed in a specific consultation should be presented
at first sight. There could be various reasons for this; for instance, to prevent the
user from having to scroll down so as to see all data (this depends on the size of
the screen), or simply because the predicted type of user cannot be expected to
take in too much data at the same time. Whatever the reason, the lexicographer
can choose to add some “hidden” supplementary data, which can easily be
accessed, for example, by moving the mouse over a specific word or area, or
clicking on a link. Then a pop-up window will appear with supplementary text,
tables, illustrations, photos, etc. It could, let us say, be a table with the inflected
forms of a Spanish verb, which would occupy a large amount of screen space.
Alternatively, it might be background information related to a specific term or
phenomenon, etc. The content of the pop-up window could be specifically prepared
for this purpose, or it might consist of already-existing data taken from
another dictionary article. Adaptive presentation is especially relevant when
users are working with small screens like tablets, mobile phones or other hand-
held devices, cf. Kwary (2013).
– Indexing and abstracts. One of the on-going discussions among scholars in
terms of e-lexicography is how much data should be presented at a given time to
the dictionary user (see, e.g. Lew 2013). If the idea is to provide quick and easy
data access, then the answer seems to be that the maximum amount of data is
that which can be displayed on the screen without the necessity to scroll down.
However, in many dictionaries, such as those with cognitive functions like The
New Palgrave Dictionary of Economics, this represents an ideal which is impossible
to achieve in each and every article displayed. One solution could, therefore,
be the presentation of some sort of preliminary article index which can be
further expanded by means of hypermedia (as in Wiktionary). Alternatively, a
small abstract could be placed at the top of the article, as has been done in The
New Palgrave Dictionary of Economics. The application of these and similar
techniques immediately provides the user with an overview of the content. In
addition, the possibility of expanding an index by means of hypermedia allows
the user to go directly to the section of the article which is pertinent in each
case.
– Annotation. In the Web 2.0 environment, users are allowed to add their own
data and comments to already-existing documents without changing the origi-
nal in any way, cf. Bothma (2011: 96). Applying this technique, users may be-
come mini-lexicographers and “co-authors” of their own dictionary, for in-
stance, when recommendation of certain terms (in case of synonymy) or
addition of other terms and data is relevant to a particular company or branch.
This technique can without any doubt contribute to the future individualization
of lexicographical works. Moreover, although it can also be applied by individ-
ual users in order to personalize specific articles, it may above all be useful in
the framework of predefined user groups such as companies, branches, public
entities, research groups, etc.
– Reuse of data through linking. As stated above, an online dictionary may provide
either static or dynamic articles in order to present the data stored in the con-
nected database. However, in both cases, the dictionary contains a limited
amount of data that may not be sufficient or accurate enough to fulfil the user’s
information requirements in a specific consultation. In such cases, it could be
advantageous to provide a link to additional data stored in external sources
such as corpora and the Internet. The data in question could be additional col-
locations, example sentences showing the usage of collocations and other lin-
guistic features, or so-called contextual definitions found in existing texts. In
this respect, Rundell (2010) proposes:

Online dictionaries could link directly to general and specialized corpora, allowing users to
search for examples of any word, pattern, or linguistic feature they are interested in. (Rundell
2010: 174).
In the same vein, Theo Bothma observes, from the point of view of information sci-
ence:
Much work has been done on the role of corpora in dictionaries; see, for example, Prinsloo
2009 and the opinion regarding the value of such corpora differ widely. Lexicographers should,
however, consider exploring the vast stores of linked open data to provide access to additional
information to satisfy especially cognitive needs of users. Currently e-dictionaries provide links
only to so-called outer texts and to manually selected external examples; examples of both
principles can be found in a number of the dictionaries by Ordbogen.com, as well as in many
other free online dictionaries. This, however, does not make systematic and/or automatic use
of reusable data and requires a tremendous input in terms of time and effort from the lexicog-
rapher. One example of a dictionary that does provide the option to link to a huge external da-
tabase of examples is the Base lexicale du français (BLF). After getting examples of the meaning
of a word the user has the option of linking to various corpora, including a set of documents of
the European Parliament and Wikipedia [...] These examples are automatically searched by the
BLF and the selection of the examples does not require any input from the lexicographer (ex-
cept, obviously, specifying which corpora are to be searched). (Bothma 2011: 92)
Heid, Prinsloo & Bothma (2012) have dedicated an entire article to the integration of
dictionaries and corpus data in web portals, and have put forward a number of in-
teresting proposals in this regard. According to them:
Electronic dictionaries should supply a natural bridge for the user from the dictionary article to
a variety of internet resources to enhance access to potentially relevant information. In total,
the user should be able to find a wealth of digestible information without being overloaded.
(Heid, Prinsloo & Bothma 2012: 270)
When an individual experiences an information need, he or she may then access the
data contained in a dictionary and thereby retrieve the required information. This is
at least what occurred in lexicography until only a few years ago, with lexicographi-
cal works providing only direct access to lexicographically selected, elaborated and
prepared data, and not to that prepared and made accessible elsewhere, for exam-
ple, in books and archives. It is only recently that a few advanced lexicographical
tools have tried to reuse already-existing data made available through a database or
the Internet, and one of the visions today is not only to reuse such information but
also to repackage it by adapting it to the specific information needs of the user in
each situation, cf. Bothma (2011) and Tarp (2011).
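The annotation technique discussed earlier in this section, where users or predefined user groups add their own data without changing the original, could be sketched as a separate layer on top of the lexicographers' articles. The class AnnotationLayer and its methods are hypothetical names introduced for illustration only.

```python
from collections import defaultdict


class AnnotationLayer:
    """Keeps user-group notes apart from the lexicographers' articles."""

    def __init__(self, base_articles):
        self._base = base_articles        # original data, never modified here
        self._notes = defaultdict(list)   # (group, lemma) -> list of notes

    def add_note(self, group, lemma, note):
        if lemma not in self._base:
            raise KeyError(lemma)
        self._notes[(group, lemma)].append(note)

    def view(self, group, lemma):
        """Return a copy of the article plus the notes of this group only."""
        article = dict(self._base[lemma])
        article["annotations"] = list(self._notes.get((group, lemma), []))
        return article


base = {"receivable": {"definition": "<definition text>"}}
layer = AnnotationLayer(base)
layer.add_note("company_a", "receivable", "<in-house recommended term>")

print(layer.view("company_a", "receivable"))
print(layer.view("company_b", "receivable"))
```

Because notes are keyed by user group, a company or research group sees its own recommendations while other users see the unannotated article.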

6.4 The use of corpora and the Internet
In the previous section it was argued that the reuse and repackaging of data imported
from external sources like corpora and, above all, the Internet, is bound to play an
important role in future dictionaries in terms of individualization of needs satisfaction.
Corpora and the Internet are already extensively used by lexicographers in
order to select (mostly corpora) and validate (both) data during the compilation of
dictionaries. However, what is at stake here is the automatic incorporation of
data into online dictionary articles and, in a far-reaching perspective, the automatic
generation of such articles, that is, without the intervention of the human factor in
the selection and validation process.
During the past few years, various tools have been developed with a view to ap-
proaching this ideal. Some of them may make the job easier for lexicographers en-
gaged in the production of various types of dictionaries, but they are still far from
convincing if the vision is to completely supersede the lexicographer, i.e. the human
factor. One such example is a bilingual term recognition system discussed by Vintar
(2010). By means of this tool various English terms with their Slovene equivalents
have been selected within the fields of tourism, accounting and the military. Unfor-
tunately, Vintar does not reveal how many terms have been detected within each
field, and it is therefore not possible to see to what extent terms are actually cov-
ered. However, at least one of the 10 “recognized” English accounting terms – dis-
putable receivable – given as an example by Vintar (2010: 150) is not an accounting
term but a collocation (the accounting term is receivable). This provision of a non-existent
term may not seem so important. Although it is not the case here, within
accounting a “small” mistake of this sort could easily become a major one which
may eventually lead to loss of millions of euros and compensation claims against
the uncritical dictionary user, e.g. a translator. This is why it is recommended that
all terms “recognized” are validated by a subject-field expert before becoming avail-
able to the end users (cf. section 5.7).
In the modern world, especially with the advent of the Internet, there are a lot of
sources by which individuals may satisfy their information needs. In this respect, it
should never be forgotten that what distinguishes – or ought to distinguish – dic-
tionaries and other lexicographical works is not only that they are, par excellence,
consultation tools that provide quick and easy access to the data from which the
required information can be retrieved, but also that such data are supposed to be
reliable and trustworthy. Consequently, as long as artificial intelligence has yet to be
developed up to a level at which it can supersede human intelligence – and this may
never happen – the intervention of a lexicographer or subject-field expert, at least in
terms of validating the data automatically selected or recognized, will continue to
be a necessary precondition for reliability and trustworthiness. In section 5.4, we saw
a negative example of what could happen when a non-expert (Kilgarriff 2012) tried
to take in definitions by googling; it is easy to imagine the disastrous consequences
should a non-human system be asked to do the same.
Does this rather pessimistic vision invalidate the idea that lexicographical Rolls
Royces should automatically recreate and re-present data taken in from corpora and
the Internet in order to provide more dynamic and individualized solutions to the
users? Not necessarily, but it puts it into perspective and gives rise to four principles
or recommendations that could contribute to guaranteeing the quality of this ad-
vanced tool:
– The present-day or near-future vision should not be a completely automatic, but
rather a semi-automatic generation of dictionary articles, where central parts of
the articles are compiled or validated by lexicographers or subject-field experts,
and where the users in need of additional data are given the option to import
these data from external sources.
– Non-validated data should only be taken in and presented to individual users
whose critical sense, subject-field knowledge, and lexicographical culture place
them in a position to validate the respective data in a reasonable manner.
– The user should, in one way or another, be clearly informed about the non-
validated data and the sources from which these data have been imported. In
this respect, it should be remembered that data taken from a well-composed
and controlled corpus may have a greater degree of reliability than data taken
from the Internet.
– The user should be given the option to determine the amount of additional data
in order to meet his or her individual need and avoid information overload.
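The four recommendations above can be read as a set of presentation rules. The sketch below is one possible reading, not a prescribed design: validated data are always shown, non-validated data are shown only to users judged able to assess them, every non-validated item is labelled with its source, and the user sets an upper limit on imported items. All names and sample items are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataItem:
    text: str
    source: str       # e.g. "dictionary", "corpus", "internet"
    validated: bool   # checked by a lexicographer or subject-field expert


def present(items, user_can_validate, max_imported):
    """Apply the four recommendations to a list of candidate data items."""
    shown = []
    imported = 0
    for item in items:
        if item.validated:
            shown.append(item.text)   # validated core of the article
        elif user_can_validate and imported < max_imported:
            # Non-validated data: labelled, limited, and only for able users.
            shown.append(f"{item.text} [not validated; source: {item.source}]")
            imported += 1
    return shown


items = [
    DataItem("<validated definition>", "dictionary", True),
    DataItem("<corpus example 1>", "corpus", False),
    DataItem("<web example 2>", "internet", False),
]
print(present(items, user_can_validate=True, max_imported=1))
print(present(items, user_can_validate=False, max_imported=5))
```

In practice the judgement of whether a user is in a position to validate data would come from the user profile rather than from a single flag.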
With this in mind, the question remains as to which kind of data could or should be
imported and re-presented in online dictionaries. In principle, such data might be
any that could be addressed to the lemma or to already selected information, such
as:
– Collocations and other word combinations showing the syntactic properties of
the lemma in question.
– Example sentences illustrating the use of “any word, pattern, or linguistic fea-
ture”, including collocations and idioms.
– Contextual definitions which could enrich the definition already included.
– Background texts in order to meet additional cognitive needs related to the task
performed.
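The four recommendations above can be read as a simple filtering rule, which the following minimal sketch tries to illustrate. It is not taken from any existing dictionary system: the class names DataItem and UserProfile, the function select_items, and the source labels "editorial", "corpus" and "internet" are all assumptions made for the sake of the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataItem:
    kind: str        # hypothetical types: "collocation", "example", "definition", "background"
    text: str
    source: str      # hypothetical labels: "editorial", "corpus", "internet"
    validated: bool  # True if compiled or checked by a lexicographer or expert


@dataclass(frozen=True)
class UserProfile:
    can_validate: bool   # user is judged able to assess non-validated data
    max_additional: int  # upper limit on imported items, chosen by the user


def select_items(items, profile):
    """Return (item, label) pairs to be shown to one user."""
    validated = [i for i in items if i.validated]
    # Non-validated data are only offered to users able to assess them.
    extra = [i for i in items if not i.validated] if profile.can_validate else []
    # Assumption: corpus data are ranked ahead of data taken from the Internet.
    rank = {"corpus": 0, "internet": 1}
    extra.sort(key=lambda i: rank.get(i.source, 2))
    # The user-defined limit guards against information overload.
    extra = extra[:max(profile.max_additional, 0)]
    shown = [(i, "validated") for i in validated]
    # Every non-validated item carries an explicit label naming its source.
    shown += [(i, f"not validated, source: {i.source}") for i in extra]
    return shown
```

In this sketch the validated core of the article is always shown, whereas imported data appear only on request, in limited quantity, and with a visible indication of their origin.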
6.5 Online communication with the user
In the printed environment, communication between lexicographer and user was
mainly one-way, where sporadic user feedback could only be taken into considera-
tion in subsequent editions which frequently took several years to publish, if they
were published at all. By contrast, in the online environment traditional communi-
cation has increasingly been replaced by the possibility of having two-way commu-
nication, where users can comment almost instantaneously on existing data, sug-
gest modifications, or ask for additional data in order that their needs are met in a
more complete way. This, of course, also modifies the traditional role of lexicogra-
phers, whose job no longer ends with the publication of a given dictionary; today
they should be prepared to perform new tasks with regard to post-publication user
service and continuous updating. In order to guarantee this partial incorporation of
users into the production and perfection of the dictionary, it is important that the
dictionary interface should in some way allow users to communicate directly with
lexicographers, thereby benefiting the quality of the work and improving its
adaptation to their needs.
Apart from this open and direct communication, it is also extremely helpful for
lexicographers if the underlying system allows them to receive continuous, indirect
information about user behaviour, for instance about frustrated “look-ups” for
specific terms or collocations. This type of information may not only assist the
lexicographers in perfecting the already finished product but may also feed
directly into their work during the phase where the dictionary is still under
construction. The study of log files reflecting user behaviour in some online dictionaries
has shown that only a certain percentage of lemmata are ever consulted by the us-
ers. For example, only a third of the approximately 110,000 lemmata in Den danske
netordbog (Danish Internet Dictionary) have ever been searched for, despite the fact
that these have been available online for several years and that over 18 million con-
sultations have been made, cf. Bergenholtz & Norddahl (2012: 220). The non-
consulted words may be old or new, long or short, difficult or easy, and belong to all
parts of speech. So far no systematic explanation has been found for this strange
phenomenon, which also seems to be repeated – with other figures – in specialised
online dictionaries produced by the Aarhus-based Centre for Lexicography.
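The kind of log-file analysis described above can be sketched in a few lines. The example below is only an illustration under simplifying assumptions: the log is taken to be a plain list of search strings, a look-up counts as successful only if the normalised search string is identical to a lemma, and the function name analyse_log is hypothetical. Real log files would also require handling of inflected forms, misspellings and multi-word queries.

```python
from collections import Counter


def analyse_log(lemma_list, queries):
    """Return the share of lemmata consulted, the lemmata never
    consulted, and a count of frustrated look-ups."""
    # Assumption: lemmata and queries are compared in lower case, exact match.
    lemmata = {lemma.strip().lower() for lemma in lemma_list}
    consulted = set()
    frustrated = Counter()
    for query in queries:
        term = query.strip().lower()
        if term in lemmata:
            consulted.add(term)
        elif term:
            # The search string does not correspond to any lemma.
            frustrated[term] += 1
    share = len(consulted) / len(lemmata) if lemmata else 0.0
    never_consulted = lemmata - consulted
    return share, never_consulted, frustrated
```

Applied to a real dictionary, the first value would correspond to a figure such as the "third" reported for Den danske netordbog, while the counter would list the search strings that most often lead to a frustrated consultation.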
If the above tendency were confirmed in other projects, editorial policy would
have to be adapted correspondingly. Such a policy could, among other things, in-
clude the online publication of a new dictionary when, for example, only 30 per cent
of all its expected lemmata, the “core” ones, have been completed. The remaining
lemmata could then be incorporated gradually according to the pre-established
plan, while the still unpublished terms and words subject to frustrated consulta-
tions – and relevant to the subject field and functions of the dictionary – could be
finished and included as soon as possible after the frustrated consultation, e.g. in a
period of 24 hours. The introduction of such working methods could benefit all par-
ties: users could see their needs fulfilled more quickly and more accurately than
before; publishing houses could start selling their products much sooner than oth-
erwise; and lexicographers could receive feedback and assistance to improve the
quality of their work and adapt it more successfully to the specific needs of their
target users.
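The editorial policy outlined above amounts to a rule for reordering the work plan, which can be sketched as follows. The sketch rests on several assumptions that go beyond the text: the function name schedule, the representation of the plan as a mapping from lemma to planned completion date, and the relevance check is_relevant, which in practice would be a lexicographer's decision rather than a function.

```python
from datetime import datetime, timedelta


def schedule(planned, frustrated, is_relevant, window=timedelta(hours=24)):
    """Return (lemma, due date) pairs ordered by due date.

    planned:     lemma -> completion date in the pre-established plan
    frustrated:  lemma or search string -> time of the frustrated look-up
    is_relevant: callable deciding whether a term belongs in the dictionary
    """
    due = dict(planned)
    for term, when in frustrated.items():
        if not is_relevant(term):
            continue  # outside the subject field or functions of the dictionary
        deadline = when + window
        # A frustrated look-up can only bring a due date forward.
        due[term] = min(due.get(term, deadline), deadline)
    return sorted(due.items(), key=lambda pair: (pair[1], pair[0]))
```

The 24-hour window is the example period mentioned in the text; it is exposed as a parameter because an actual editorial team would have to set it according to its own capacity.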

6.6 Shortening the access route
As has already been mentioned in previous chapters, the time factor in terms of a
quick overall consultation process should also be regarded as an important criterion
for quality in specialised online dictionaries. Saving time is important for many
users of these works, especially those who have to consult dictionaries as part of
their work. We have already discussed how a user-friendly article design with as few
data as possible may contribute to reducing the time required to find the specific
data and retrieve the information sought. The time that elapses from the moment
the consultation starts until the data appear on the screen is also relevant here. If
this can be reduced, several seconds can be saved in each consultation.
In order to reduce this part of the overall consultation time as well, various
techniques have been developed in recent years. These techniques have, among
other things, been embodied in applications connected to an e-dictionary. Once
such an application has been downloaded and installed on the computer, a user
who is reading, writing, translating or revising electronic texts has simply to point to
the problematic word or word combination in order to get assistance. The applica-
tion will then – in a pop-up window – provide a definition, a translation equivalent,
the correct spelling or other relevant grammatical data. The content shown in the
window can be adapted to the user’s specific activity, for instance, text production
or translation. When this activity has been indicated, the user need only click on the
required data in the pop-up window, for instance, data concerning an inflected
form, and this will immediately be reproduced in the text with which they are work-
ing, thereby saving the time required to re-write it in the text.
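The behaviour of such an application can be illustrated with a minimal sketch. It does not describe any existing product: the mapping POPUP_FIELDS between activities and data types, the field names of the dictionary entry, and the two function names are assumptions chosen only to make the principle concrete.

```python
# Hypothetical mapping from the user's activity to the data types shown.
POPUP_FIELDS = {
    "reception": ("definition",),
    "production": ("spelling", "inflection", "collocations"),
    "translation": ("equivalent", "inflection"),
}


def popup_content(entry, activity):
    """Select the data from a dictionary entry that suit the indicated activity."""
    # Assumption: an unknown activity falls back to showing the definition.
    fields = POPUP_FIELDS.get(activity, ("definition",))
    return {field: entry[field] for field in fields if field in entry}


def insert_selection(text, start, end, chosen):
    """Replace the span text[start:end] with the data clicked in the pop-up."""
    return text[:start] + chosen + text[end:]
```

The second function corresponds to the final step described above, in which the data selected in the pop-up window are reproduced directly in the user's text, so that nothing has to be typed again.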
Bothma & Prinsloo (2013) have analysed one such application used to assist text
reception in Kindle e-books. The two South African researchers show how this
application can still be refined, and propose a number of solutions in this regard. If a
user works with this or a similar application, then he or she no longer
has to activate the online dictionary, write the relevant word or word combination
in the search field and wait for the required data to appear on the screen. All these
intermediate phases can now be dispensed with.
It goes without saying that the technique can only be used in connection with
electronic texts. When it is used in activities related to these, it represents a consid-
erable reduction in overall lexicographical consultation time. For some categories of
users, this will also mean a measurable reduction of work costs. Although the tech-
nique still has to be refined, as pointed out by Bothma & Prinsloo (2013), there is
little doubt that it will gain a footing in online lexicography in the next few years.
Designers of specialised online dictionaries should, therefore, seriously consider
incorporating this technique and enabling applications such as those mentioned to
be downloaded and used in connection with their products.
