Fall 2021-2022
Submitted by
Drashti Patel(19BCE0602)
Prince Kumar(19BCI0002)
B. Tech
in
Abstract
In the present century, life moves fast and saving time matters to everyone, so we rely on
modern technology to save time and to perform tasks efficiently. In this regard, a word
predictor is a small contribution that increases efficiency.
Word predictors are used in messaging apps like WhatsApp, web search engines, word processors,
and command-line interpreters, among other things. The original goal of word prediction
software was to help people with physical disabilities increase their typing speed and
reduce the number of keystrokes required to produce a word or sentence. With the same aim,
we created our own word prediction software based on the Trie data structure, which
significantly improves the user's productivity by at least 10%. Word completion, often
known as autocomplete, is a function that guesses the rest of a word as the user types it. In most
graphical user interfaces, users can accept a prediction by pressing the tab key, or press the
down arrow key to choose another word from the list of predicted words.
Word predictors are useful in a variety of situations, including texting and search engines.
We used the Trie data structure to build our word prediction algorithm. Our application
uses a pre-stored collection of words to predict the word the user has in mind.
Keywords
Introduction
Autocomplete speeds up human-computer interaction when it correctly guesses the word a user
intends to type after only a few letters have been entered into a text input field. It works best in
domains with a restricted number of possible words (as in command-line interpreters), when
some words are much more common than others (as when addressing an email), or when writing
structured and predictable text (as in source code editors). Many autocomplete algorithms,
including those based on machine learning and natural language processing, learn new words
after the user has written them a few times and can suggest alternatives based on the learned
habits of the individual user, which leads to quite a personalized predictor for every user.
Autocomplete, or word completion, works as follows: when the writer types the first letter
or letters of a word, the program predicts one or more possible words as choices. If the intended
word is in the list, the writer can select it, for instance by using the number keys.
If the intended word is not predicted, the writer must type the word's next letter. The
choices are then updated so that the words offered begin with the letters typed so far.
When the intended word appears, the writer selects it and it is inserted into the text.
In another form of word prediction, the words most likely to follow the one just written are
predicted from recently used word pairs. Word prediction is based on language modelling,
which calculates the words that are most likely to occur within a given vocabulary. Basic word
prediction on AAC (augmentative and alternative communication) devices is typically combined
with a recency model, in which words that the AAC user uses more frequently are more likely to be
predicted. Word prediction software often also lets users enter their own words into the
prediction dictionaries, either directly or by "learning" words that have been written.
Literature survey
To understand the implementation of this project, we must first understand the Trie data
structure that we used to build it.
The word "retrieval" inspired the term "Trie." It is a sorted, tree-based data structure for storing a
collection of strings. Each node has as many child pointers as there are characters in the alphabet.
To search the dictionary for a word, the trie is followed along the word's prefix. Because our
strings are made up of letters ranging from 'a' to 'z', each node of the trie has a maximum of 26
pointers.
The digital tree, or prefix tree, is another name for Trie. The key with which a node is associated
is determined by its position in the Trie.
Now let us see some basic properties of the trie data structure:
• The root node of the trie is always a null node.
• The child nodes of each node are sorted alphabetically.
• As noted earlier, each node can have a maximum of 26 child nodes.
• Every node apart from the root node stores one letter of the alphabet.
A trie has the same three basic operations as any other tree-based data structure: insertion of a
node, searching for a node, and deletion of a node. Searching for a node is at the heart of our
project. However, the insertion process must be understood first in order to follow the
search operation. This part therefore covers both insertion and searching of a node.
The first step is to create a new node in the trie. It is important to understand the following
points before we begin the implementation:
● Every letter of the input key (word) is inserted as a separate Trie_node.
Note that the children of a node point to the next level of Trie nodes.
● Each character of the key acts as an index into the children of a node.
● If the current node already has a reference to the current letter, set the current node to that
referenced node. Otherwise, create a new node, set its letter to the current letter, and set the
current node to this new node.
● The length of the key determines the depth of the trie.
Each node in this data structure has two main components.
• map<character, TrieNode>: used to establish the parent-child relationship.
• Boolean endOfWord: used to mark the end of a string when a set of
characters is added to the trie.
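The two components and the insertion steps above can be sketched in Python as follows. This is a minimal illustration, not the project's trie.js code: the names `TrieNode`, `children`, `end_of_word` and `insert` are chosen here for the example, and lower-casing the input is an assumption.

```python
class TrieNode:
    """One trie node: a child map plus an end-of-word flag."""

    def __init__(self):
        self.children = {}        # letter -> TrieNode (parent-child links)
        self.end_of_word = False  # True if a stored word ends at this node


def insert(root, word):
    """Insert `word` one letter at a time, creating nodes only when missing."""
    node = root
    for letter in word.lower():   # assumption: matching is case-insensitive
        if letter not in node.children:
            node.children[letter] = TrieNode()
        node = node.children[letter]
    node.end_of_word = True       # mark that a complete word ends here


root = TrieNode()                 # the root stores no letter
for w in ["Hell", "Hello", "Help"]:
    insert(root, w)
```

The three words share the nodes for "h", "e" and "l", so the common prefix is stored only once.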
Next, let us look at searching for a node, which is very similar to the insert operation: the
search follows the letters of the word from the root and stops as soon as a letter has no
matching child node.
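A search over the same kind of node can be sketched as below. The function name `search` and the case-insensitive matching are assumptions made for this example; the project's own implementation is not reproduced here.

```python
class TrieNode:
    def __init__(self):
        self.children = {}        # letter -> TrieNode
        self.end_of_word = False


def insert(root, word):
    node = root
    for letter in word.lower():
        if letter not in node.children:
            node.children[letter] = TrieNode()
        node = node.children[letter]
    node.end_of_word = True


def search(root, word):
    """Return True only if `word` was inserted as a complete word."""
    node = root
    for letter in word.lower():
        node = node.children.get(letter)
        if node is None:          # no child for this letter: stop early
            return False
    return node.end_of_word       # a bare prefix is not a stored word


root = TrieNode()
for w in ["Hell", "Hello", "Help"]:
    insert(root, w)

print(search(root, "Help"))   # True
print(search(root, "Hel"))    # False: "hel" is only a prefix
```

Like insertion, the search visits at most one node per letter of the word.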
Now let us come to what actually happens when we give input to this trie-based autocomplete
program. Suppose our dictionary file contains "Hell", "Hello", "Help", "Helps" and "Hellish" as
the five words that start with "He".
When we type "he", the program suggests all of these words, but when we type "helping", the
search stops at "p", because that node has no child for the next letter "i".
Shown below is how the program tries to predict the words. Here is the part of the trie that
stores these words:
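The behaviour described for "he" and "helping" can be reproduced with a short Python sketch. The function name `suggest` and the lower-casing of words are assumptions for illustration only.

```python
class TrieNode:
    def __init__(self):
        self.children = {}        # letter -> TrieNode
        self.end_of_word = False


def insert(root, word):
    node = root
    for letter in word.lower():
        if letter not in node.children:
            node.children[letter] = TrieNode()
        node = node.children[letter]
    node.end_of_word = True


def suggest(root, prefix):
    """Return every stored word that starts with `prefix`, alphabetically."""
    node = root
    for letter in prefix.lower():
        node = node.children.get(letter)
        if node is None:
            return []             # the prefix leaves the trie: nothing to suggest
    results = []

    def collect(current, so_far):
        if current.end_of_word:
            results.append(so_far)
        for letter in sorted(current.children):
            collect(current.children[letter], so_far + letter)

    collect(node, prefix.lower())
    return results


root = TrieNode()
for w in ["Hell", "Hello", "Help", "Helps", "Hellish"]:
    insert(root, w)

print(suggest(root, "he"))       # ['hell', 'hellish', 'hello', 'help', 'helps']
print(suggest(root, "helping"))  # []
```

The prefix is walked once, and only the subtree below its last node is explored, so words that do not share the prefix are never visited.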
Next, let us see how the word "Help" is searched.
"Help" is processed by starting from the root node and comparing each node while moving down.
The traversal can be viewed as H->E->L->P, and the search terminates because the end of the
string is reached.
If the string "Helpi" is given as input, the search terminates at "p". The program gives no
result and asks whether we wish to add this word to the dictionary, so the user has
the ability to add new words to the pre-existing dictionary.
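The two traversals can be traced with the sketch below. For brevity it uses a nested-dictionary variant of the trie, with the key "$" as an assumed end-of-word marker; `build_trie` and `trace` are hypothetical names.

```python
def build_trie(words):
    """Build a nested-dict trie; the key "$" marks the end of a word."""
    root = {}
    for word in words:
        node = root
        for letter in word.lower():
            node = node.setdefault(letter, {})
        node["$"] = True
    return root


def trace(root, text):
    """Follow `text` from the root; return (letters matched, is a stored word)."""
    node, path = root, []
    for letter in text.lower():
        if letter not in node:
            return path, False    # stopped early: no child for this letter
        node = node[letter]
        path.append(letter)
    return path, "$" in node


trie = build_trie(["Hell", "Hello", "Help", "Helps", "Hellish"])
print(trace(trie, "Help"))   # (['h', 'e', 'l', 'p'], True)
print(trace(trie, "Helpi"))  # (['h', 'e', 'l', 'p'], False)
```

Both inputs follow the same path; only the second result tells the program that the word is missing and could be offered for addition.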
Implementation
This is the JavaScript file "trie.js", which searches for words using the trie data
structure.
This file has the word dictionary that is used to predict words by the trie.js program. This
dictionary has over 50,000 words.
Generating test cases using http://generatortestcase.herokuapp.com/
The snapshot below shows all the words predicted when the user gives "app" as the input.
The user can click on the word they require.
When the user gives "bac" as input:
Our previous model was a simple C++ program with no graphical user interface; it used only
the output screen for interaction between user and program.
This is the output when autocomplete was selected and "app" was given as the input.
If a word that is not present in the dictionary was typed, the system asked whether the user
wished to add the word to the dictionary.
Conclusion
Word predictors have applications in messaging applications like WhatsApp, web search engines,
word processors, command-line interpreters, etc. The original purpose of word prediction software
was to help people with physical disabilities increase their typing speed,[1] as well as to help them
decrease the number of keystrokes needed to complete a word or a sentence. On this
front, we developed our own word predictor using the trie data structure, which
increases the efficiency of the user.
A trie also compares well with a hash table, which is another popular technique used
for word prediction. We can insert and find strings in O(L) time, where L represents
the length of a single word. A hash table also needs O(L) time to hash a word of length L,
but with a trie we do not need to compute any hash function, and no collision handling is
required (as in open addressing and separate chaining). Another advantage of a trie is that
we can easily print all words in alphabetical order, which is not easily possible with hashing.
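The alphabetical-order property can be illustrated with a short Python sketch. It uses a nested-dictionary trie with "$" as an assumed end-of-word marker; the names `build_trie` and `words_in_order` are hypothetical.

```python
def build_trie(words):
    root = {}
    for word in words:
        node = root
        for letter in word:
            node = node.setdefault(letter, {})
        node["$"] = True          # "$" is an assumed end-of-word marker
    return root


def words_in_order(node, prefix=""):
    """Yield stored words alphabetically by visiting children in sorted order."""
    if "$" in node:
        yield prefix
    for letter in sorted(k for k in node if k != "$"):
        yield from words_in_order(node[letter], prefix + letter)


words = ["help", "hell", "apple", "hello", "app"]
trie = build_trie(words)
print(list(words_in_order(trie)))  # ['app', 'apple', 'hell', 'hello', 'help']
```

The order falls out of the traversal itself, whereas the keys of a hash table would have to be collected and sorted as a separate step.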
The graph comparing time complexities shows that the trie is an apt data structure for
any word prediction or autocomplete system.
References
[1] Haque, S., Bansal, A., Wu, L. and McMillan, C., 2021, March. Action word prediction for
neural source code summarization. In 2021 IEEE International Conference on Software
Analysis, Evolution and Reengineering (SANER) (pp. 330-341). IEEE.
[2] Jordan, M., Neto, G.N.N., Brito Jr, A. and Nohama, P., 2020. Virtual keyboard with the
prediction of words for children with cerebral palsy. Computer methods and programs in
biomedicine, 192, p.105402.
[3] Siddiqui, M.F. and Hassan, M., 2018. Effective word prediction in Urdu language using
stochastic model. Sukkur IBA Journal of Computing and Mathematical Sciences, 2(2), pp.38-46.
[4] Saboor, A., Benda, M., Gembler, F. and Volosyak, I., 2019, June. Word Prediction Support
Model for SSVEP-Based BCI Web Speller. In International Work-Conference on Artificial
Neural Networks (pp. 430-441). Springer, Cham.
[5] Weng, R., Huang, S., Zheng, Z., Dai, X. and Chen, J., 2017. Neural machine translation with
word predictions. arXiv preprint arXiv:1708.01771.
[6] Hard, A., Rao, K., Mathews, R., Ramaswamy, S., Beaufays, F., Augenstein, S., Eichner, H.,
Kiddon, C. and Ramage, R., 2018. Federated learning for mobile keyboard prediction.
arXiv preprint arXiv:1811.03604.
[7] Ambulgekar, S., Malewadikar, S., Garande, R. and Joshi, B., 2021. Next Words Prediction
Using Recurrent Neural Networks. In ITM Web of Conferences (Vol. 40, p. 03034). EDP
Sciences.
[8] Prajapati, G.L. and Saha, R., 2019. REEDS: Relevance and enhanced entropy based
Dempster Shafer approach for next word prediction using language model. Journal of
Computational Science, 35, pp.1-11.
[9] van Alphen, P., Brouwer, S., Davids, N., Dijkstra, E. and Fikkert, P., 2021. Word
Recognition and Word Prediction in Preschoolers With (a Suspicion of) a Developmental
Language Disorder: Evidence From Eye Tracking. Journal of Speech, Language, and Hearing
Research, 64(6), pp.2005-2021.
[10] Goulart, H.X., Tosi, M.D., Gonçalves, D.S., Maia, R.F. and Wachs-Lopes, G.A., 2018.
Hybrid model for word prediction using naive bayes and latent information. arXiv preprint
arXiv:1803.00985.
1. Authors and year: Sakib Haque*, Aakash Bansal*, Lingfei Wu† and Collin McMillan*
(*Dept. of Computer Science, University of Notre Dame, Notre Dame, IN, USA), 7 Jan 2021
Title: Action Word Prediction for Neural Source Code Summarization
Methodology:
• attendgru: This is the only approach that does not explicitly include the AST, so this baseline is not
applicable to the challenge experiments.
• ast-attendgru: This baseline represents approaches that flatten the AST into a sequence, then use a
seq2seq-like approach to create the summary.
• ast-attendgru-fc: They proposed an extension to code summarization tools that includes "file context",
which they define as all the other functions in the same file as the function being summarized.
• graph2seq: They use a graph neural network (GNN) to model the AST. Their paper focuses on code
generation, but suggests that it is possible to use the GNN encoder for code summarization as well.
• code2seq: They pursue path-based encoding solutions that, in short, randomly select 100-200 paths in
the AST and use an attention mechanism to attend to different paths for different words. We use their
approach as a representative of path-based solutions.
Merits: Action word prediction is treated as a complementary problem to source code summarization. The
authors argue that action word prediction is a key component of code summarization, because 1) high
quality summaries tend to use an action word as the first word in the summary, and 2) the first word of a
prediction tends to have a high impact on the subsequent predictions from a model. A key point is that
source code summarization tools very often need to predict the action word anyway, so a special emphasis
on this problem is justified by that word's importance.
2. Authors and year: Neto, Alceu Brito Jr, Percy Nohama, 17 February 2020
Methodology:
• Grammatical features of the word are loaded during typing time. The probability of classes occupying the
next position is set according to their sequence position.
• However, because the partial deletion of a word may occur, the grammatical features might be retrieved.
This retrieval requires storing the possible paths from that sequence position onwards.
Merits: Both the click savings and the accuracy in predicting the desired words in a text were considered
satisfactory. The whole prediction system is available on the web, and users have unrestricted and free
access to it. The linguistic limitation caused by the use of a corpus is addressed by opening the source code.

3. Authors and year: Muhammad Hassan, Muhammad Saeed, Ali Nawaz, Kamran Ahsan, Sehar Jabeen,
Farhan Ahmed Siddiqui, Khawar Islam, 2nd July 2018
Title: Effective Word Prediction in Urdu Language Using Stochastic Model
Methodology:
• The methodology is clear and straightforward: the intelligent words were built with artificial
intelligence, so that as the user types a character, a list of suggestions to complete the word is
available to the user.
• The study ended up with an optimized version of the algorithm that would fulfill the need of output with
the best utilization and artificial intelligence by working behind the screen as well, to make the
intelligence even better and give the user of the front-end program a more professional experience with
the application.
Merits: A pioneering piece of research is the Auto Suggestor and Predictor of word (Get Your Word Before
Your Typing to Increase The Typing Speed). (N-1) logic has been used for more artificial intelligence,
different algorithms, and searching approaches to increase the typing speed by providing better auto
suggestion. Although the research and experiments did not contain a huge amount of data, the outcomes
met state-of-the-art work. It has been noticed that the word dictionary's quality and quantity had a
significant impact on predicting the list of suggestions and its precision.

4. Authors and year: Abdul Saboor, Mihaly Benda, Felix Gembler, and Ivan Volosyak, 16 May 2019
Title: Word Prediction Support Model for SSVEP-Based BCI Web Speller
Methodology:
• The computational linguistic model called n-gram was used for the word prediction based on the text
database.
• The n-gram model consists of a contiguous sequence of n items from a given text. Thus, an n-gram is a
set of co-occurring words within a given window, and n-grams can be classified as uni-gram (size 1),
bi-gram (size 2), tri-gram (size 3), etc.
• In equation (2), m is the gram number (m = 1 for uni-gram, m = 2 for bi-gram, and so on), and w is the
number of words in a sentence.
• The n-gram model predicts x_i based on x_{i-(n-1)}, ..., x_{i-1} (3). In terms of probability, we can
represent it as P(x_i | x_{i-(n-1)}, ..., x_{i-1}).
Merits: In this paper the proposed word prediction support model can significantly increase the ITR and
accuracy of the SSVEP-based BCI web speller. The CSS-based animation can provide better stimulation and
straightforward configuration options. During the execution of the speller, AJAX calls were placed to call
the Java Servlet and Java Beans, which provided the JSON format glossary from the database. These calls
proved to produce in-time word prediction. Therefore, the online speller proposed in this study will be
extended to a web-based email interface, or to writing a blog, which require a user to type text within
the web browser.

5. Authors and year: Dai and Jiajun Chen, 5 Aug 2017
Methodology:
should at least contain information of each word in the target sentence.
• Word Predictions for Decoder's Hidden States: Similar intuition is also applied for the decoder. Because
the hidden states of the decoder are responsible for the translation of target words, they should be able
to predict the target words as well.
However, due to the large number of parameters and relatively small training data set, the end-to-end
learning of an NMT model may not be able to learn the best solution. We argue that at least part of the
problem is caused by the long error backpropagation pipeline of the recurrent structures in multiple time
steps, which provides no direct control of the information carried by the hidden states in both the
encoder and decoder.
CSE3001 Software Engineering
REVIEW 1 DOCUMENT
WORD PREDICTION
USING TRIE DATA
STRUCTURE
PREPARED BY
In this busy world, no one has time to spare, and technology is developed every day to increase
efficiency. On this front, a word predictor is a small step that increases our efficiency many times over.
Word predictors have applications in areas such as texting and search engines. To develop our word
predictor program, we used the Trie data structure. Our program uses a stored file of words to predict the
words the user may have in mind.
other model because in this project we want to know the needs and requirements of the user at each and
every step, and want to make the software more efficient, time-saving and user-friendly.
2. INTRODUCTION:
Autocomplete speeds up human-computer interactions when it correctly predicts the word a user intends
to enter after only a few characters have been typed into a text input field. It works best in domains with a
limited number of possible words (such as in command line interpreters), when some words are much
more common (such as when addressing an email), or writing structured and predictable text (as in
source code editors). Many autocomplete algorithms learn new words after the user has written them a
few times, and can suggest alternatives based on the learned habits of the individual user.
Autocomplete or word completion works so that when the writer writes the first letter or letters of a word,
the program predicts one or more possible words as choices. If the word the writer intends to write is
included in the list, it can be selected, for example by using the number keys. If the word that the user
wants is not predicted, the writer must enter the next letter of the word. At this time, the word choices
are altered so that the words provided begin with the same letters as those that have been entered. When
the word that the user wants appears, it is selected and inserted into the text.
In another form of word prediction, words most likely to follow the just written one are predicted,
based on recent word pairs used. Word prediction uses language modeling, where the words that
are most likely to occur within a set vocabulary are calculated. Along with language modeling, basic
word prediction on AAC devices is often coupled with a recency model, where words that are used
more frequently by the AAC user are more likely to be predicted. Word prediction software often
also allows the user to enter their own words into the word prediction dictionaries either directly, or
by "learning" words that have been written.
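Prediction from recent word pairs can be illustrated with a small frequency-count sketch in Python. This is not part of the trie-based project: `build_bigram_counts`, `predict_next` and the sample text are hypothetical and only show the idea of ranking the words that most often followed the one just written.

```python
from collections import Counter, defaultdict


def build_bigram_counts(text):
    """Count how often each word follows each other word in `text`."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def predict_next(following, word, k=3):
    """Return up to k words that most often followed `word`."""
    return [w for w, _ in following[word.lower()].most_common(k)]


# Hypothetical training text, used only for illustration.
sample = "i am going home . i am happy . i am going out"
counts = build_bigram_counts(sample)
print(predict_next(counts, "am"))  # ['going', 'happy']
```

Updating the counts as the user types would make frequently used pairs rise in the ranking, which is the effect the passage above attributes to learning from the individual user.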
for
WORD PREDICTION
USING TRIE DATA STRUCTURE
Version 1.0
Prepared by
Prince Kumar[19BCI0002]
code implementations and testing.
Table of Contents
Revision History
1. Introduction
  1.1 Purpose
  1.2 Document Conventions
  1.3 Intended Audience and Reading Suggestions
  1.4 Product Scope
  1.5 References
2. Overall Description
  2.1 Product Perspective
  2.2 Product Functions
  2.3 User Classes and Characteristics
  2.4 Operating Environment
  2.5 Design and Implementation Constraints
  2.6 User Documentation
  2.7 Assumptions and Dependencies
3. External Interface Requirements
  3.1 User Interfaces
  3.2 Hardware Interfaces
  3.3 Software Interfaces
  3.4 Communications Interfaces
4. System Features
  4.1 System Feature 1
  4.2 System Feature 2 (and so on)
5. Other Nonfunctional Requirements
  5.1 Performance Requirements
  5.2 Safety Requirements
  5.3 Security Requirements
  5.4 Software Quality Attributes
  5.5 Business Rules
6. Other Requirements
Appendix A: Glossary
Appendix B: Analysis Models
Appendix C: To Be Determined List
Revision History
Name: Word prediction using trie data structure
Date: 08-09-21
Reason For Changes: NA
Version: Version 1
1. Introduction
1.1 Purpose
This document specifies the requirements of software that predicts a word based on the letters typed
so far. We will use these requirements to proceed with implementing our idea, and they tell us
which resources we will need to implement word prediction. For example, before
starting work on the project, we should have knowledge of a programming language, so that
becomes one of our prerequisites along with many others.
While working on the project, we will need many resources, which we can have ready in
time. This will prevent a last-minute rush and the risk of missing the deadline. For
example, we will need a dictionary file throughout the coding part, so we can get
a dictionary file ready before starting to code. Even after the project is completed, we
have to make sure that new words are added to the dictionary file, and hence we have to keep
updating it.
Word Prediction allows the user to complete a word based on the input they have given so far. They can
even add a new word if it does not already exist.
Any reader can understand the project by reading the reference papers provided in section 1.5 and then
following the order of this SRS document.
1.5 References
[1] Trie data structure
https://www.javatpoint.com/trie-data-structure
[3] An improved Bayesian TRIE based model for SMS text normalization
https://www.researchgate.net/publication/343441896_An_improved_Bayesian_TRIE_based_model_for_SMS_text_normalization
2. Overall Description
In another form of word prediction, words most likely to follow the just written one are predicted,
based on recent word pairs used. Word prediction uses language modeling, where the words that
are most likely to occur within a set vocabulary are calculated. Along with language modeling, basic
word prediction on AAC devices is often coupled with a recency model, where words that are used
more frequently by the AAC user are more likely to be predicted. Word prediction software often
also allows the user to enter their own words into the word prediction dictionaries either directly,
or by "learning" words that have been written.
To understand the dynamic tree data structure used in developing the program.
To understand the 'trie' data structure used in the program.
To construct a strong and efficient algorithm for the program, which is editable and can
later be used as a module in a bigger software system.
To develop a real-time program that is efficient, processes input quickly and has an industrial
application.
Gantt chart:
PERT chart:
WBS:
Timeline Chart:
a. User Interfaces
Users can run the software on any electronic device, such as a laptop, mobile phone or desktop computer.
The device's input hardware supplies the text; when the user presses Enter, the software displays on
the screen the words that match the letters entered so far.
b. Hardware Interfaces
The laptop or mobile device on which the user runs the program is the only hardware interface required.
c. Software Interfaces
Our program uses an external dictionary file, which is accessed throughout the program's execution.
In the file, words are stored in title case with no space between them: each capital letter marks the
beginning of a new word, and the next capital letter marks the end of that word.
d. Communications Interfaces
Will be updated in future versions.
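The dictionary layout described under Software Interfaces (title-case words written back to back) could be split into individual words along the following lines. This is a minimal sketch, not the project's loader: the function name splitTitleCase is hypothetical, and it assumes the file contains only ASCII letters.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: splits text such as "AppleBananaCherry" into separate
// words, starting a new word at every capital letter.
std::vector<std::string> splitTitleCase(const std::string &text)
{
    std::vector<std::string> words;
    std::string current;
    for (unsigned char ch : text)
    {
        // A capital letter closes the word collected so far.
        if (std::isupper(ch) && !current.empty())
        {
            words.push_back(current);
            current.clear();
        }
        // Non-letters (for example line breaks) are skipped.
        if (std::isalpha(ch))
            current.push_back(static_cast<char>(ch));
    }
    if (!current.empty())
        words.push_back(current);
    return words;
}
```

Each returned word could then be lower-cased and inserted into the trie.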
4. System Features
4.1 System Feature 1: our basic text predictor.
4.1.1 Description and Priority
Autocomplete, or word completion, works as follows: when the writer types the first
letter or letters of a word, the program predicts one or more possible words as
choices. If the intended word is in the list, the writer can select it, for
example by using the number keys. If the intended word is not predicted,
the writer must enter the next letter of the word. The choices are then
updated so that every word offered begins with the letters typed so far.
When the intended word appears, the writer selects it and it is
inserted into the text.
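The narrowing step described above, in which the choices shrink as each new letter is typed, can be illustrated with a plain list of candidates. This sketch does not use the trie; the function name narrowChoices is hypothetical and only shows the filtering rule.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: keeps only the candidates that begin with the letters
// typed so far.
std::vector<std::string> narrowChoices(const std::vector<std::string> &candidates,
                                       const std::string &typed)
{
    std::vector<std::string> kept;
    for (const std::string &word : candidates)
    {
        // compare() checks the first typed.size() characters of word.
        if (word.compare(0, typed.size(), typed) == 0)
            kept.push_back(word);
    }
    return kept;
}
```

Calling the function again after every keystroke, each time on the previous result, reproduces the behaviour in which the list is updated until the intended word appears.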
4.1.2 Stimulus/Response Sequences
This section will be updated in future versions.
4.1.3 Functional Requirements
The program itself is the main functional requirement. It will be implemented in a
programming language such as C or C++.
The system shall access the dictionary file throughout execution. The user gives input in
the form of a partial or complete word, and the words matching the letters typed so far
are displayed on the screen.
If there are no matching words, the system asks whether the user wants to add the
word to the dictionary.
After each word prediction cycle completes, the system asks whether the user wants to enter
more words. If yes, another cycle starts; otherwise the system halts.
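If an added word should also survive a restart, it could be appended to the dictionary file. The following is a minimal sketch under two assumptions that are not stated in the requirements: the file stores whitespace-separated words, and the helper name appendToDictionary is hypothetical.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: appends word on its own line at the end of the file.
// Returns false if the file could not be opened or written.
bool appendToDictionary(const std::string &path, const std::string &word)
{
    std::ofstream out(path, std::ios::app); // append mode keeps existing words
    if (!out)
        return false;
    out << word << '\n';
    return static_cast<bool>(out);
}
```

Checking the return value lets the program tell the user when the word could not be saved.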
5. Other Nonfunctional Requirements
➢ The system shall not use users' essential data for any reason, whether justified or
unjustified.
Reliability:
• The system cannot be relied upon completely, but we aim to attain maximum
reliability.
• Reliability also improves as we work towards maximum accuracy.
• Dictionary files shall be kept correct and up to date to improve reliability.
Accuracy
• The information provided in the dictionary files and by the user should be correct.
• Minimize the errors.
• All operations will be done correctly to increase the level of accuracy.
5.5 Business Rules
This field is not necessary for this project.
6. Other Requirements
Appendix A: Glossary
Trie data structure: a specialised tree data structure in which each node holds one character, so
that every path from the root spells a prefix of a stored word.
A data flow diagram (DFD) maps out the flow of information for any process
or system. It uses defined symbols like rectangles, circles and arrows, plus
short text labels, to show data inputs, outputs, storage points and the routes
between each destination.
In our project, the user enters a string. If the string is a prefix of any word in our
dictionary file, the program returns the list of matching words. If not, it asks
the user whether they want to add the word. In either case, the user is then asked whether
they want to run autocomplete again or quit.
5. SEQUENCE DIAGRAM
6. COLLABORATION DIAGRAM
A collaboration diagram, also known as a communication diagram, is an
illustration of the relationships and interactions among software objects in the
Unified Modelling Language (UML). These diagrams can be used to portray the
dynamic behaviour of a particular use case and define the role of each object.
In our project, there are several points at which the user decides what should be
done with objects. The user enters a string, which is checked against the dictionary file
for matching words, and the list of matching strings is displayed. If there is no
match, the user can add the string to the file.
REVIEW 2
DOCUMENT
for
WORD PREDICTION
USING
TRIE DATA STRUCTURE
Prepared by
Prince Kumar[19BCI0002]
CONTENTS
1. INTRODUCTION
1.1. PURPOSE OF THIS DOCUMENT
1.2. SCOPE OF THE DEVELOPMENT PROJECT
1.3. DEFINITIONS, ACRONYMS, AND ABBREVIATIONS
1.4. REFERENCES
1.5. OVERVIEW OF THE PROJECT
1. Introduction
The objective of this software design specification (SDS) is to ensure that the final
software product meets the requirements of the end customer, i.e., it functions
as expected, is reliable, is easy to use, does not demand inordinate effort to train staff
in its use, and so on. Specifically, the software design specification describes the
software components and sub-systems to be provided as part of the product.
Word predictors are used in messaging applications such as WhatsApp, web search
engines, word processors, command-line interpreters and so on. The original purpose of word
prediction software was to help people with physical disabilities increase their typing speed
and reduce the number of keystrokes needed to complete a word or sentence. To that end, we
developed our own word predictor using the trie data structure, which increases the user's
efficiency by at least 10%.
Autocomplete, or word completion, is a feature in which an application predicts the rest of a
word the user is typing. In graphical user interfaces, users can typically press the Tab key to
accept a suggestion or the down arrow key to choose one of several.
y= yes
n= no
str= string
dict= dictionary
PK=primary key
1.4 References
Not applicable
A data flow diagram (DFD) maps out the flow of information for any process
or system. It uses defined symbols like rectangles, circles and arrows, plus
short text labels, to show data inputs, outputs, storage points and the routes
between each destination.
In our project, the user enters a string. If the string is a prefix of any word in our
dictionary file, the program returns the list of matching words. If not, it asks
the user whether they want to add the word. In either case, the user is then asked whether
they want to run autocomplete again or quit.
3. CLASS DIAGRAM
5. SEQUENCE DIAGRAM
6. COLLABORATION DIAGRAM
The user interface will focus on simplicity throughout the system. The same
overall format will be used for all users to maintain consistency. Since the interface
will be simple, there is not much need to adjust it for different users' abilities.
One issue is showing the available dictionary to the client effectively, which
means working out the best way to present words. Another issue is how to
show information about an added word effectively. This can probably be done by
displaying a page with information from the database about a particular word in a
document-style format.
Word Predictor uses a trie to predict a word from what the user has entered so far. In
computer science, a trie, also called a digital tree or prefix tree, is a kind of search
tree: an ordered tree data structure used to store a dynamic set or associative array
where the keys are usually strings. We use a predefined dictionary file, which
anyone can access from GitHub and which the project requires. It is a .txt file
in which all the words are already stored; the program reads it to search for words, compare
them with the string entered by the user, and add any new words to it.
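The prefix-tree idea described above can be illustrated with a compact sketch. It is not the project's implementation: the names TrieNode and SimpleTrie are hypothetical, and children are kept in a std::map (which returns completions in alphabetical order) rather than in the vector used by the project's Node class.

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical node type: one entry per outgoing character.
struct TrieNode
{
    bool isWord = false; // true if a stored word ends here
    std::map<char, std::unique_ptr<TrieNode>> next;
};

class SimpleTrie
{
public:
    // Walks down from the root, creating missing nodes along the way.
    void insert(const std::string &word)
    {
        TrieNode *node = &root_;
        for (char ch : word)
        {
            std::unique_ptr<TrieNode> &slot = node->next[ch];
            if (!slot)
                slot.reset(new TrieNode());
            node = slot.get();
        }
        node->isWord = true;
    }

    // Returns every stored word that starts with prefix.
    std::vector<std::string> complete(const std::string &prefix) const
    {
        std::vector<std::string> out;
        const TrieNode *node = &root_;
        for (char ch : prefix)
        {
            auto it = node->next.find(ch);
            if (it == node->next.end())
                return out; // no stored word has this prefix
            node = it->second.get();
        }
        std::string buffer = prefix;
        collect(node, buffer, out);
        return out;
    }

private:
    // Depth-first walk; buffer holds the characters on the current path.
    static void collect(const TrieNode *node, std::string &buffer,
                        std::vector<std::string> &out)
    {
        if (node->isWord)
            out.push_back(buffer);
        for (const auto &entry : node->next)
        {
            buffer.push_back(entry.first);
            collect(entry.second.get(), buffer, out);
            buffer.pop_back();
        }
    }

    TrieNode root_;
};
```

Because words that share a prefix share the same path from the root, the lookup cost depends on the length of the typed prefix and the number of matches, not on the size of the whole dictionary.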
We are trying to keep the system simple. The idea is to offer a lot of functionality, but not at the
expense of usability. We are replacing existing systems that do not work well
and are too complicated to be effective, and we are focusing our efforts on a system that
does the important functions well. The motivation for this project was to re-create everything from
scratch so that the interface is simple, the system is highly efficient and all the modules
complement each other.
1. Start
2. Input choice (autocomplete or quit).
3. If the choice is autocomplete, continue; otherwise quit.
4. Enter the keyword.
5. The autocomplete algorithm predicts one or more possible words as
choices.
6. If results are found, display them on the screen.
7. If no result is found, offer to add the word to the dictionary file.
8. Stop
The lower-level modules will mostly perform database updates and modifications. The
other modules will mainly read data from the database. To do this, the modules
will connect to the database using a common connection file containing the connection
details.
Information transferred between the pages/classes will mainly be constant
stored information such as words. The pages will send information
to themselves via POST and GET calls, so there will not necessarily be data flow
between nodes.
Trie Algorithm
Data N/P
Dataset Component
Data N/P
IMPLEMENTATION:
CODE:
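#include <iostream>
#include <fstream>
#include <vector>
#include <string>
#include <algorithm>
#include <cctype>
#include <cstdlib>
#include <cstring>
using namespace std;

// One trie node: holds a single character and links to its child nodes.
class Node
{
public:
    Node()
    {
        mContent = ' ';
        mMarker = false;
    }
    // Each node owns its children, so deleting a node frees its whole subtree.
    ~Node()
    {
        for (size_t i = 0; i < mChildren.size(); i++)
            delete mChildren[i];
    }
    char content()
    {
        return mContent;
    }
    void setContent(char c)
    {
        mContent = c;
    }
    bool wordMarker()
    {
        return mMarker;
    }
    void setWordMarker()
    {
        mMarker = true;
    }
    Node *findChild(char c);
    void appendChild(Node *child)
    {
        mChildren.push_back(child);
    }
    vector<Node *> children()
    {
        return mChildren;
    }

private:
    char mContent;
    bool mMarker; // true if a word ends at this node
    vector<Node *> mChildren;
};

// Returns the child holding character c, or NULL if there is none.
Node *Node::findChild(char c)
{
    for (size_t i = 0; i < mChildren.size(); i++)
    {
        Node *tmp = mChildren.at(i);
        if (tmp->content() == c)
        {
            return tmp;
        }
    }
    return NULL;
}

class Trie
{
public:
    Trie();
    ~Trie();
    void addWord(string s);
    bool searchWord(string s);
    bool autoComplete(string s, vector<string> &);
    void parseTree(Node *current, char *s, vector<string> &, bool &loop);

private:
    Node *root;
};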
Trie::Trie()
{
    root = new Node();
}
Trie::~Trie()
{
    // Free the root node
    delete root;
}
// Inserts s character by character, creating nodes that do not exist yet.
void Trie::addWord(string s)
{
    Node *current = root;
    if (s.length() == 0)
    {
        current->setWordMarker();
        return;
    }
    for (size_t i = 0; i < s.length(); i++)
    {
        Node *child = current->findChild(s[i]);
        if (child != NULL)
        {
            current = child;
        }
        else
        {
            Node *tmp = new Node();
            tmp->setContent(s[i]);
            current->appendChild(tmp);
            current = tmp;
        }
        if (i == s.length() - 1)
            current->setWordMarker();
    }
}
// Returns true only if s was stored as a complete word.
bool Trie::searchWord(string s)
{
    Node *current = root;
    for (size_t i = 0; i < s.length(); i++)
    {
        Node *tmp = current->findChild(s[i]);
        if (tmp == NULL)
            return false;
        current = tmp;
    }
    return current->wordMarker();
}
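// Collects into res the stored words that start with s.
// Returns false if the prefix s is not present in the trie.
bool Trie::autoComplete(std::string s, std::vector<string> &res)
{
    Node *current = root;
    for (size_t i = 0; i < s.length(); i++)
    {
        Node *tmp = current->findChild(s[i]);
        if (tmp == NULL)
            return false;
        current = tmp;
    }
    char c[100];
    if (s.length() >= sizeof(c))
        return false; // prefix would not fit in the fixed-size buffer
    strcpy(c, s.c_str());
    bool loop = true;
    parseTree(current, c, res, loop);
    return true;
}
// Depth-first walk below current; s is the text spelled out so far.
void Trie::parseTree(Node *current, char *s, std::vector<string> &res, bool &loop)
{
    char k[100] = {0};
    char a[2] = {0};
    if (loop)
    {
        if (current != NULL)
        {
            if (current->wordMarker() == true)
            {
                res.push_back(s);
                if (res.size() > 15)
                    loop = false; // stop once 16 suggestions are collected
            }
            if (strlen(s) + 1 >= sizeof(k))
                return; // one more character would overflow k
            vector<Node *> child = current->children();
            for (size_t i = 0; i < child.size() && loop; i++)
            {
                strcpy(k, s);
                a[0] = child[i]->content();
                a[1] = '\0';
                strcat(k, a);
                if (loop)
                    parseTree(child[i], k, res, loop);
            }
        }
    }
}
// Reads whitespace-separated words from filename into the trie.
bool loadDictionary(Trie *trie, string filename)
{
    ifstream words;
    words.open(filename.c_str());
    if (!words.is_open())
    {
        cout << "Dictionary file Not Open" << endl;
        return false;
    }
    string s;
    while (words >> s) // stops cleanly at end of file
    {
        trie->addWord(s);
    }
    return true;
}
int main()
{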
    {
        cout << endl
             << endl;
        cout << "Interactive mode,press " << endl;
        cout << "1: Auto Complete Feature" << endl;
        cout << "2: Quit" << endl
             << endl;
        cin >> mode;
        switch (mode)
        {
        case 1: // Auto complete
        {
            string s;
            char addNew;
            cin >> s;
            transform(s.begin(), s.end(), s.begin(), ::tolower);
            vector<string> autoCompleteList;
            trie->autoComplete(s, autoCompleteList);
            if (autoCompleteList.size() == 0)
            {
                cout << "No suggestions" << endl;
                cout << "Want to add this to the dictionary?(y/n): ";
                cin >> addNew;
                if (addNew == 'y' || addNew == 'Y')
                {
                    trie->addWord(s);
                    cout << "Word " << s << " added to the dictionary." << endl;
                }
                else
                    cout << "Word " << s << " was not added to the dictionary" << endl;
            }
            else
            {
                cout << "Autocomplete reply :" << endl;
                for (size_t i = 0; i < autoCompleteList.size(); i++)
                {
                    cout << "\t \t " << autoCompleteList[i] << endl;
                }
            }
        }
            continue;
        case 2:
            delete trie;
            return 0;
        default:
            continue;
        }
    }
}
SCREENSHOT OF IMPLEMENTATION
The second interface is the page for the word prediction program.
When the user chooses option 1 in the first interface, they land on this page.
The program asks for a string and displays the matching words, which the algorithm
computes with the help of a dictionary that already contains many words.
The image below shows the possible words generated by the program when
the string “app” is given as input.
First, the program asks whether the user wants to ‘1. Autocomplete’ or ‘2. Quit’
the program.
If the user chooses ‘1’, they get this output:
The third interface is the page for adding a new word to the dictionary.
If the input string has no matches in the dictionary file, the program
lands on this interface and asks the user whether they want to add the input string
to the dictionary file. If the user presses ‘y’, it asks for the full word to be added.
If the user presses ‘n’, the program again asks whether they want to ‘1. Autocomplete’ or ‘2.
Quit’, i.e. it returns to the home page.
Shown below is the interface in which the user chooses whether or not to add the string
“awaqw” to the dictionary.
The program asks for a string and displays the matching words. If the input string has
no matches in the dictionary file, it shows this:
The user is asked whether they want to add the string to the dictionary file. If the user
presses ‘y’, the program asks for the full word to be added. If the user presses ‘n’, the program
again asks whether they want to ‘1. Autocomplete’ or ‘2. Quit’.
REVIEW 3
for
WORD PREDICTION
USING
TRIE DATA STRUCTURE
Prepared by
Drashti Patel[19BCE0602]
Devanshi Choudhary [19BCE2614]
Prince Kumar[19BCI0002]
CODE:
#include <iostream>
#include <fstream>
#include <vector>
#include <string>
#include <algorithm>
#include <cctype>
#include <cstdlib>
#include <cstring>
using namespace std;

// One trie node: holds a single character and links to its child nodes.
class Node
{
public:
    Node()
    {
        mContent = ' ';
        mMarker = false;
    }
    // Each node owns its children, so deleting a node frees its whole subtree.
    ~Node()
    {
        for (size_t i = 0; i < mChildren.size(); i++)
            delete mChildren[i];
    }
    char content()
    {
        return mContent;
    }
    void setContent(char c)
    {
        mContent = c;
    }
    bool wordMarker()
    {
        return mMarker;
    }
    void setWordMarker()
    {
        mMarker = true;
    }
    Node *findChild(char c);
    void appendChild(Node *child)
    {
        mChildren.push_back(child);
    }
    vector<Node *> children()
    {
        return mChildren;
    }

private:
    char mContent;
    bool mMarker; // true if a word ends at this node
    vector<Node *> mChildren;
};

// Returns the child holding character c, or NULL if there is none.
Node *Node::findChild(char c)
{
    for (size_t i = 0; i < mChildren.size(); i++)
    {
        Node *tmp = mChildren.at(i);
        if (tmp->content() == c)
        {
            return tmp;
        }
    }
    return NULL;
}

class Trie
{
public:
    Trie();
    ~Trie();
    void addWord(string s);
    bool searchWord(string s);
    bool autoComplete(string s, vector<string> &);
    void parseTree(Node *current, char *s, vector<string> &, bool &loop);

private:
    Node *root;
};
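Trie::Trie()
{
    root = new Node();
}
Trie::~Trie()
{
    // Free the root node
    delete root;
}
// Inserts s character by character, creating nodes that do not exist yet.
void Trie::addWord(string s)
{
    Node *current = root;
    if (s.length() == 0)
    {
        current->setWordMarker();
        return;
    }
    for (size_t i = 0; i < s.length(); i++)
    {
        Node *child = current->findChild(s[i]);
        if (child != NULL)
        {
            current = child;
        }
        else
        {
            Node *tmp = new Node();
            tmp->setContent(s[i]);
            current->appendChild(tmp);
            current = tmp;
        }
        if (i == s.length() - 1)
            current->setWordMarker();
    }
}
// Returns true only if s was stored as a complete word.
bool Trie::searchWord(string s)
{
    Node *current = root;
    for (size_t i = 0; i < s.length(); i++)
    {
        Node *tmp = current->findChild(s[i]);
        if (tmp == NULL)
            return false;
        current = tmp;
    }
    return current->wordMarker();
}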
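// Collects into res the stored words that start with s.
// Returns false if the prefix s is not present in the trie.
bool Trie::autoComplete(std::string s, std::vector<string> &res)
{
    Node *current = root;
    for (size_t i = 0; i < s.length(); i++)
    {
        Node *tmp = current->findChild(s[i]);
        if (tmp == NULL)
            return false;
        current = tmp;
    }
    char c[100];
    if (s.length() >= sizeof(c))
        return false; // prefix would not fit in the fixed-size buffer
    strcpy(c, s.c_str());
    bool loop = true;
    parseTree(current, c, res, loop);
    return true;
}
// Depth-first walk below current; s is the text spelled out so far.
void Trie::parseTree(Node *current, char *s, std::vector<string> &res, bool &loop)
{
    char k[100] = {0};
    char a[2] = {0};
    if (loop)
    {
        if (current != NULL)
        {
            if (current->wordMarker() == true)
            {
                res.push_back(s);
                if (res.size() > 15)
                    loop = false; // stop once 16 suggestions are collected
            }
            if (strlen(s) + 1 >= sizeof(k))
                return; // one more character would overflow k
            vector<Node *> child = current->children();
            for (size_t i = 0; i < child.size() && loop; i++)
            {
                strcpy(k, s);
                a[0] = child[i]->content();
                a[1] = '\0';
                strcat(k, a);
                if (loop)
                    parseTree(child[i], k, res, loop);
            }
        }
    }
}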
// Reads whitespace-separated words from filename into the trie.
bool loadDictionary(Trie *trie, string filename)
{
    ifstream words;
    words.open(filename.c_str());
    if (!words.is_open())
    {
        cout << "Dictionary file Not Open" << endl;
        return false;
    }
    string s;
    while (words >> s) // stops cleanly at end of file
    {
        trie->addWord(s);
    }
    return true;
}
int main()
{
    system("color 1E"); // Windows console command: sets the text and background colours
This is the JavaScript file “trie.js”, which searches for words using the trie data structure.
This file holds the word dictionary that the trie.js program uses to predict words.
This is the basic interface of our project.
The snapshot below shows all the words predicted when the user gives “app” as the input.
Selenium
Selenium test scripts can be written in programming languages such as Java, C#,
Python, Ruby, PHP, Perl and JavaScript. Selenium offers record and playback
features through its browser add-on, Selenium IDE. Selenium WebDriver
helps you create more complex and advanced automation scripts.
Test report:
Graph :