
CHAPTER I INTRODUCTION

Information Retrieval (IR) takes natural language into account when interpreting search queries. Search is one of the most popular applications, yet it still has significant room for improvement [REF_001]. Since the early work on natural language processing (NLP), most research has concentrated on the tokenization and normalisation of terms, such as phrase detection, stemming, and lemmatization, and most of it has been quite successful [REF_002]. The researchers believe that adding conceptual processing and semantic analysis will improve on traditional search. Several methods can support this, for example statistical methods that analyse distributional patterns and algorithms that use expert ontologies.

To show how this research can contribute to text search within a document, the two classes of search are briefly discussed: keyword search and conceptual search [REF_003]. Keyword search uses exact string matching, so the query matches only identical text in the document. Conceptual search combines string search with NLP and returns not only the strings that match the query but also the strings related to it.

This paper details two points of the study. The first is the precision and recall of the search results of intelligent string search compared with those of traditional search. The second is the feasibility of using this algorithm in text-handling software applications as a tool for searching strings.
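The difference between keyword search and conceptual search described above can be illustrated with a minimal sketch. It is not the Lingo-based algorithm of this study: the `RELATED_TERMS` table, the helper names, and the sample documents are all hypothetical and stand in for whatever source of related terms an actual system would use.

```python
import re

# Hypothetical related-term table, for illustration only. A real system
# would derive such relations from an ontology or from distributional
# statistics rather than from a hand-written dictionary.
RELATED_TERMS = {
    "car": {"automobile", "vehicle"},
    "doctor": {"physician"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def keyword_search(query, documents):
    """Return indices of documents containing the query as an exact substring."""
    return [i for i, doc in enumerate(documents) if query in doc]


def concept_search(query, documents):
    """Return indices of documents containing the query term or a related term."""
    terms = {query.lower()} | RELATED_TERMS.get(query.lower(), set())
    return [i for i, doc in enumerate(documents) if terms & set(tokenize(doc))]


documents = [
    "The car was parked outside.",
    "An automobile needs regular maintenance.",
    "The physician arrived late.",
]

keyword_hits = keyword_search("car", documents)   # exact matches only
concept_hits = concept_search("car", documents)   # exact and related matches
print(keyword_hits, concept_hits)
```

In this toy data the keyword search finds only the first document, while the concept-expanded search also finds the second one because "automobile" is listed as related to "car".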

1.1 Statement of the Problem

This section presents the problem statements, which are the objectives the researchers aim to answer through the study. It is intended that an intelligent string search algorithm be implemented in the string searching features of text-handling software applications, instead of an exact string matching algorithm alone. The algorithms to be used are an algorithm based on Lingo for intelligent string search and Boyer-Moore for exact string matching. The researchers look into adding natural language processing to string searching to improve the whole process, and this is tested by addressing the following problems:

1.1.1 Determine the effectiveness of the intelligent search approach, measured as precision and recall, compared with string matching search.

1.1.2 Determine the efficiency of intelligent text search based on the amount of data processed in a given time.

1.1.3 Investigate the quality of results, using user-defined measures, of the intelligent concept search and the string matching algorithm.

1.1.4 Determine the computer performance, based on CPU, memory and disk usage, when searching queries for any document size or number of texts.
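Precision and recall, the measures named in 1.1.1, have standard definitions: precision is the fraction of retrieved results that are relevant, and recall is the fraction of relevant items that were retrieved. The following is a minimal sketch of how they could be computed; the function name and the document identifiers are hypothetical and are not data from this study.

```python
def precision_recall(retrieved, relevant):
    """Compute precision and recall for one query.

    precision = |retrieved AND relevant| / |retrieved|
    recall    = |retrieved AND relevant| / |relevant|
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    # Guard against empty sets to avoid division by zero.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical identifiers, for illustration only.
retrieved = ["d1", "d2", "d3", "d4"]   # what the search returned
relevant = ["d1", "d3", "d5"]          # what a judge marked as relevant

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here two of the four retrieved items are relevant, and two of the three relevant items were retrieved. Running the same computation on the results of both search methods for the same queries gives the comparison that 1.1.1 calls for.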

1.2 Hypothesis

The researchers formulated these hypotheses to provide established support for the research after the experimentation process. The hypotheses, which state the expected relationships among the data, are given below:

1.2.1 The use of intelligent string search has a significant effect on the relevance of string search results without compromising the performance of the text-handling software application.

1.2.2 The results of the intelligent string search, for any document size or number of texts, have no significant difference in response time compared with the string matching search results.

1.2.3 The processes of intelligent string search that affect computer performance have no significant difference from those of the string matching search.

1.3 Assumptions

This section presents the assumptions, which are fixed conditions set before the start of the study. These conditions also hold true throughout the study. The assumptions are the following:

1.3.1 The input is a string or a set of strings.

1.3.2 The output search results are a highlighted string or set of strings.

1.3.3 The text-handling software application simulation accepts text files.

1.4 Significance of the Study

This section presents the importance of the paper to various groups of people, according to their interests. The significance of the research lies in the improved relevance of search results. By applying the techniques of natural language processing (NLP), string search can serve as a tool that also provides related search results. This study is offered as a contribution to different areas of study and to the advancement of Computer Science.

1.4.1 Mobile Device Users

With small display screens, it is challenging for users to search for relevant strings in text-handling software applications installed on mobile devices. This study would improve string searching by presenting mobile users not only the exact match of the input but also the strings related to it.

1.4.2 Software Developers of Search Tools

Text-handling software applications typically integrate only exact string matching algorithms. If natural language processing is integrated, the search tools built by software developers will be able to highlight most, if not all, of the text in a file that users need according to their input.

1.4.3 Researchers

This paper may be used as part of future research on string searching algorithms and natural language processing. It may be referenced in studies of syntactic, semantic, conceptual, contextual, and other approaches related to searching and their application.

1.5 Scope of the Study

The study discusses the feasibility of integrating concept search with NLP and string searching. Applying it in text-handling software applications may provide a new method of searching that differs from traditional search in its processes. Traditional search simply looks directly for the search query. Users and researchers often search with imprecise, fuzzy words and, as a result, miss important details. This study focuses on the differences between string matching and intelligent string search. The number of relevant results will be based on both system analysis and user analysis. The study aims to provide a means for a faster information retrieval process that obtains good results, and the results will therefore be evaluated using precision and recall.

The study also makes extensive use of natural language processing. In information retrieval, natural language processing is used to tokenize, normalize, or lemmatize the query string. Overall, natural language processing is an important aspect of the study.

String searching algorithms are an important class of string algorithms that provide rapid results. One example is the well-known Boyer-Moore string searching algorithm, whose pre-processing of the pattern supports faster results. The same idea of using patterns to recognize text serves as the basis for the intelligent string search, with the addition of an algorithm that produces concepts rather than keywords. For this purpose, the study makes use of the Lingo algorithm, which combines common phrase discovery and latent semantic indexing to separate search results into meaningful groups. Grouping results into categories in this way can yield meaningful, concise, and accurate results [REF_004]. The implementation of the Lingo algorithm completes the study.

Any ideas and conclusions drawn from this paper other than those presented in the study should not be treated as immediately valid and reliable. They remain under scrutiny and may be subject to future study or investigation before they are accepted as true and correct.
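The exact string matching side of the study can be illustrated with a simplified sketch of Boyer-Moore. This version uses only the bad-character rule and omits the good-suffix rule of the full algorithm, so it should be read as an illustration of the pre-processing idea and not as the implementation used in the study. The function names are chosen for this example.

```python
def build_last_occurrence(pattern):
    """Pre-processing step: map each character to the index of its
    last occurrence in the pattern."""
    return {ch: i for i, ch in enumerate(pattern)}


def boyer_moore_search(text, pattern):
    """Return the start indices of all occurrences of pattern in text.

    Simplified Boyer-Moore: bad-character rule only, no good-suffix rule.
    """
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    last = build_last_occurrence(pattern)
    matches = []
    s = 0  # current alignment of the pattern against the text
    while s <= n - m:
        j = m - 1
        # Compare the pattern with the text from right to left.
        while j >= 0 and pattern[j] == text[s + j]:
            j -= 1
        if j < 0:
            matches.append(s)
            s += 1  # conservative shift so overlapping matches are found
        else:
            # Shift so the mismatched text character lines up with its last
            # occurrence in the pattern, always moving at least one position.
            s += max(1, j - last.get(text[s + j], -1))
    return matches


positions = boyer_moore_search("abracadabra", "abra")
print(positions)
```

The table built by `build_last_occurrence` depends only on the pattern, so it is computed once per query and lets the search skip several text positions after a mismatch.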
