Urbanizing the Rural Agriculture - Knowledge Dissemination using Natural Language Processing



Published by: ijcsis on Jun 30, 2010
Copyright:Attribution Non-commercial

(IJCSIS) International Journal of Computer Science and Information Security, Vol. 8, No. 1, April 2010

URBANIZING THE RURAL AGRICULTURE - KNOWLEDGE DISSEMINATION USING NATURAL LANGUAGE PROCESSING
Priyanka Vij (Author)
Student, Computer Science Engg., Lingaya's Institute of Mgt. & Tech, Faridabad, Haryana, India
priyankavij6@gmail.com

Harsh Chaudhary (Author)
Student, Computer Science Engg., Lingaya's Institute of Mgt. & Tech, Faridabad, Haryana, India
harsh_hps@yahoo.co.in

Priyatosh Kashyap (Author)
Student, Computer Science Engg., Lingaya's Institute of Mgt. & Tech, Faridabad, Haryana, India
priyatoshkashyap@gmail.com

ABSTRACT - Indian rural agriculture faces many problems, including inadequate irrigation, unfavorable weather conditions, and a lack of knowledge about market prices, animals, tools and pest prevention methods. Hence there is a need to develop a method that enables our rural farmers to gain knowledge, which they can do by communicating with experts from various fields of the agricultural sector. We therefore aim to provide farmers with an interactive kiosk panel through which they can get an easy and timely solution to their queries within 24 hours, without having to travel to distant places or make long-distance calls. We focus on developing software that would provide immediate aid to the farmers in every possible manner. It would be an interactive system providing end-to-end connectivity between farmers and international agricultural experts, who can help them resolve their queries, thereby widening the sources of information available to the farmers.

Keywords: Rural Agriculture; Farmers; Natural Language Processing; Speech recognition; Language translation; Speech synthesis

I. INTRODUCTION

India is an agro-based country whose major sector is the rural region, and agriculture is one of its major sources of livelihood. Current agricultural practices are neither economically nor environmentally sustainable, and India's yields for many agricultural commodities are low. The causes include unpredictable climate, growth of weeds, lack of knowledge about land reforms and market prices, shrinking profit margins, lack of technology, proper machinery, instant troubleshooting and knowledge of agricultural advancements, and improper communication. To overcome these problems, farmers should be made aware of current trends in agriculture, so that the entire agricultural system can be upgraded and the bottlenecks in agricultural growth removed. Communication is the backbone of solving any problem, whatever field it belongs to, so we have made communication an integral part of our project. To make this communication interactive we make use of an upcoming technology, "Natural Language Processing".

A. Natural Language Processing

Natural Language Processing is used to communicate with the computer in our natural language. By using it, we aim to provide interactive end-to-end communication using voice, where the voice of the sender in his language is converted to a voice in the receiver's language. This entire process of voice-to-voice transformation may be divided into 3 phases:

1) Speech recognition: Converting spoken words to machine-readable input. It includes the conversion of continuous sound waves into discrete words.[1]
2) Language translation: The translation of one natural language into another.
3) Speech synthesis: The artificial production of human speech. A text-to-speech system converts normal language text into speech.[2]

Figure 1. Process of Natural Language Processing

B. Background and Related Work

A lot of work has been done in the field of agricultural extension to provide farmers with ready-to-use knowledge. Many methods for this have been implemented in India and in other countries.

1) aAQUA: Almost All Questions Answered (aAQUA) is a web-based query answering system that helps farmers with their agricultural problems. It is a multilingual (Marathi, Hindi and English) system which provides online answers to questions asked over the Internet.[3]


http://sites.google.com/site/ijcsis/ ISSN 1947-5500


Shortcoming: It is a web portal, but the majority of the rural population in India is not literate. To run this system, a person must always be available to register the farmer's query, which is not always feasible. Moreover, if a person speaks in his regional language (e.g. Hindi), his query will only reach the experts who understand that language.

2) E-Choupal: The Internet enables farmers to obtain information on mandi prices and good farming practices, and to place orders for agricultural inputs like seeds and fertilizers.[4] Each kiosk has access to the Internet.
Shortcoming: It deals only with market facilities and not with troubleshooting.

3) Gyandoot: An intranet network of computers connecting the rural areas and fulfilling the everyday information-related needs of the rural people.[5] It makes use of information and communication technology to provide online services.
Shortcoming: So far it has been implemented only as an intranet within a small district.

4) Agriculture Information System: It uses Agri Portal, Mobile Agriculture, Kisan Help Desk, Agri Learning, Agri GIS and Integration.[6] It broadcasts information through mobile phones, by voice or SMS.
Shortcoming: It uses voice interaction, but only through mobile phones.

C. Overcoming the Shortcomings

We implement our system in an information kiosk, which helps us rectify the above shortcomings. Farmers get first-hand information about crops, their market prices and anything else they wish to know within 24 hours, without having to go to various places in search of the proper tools and techniques in use.

We understand that a farmer may or may not be literate. Hence we make limited use of written language and emphasize pictorial representation of data. Farmers can record their queries themselves using the microphone, so no helper is needed to operate the system for them.

Moreover, natural language processing is used to convert Hindi to English, and since English is the most common language, this allows free interaction between a rural farmer and international experts. Indian farmers are thereby connected not only with experts who understand their language but also with experts from other countries.

We do not deal only with marketing facilities, but generalize the system to communication about any field that causes a problem to the farmers.

By connecting farmers with international experts, we pose no restriction or barrier such as an intranet; instead we use the Internet.

Help lines that farmers can call are another option, but they depend entirely on mobile network coverage and on the cost of connecting a call. We remove this dependence by using the Internet, connecting the two ends for only the limited cost of a dedicated Internet line.

II. ARCHITECTURAL ISSUES AND CHALLENGES

The challenges we face in building such a system are:

Providing information to the rural, computer-illiterate population via the kiosks. To make illiterate people comfortable with our system, we designed a user interface that depicts pictorially, at a glance, what each section is about. For example, Fig. 2 points to the field "WEATHER".

Figure 2. Weather Icon

Proper connectivity. Our kiosks require 24-hour Internet access, so a dedicated Internet connection is essential. A proper backup must also be planned for resource constraints such as connectivity issues.

Creating a voice recognition system that converts voice into text. This is a challenge because such systems require a large amount of training for accurate recognition. The major concern is the difference in regional accents among people speaking the same language (such as Hindi).

Translating one language into another. Grammatical errors in translation may completely change the meaning of a sentence, in which case the receiver might understand something completely different from what was actually meant.

Synthesizing speech from text. The computer may be unable to say a word properly, and the audio produced may be a word that rhymes with the intended one, which would change the entire meaning of the sentence.

Adding intelligence to the system. For the system to respond immediately, a database must be created in which previously answered queries are stored along with their solutions. When a similar question is asked, the system retrieves the most appropriate solution by pattern matching. However, sometimes the keywords may match while the sense of the question is completely different.

Providing better service response time. We need to undertake considerable process re-engineering so that the response time, which is currently as soon as the expert answers and at most 24 hours, may be lowered to a few minutes. This is only possible if we are able to add intelligence to the system.

III. METHODOLOGY

The methodology is the sequence of routines we adopt for the development of the system. Its development is divided into 3 phases:

Figure 3. Flow diagram of the entire System, showing the 3 phases

Phase 1: Farmers Asking Queries From The Expert

The farmer is first identified. The farmer then clicks on the button whose picture relates to his query; for example, a problem pertaining to diseases of crops would have a picture of crops on it. The farmer then asks the question in his own voice and language (e.g. Hindi), and it is recorded. The message is translated into text in the expert's language and sent to the expert of that field. This process is implemented by a Speech Recognition technique that converts the farmer's voice to text, followed by Language Translation that converts this text to English text. The intermediate English query thus produced is also stored in the database for use in Phase 3.

Phase 2: Experts Answering Back The Queries

The specific expert responds to the farmer's query by answering in text. The English text is then converted into an audio signal in the farmer's language (e.g. Hindi), which the farmer who asked the query hears the next time he logs in to the system. This process is implemented by Language Translation, which translates the English answer into text in the farmer's language, followed by Speech Synthesis, which converts that text into audio signals. This audio reply is heard by the farmer as the response to the query he had posted. The answer text is stored in the database next to the question, for use in Phase 3.

Phase 3: Farmers Getting Instant Responses if Query Already Answered

If a farmer asks a query that has already been asked by another farmer and answered, the answer to that query is returned to the farmer instantly. This process of Intelligent Information Retrieval analyzes the keywords present in the query and searches the database for previously answered queries that match them accurately. If a perfect match occurs that is in context with the farmer's query, the answer is given to the farmer instantly; otherwise the process of Phase 1 is carried out.

We now discuss the main techniques with the help of which our project transforms the input voice into the audio output at the same kiosk where the farmer asks the question.

A. Speech Recognition

Speech recognition converts spoken words to machine-readable input. It includes the conversion of continuous sound waves into discrete words. Speech recognition fundamentally functions as a pipeline that converts PCM (Pulse Code Modulation) digital audio from a sound card into recognized speech.[7]

Figure 4. Speech Recognition Process

The elements of the pipeline are:

1) Transform the PCM digital audio into a better acoustic representation: The PCM audio captured by the sound card is converted into an acoustic representation, which is then transformed using a Fast Fourier Transform (FFT) into a digital representation that the computer can readily work with.

2) Figure out which phonemes are spoken: We begin by applying a "grammar" to the data so that the speech recognizer knows what phonemes to expect. A grammar could be anything from a context-free grammar to a full-blown language. The computer, supplied with a database of the phonemes of that grammar, tries to identify the phonemes in the digitized data and picks out the matching references.

3) Convert the phonemes into words: The identified phonemes are then looked up in the database of words of that particular grammar, in the order in which they appeared in the digitized data, so that, phoneme by phoneme, a complete word is identified.

B. Language Translation

Language translation, also known as Machine Translation, is the translation of one natural language into another. Machine translation generally uses natural language processing and tries to define rules for fixed constructions. The original text is encoded in a symbolic representation from which the translated text is derived.

Figure 5. Showing Language Translation between English & Hindi. [8]

A newer approach to machine translation, the statistical approach, is used. The computer is fed billions of words of text, both monolingual text in the target language and aligned text consisting of examples of human translations between the languages. A statistical learning technique is then applied to build a translation model. Statistical machine translation thus works by comparing large numbers of parallel texts that have been translated between the source and target languages, and from these it learns which words and phrases usually map to others, which is analogous to the way humans acquire knowledge of other languages. The problem with statistical machine translation is that it requires a large number of translated sentences, which may be hard to find.

C. Speech Synthesis

Speech synthesis is the artificial production of human speech. A text-to-speech system converts normal language text into speech. Text-to-speech fundamentally functions as a pipeline of processes that converts text into PCM digital audio.[9] The processes are:

1) Text normalization: This component of text-to-speech converts any input text into a series of spoken words. Trivially, text normalization converts a string to a series of words. It works as follows. First, it isolates the words in the text and deals with them individually. Second, it searches for numbers, times, dates and other symbolic representations, which are analyzed and converted to words. Then abbreviations are converted to proper words. Finally, the normalizer uses its rules to decide whether each punctuation mark causes a word to be spoken or is silent.

2) Homograph disambiguation: A "homograph" is a word with the same spelling as another word but a different pronunciation. We must therefore try to figure out what the text is talking about and decide which meaning is most appropriate in the given context. This is done by guessing the part of speech based on the word endings, or by looking the word up in a lexicon.

3) Word pronunciation: The pronunciation module accepts the text and outputs a sequence of phonemes. It first looks the word up in its own pronunciation lexicon; if the word is not found, it falls back on "letter to sound" rules, which are "trained" on a lexicon of hand-entered pronunciations.

4) Prosody: Prosody is the pitch, speed and volume with which syllables, words, phrases and sentences are spoken. Without prosody, text-to-speech sounds very robotic. The engine first identifies the beginning and ending of sentences, then identifies phrase boundaries, and finally tries to determine which words in the sentence are important to the meaning; these are emphasized.

5) Concatenate wave segments: Speech synthesis is almost done by this point. All the text-to-speech engine has to do is convert the list of phonemes and their duration, pitch and volume into digital audio.

D. Intelligent Information Retrieval

Information retrieval refers to the process of retrieving information that matches a certain clause. Here we match the keywords in the query against the resources already available in the database, which may be done with pattern matching techniques that help us identify the keywords. That is, the system mainly considers the nouns, verbs, phrases and adjectives that may be important in the sentence and ignores connectors such as prepositions. It then searches for those keywords in the database, and the best-matching source, the one containing the maximum number of keywords, is taken into consideration. The word "Intelligent" is used because the system uses its knowledge to update the keywords to be searched in the database.
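The keyword matching described for Intelligent Information Retrieval can be illustrated with a minimal sketch. It is not the authors' implementation: the stop-word list stands in for part-of-speech filtering, and the names `extract_keywords`, `best_answer` and the `min_overlap` threshold are assumptions introduced only for illustration. The stored answers are placeholders.

```python
import re

# Hypothetical stop-word list, standing in for the "connectors" the text says are ignored.
STOP_WORDS = {"a", "an", "the", "is", "are", "my", "on", "in", "of", "to",
              "for", "with", "and", "or", "how", "what", "do", "i", "can"}


def extract_keywords(query):
    """Lower-case the query, split it into words, and drop stop words."""
    words = re.findall(r"[a-z]+", query.lower())
    return {w for w in words if w not in STOP_WORDS}


def best_answer(query, answered, min_overlap=0.6):
    """Return the stored answer whose question shares the most keywords with
    the query, or None when no stored question covers at least `min_overlap`
    of the query's keywords (the query would then go to an expert)."""
    query_kw = extract_keywords(query)
    if not query_kw:
        return None
    best_score, best = 0.0, None
    for question, answer in answered:
        shared = query_kw & extract_keywords(question)
        score = len(shared) / len(query_kw)
        if score > best_score:
            best_score, best = score, answer
    return best if best_score >= min_overlap else None


# Placeholder database of previously answered (English) queries.
answered = [
    ("How do I stop pests on my wheat crop?", "Answer A (placeholder)"),
    ("What is the market price of rice?", "Answer B (placeholder)"),
]

print(best_answer("pests are on my wheat crop", answered))
print(best_answer("when should I irrigate cotton", answered))
```

The threshold reflects the concern that keywords may match while the sense of the question differs: a weak overlap is treated as no match, so the query falls back to Phase 1 instead of returning a possibly unrelated answer.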

IV. EVALUATING THE TECHNIQUES

The techniques were used to manipulate the data, and examples of each may be considered as follows.

Speech Recognition

Table 1 highlights an example where the farmer speaks in Hindi and the spoken sentence is recognized by the computer; the error in the recognized form and its percentage are then given.

TABLE 1: SPOKEN SENTENCE IS RECOGNIZED BY THE COMPUTER AS TEXT

Language Translation

Tables 2 and 3 show the translation of text from one language to the other; the error in translation is noted, giving the percentage of error in translating.
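The paper does not state how the error percentages are computed. One common choice, assumed here purely for illustration, is the word error rate: the word-level edit distance between the expected sentence and the system output, divided by the number of words in the expected sentence. The function name `word_error_percentage` and the English sample sentences are hypothetical.

```python
def word_error_percentage(reference, hypothesis):
    """Percentage of word-level errors (substitutions, insertions, deletions)
    in `hypothesis`, relative to the number of words in `reference`."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words processed
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# One wrong word out of five gives a 20% error.
print(word_error_percentage("the crop needs more water",
                            "the crop need more water"))
```

The same measure could be applied to each stage (recognition, both translation directions, and synthesis), which would make the per-stage percentages directly comparable.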
TABLE 2: RECOGNIZED TEXT IS TRANSLATED FROM HINDI TO ENGLISH

In Table 2, Hindi text is converted to English text, as the question was asked by the farmer in Hindi.

TABLE 3: EXPERT'S REPLY TO THE FARMER'S QUERY, & TRANSLATING IT FROM ENGLISH TO HINDI

In Table 3, English text is converted to Hindi text, as a reply to the farmer's question.

Speech Synthesis

Table 4 illustrates the use of the text-to-speech application. The translated text of the expert's answer is read out by the computer; the error in pronunciation and its percentage are also given.

TABLE 4: THE HINDI ANSWER IN TEXT IS THEN MADE INTO SPEECH

The errors above are shown in a chart, categorized into the sections Speech Recognition, Language Translation 1 & 2, and Speech Synthesis.

Figure 6. The error percentage occurring in various segments

V. FUTURE SCOPE

This system can be extended in many directions, which we found while developing it. Having laid the foundation, we can propose extensions to our system such as:

Recent prices of commodities can be listed on the screen so that farmers get first-hand information about them.

Helping farmers make online transactions for certain items.

Real-time speech-to-speech translation. According to Times Online, Google is developing a speech-to-speech automated translator for Android phones.

Video conferencing can be integrated into this project, so that farmers can communicate with the experts over video, with full speech-to-speech translation between the two ends.

Weather forecasting can be included in this project, so that farmers are warned of any unfavorable weather conditions and can grow crops according to the weather without falling victim to its uncertainty.

A multilingual system covering not only Hindi but also other regional and foreign languages.

Many questions are too elaborate and descriptive in nature. Techniques for extracting the questions from such data must be explored.

VI. PROTOTYPE OF "URBANIZING THE RURAL AGRICULTURE"

Here we show screenshots of a working prototype of our software:

Figure 7. Implementation of Speech Recognition & Translation from Hindi to English

This screen demonstrates the process when the farmer asks a question in his language. Speech is recognized word by word after pressing the "Enable Speech" button, and the recognized Hindi words spoken by the farmer are then converted into English by pressing the "Hindi to English Transform" button.


Figure 8. Implementing the Translation from English to Hindi

This screen demonstrates the process when the expert answers, in his language, the query asked by the farmer. The expert logs in to the system and answers the problem to the best of his knowledge. The answer, written in English, is then converted into Hindi text, and this text is converted into speech, which the farmer can listen to when he logs in to the system.

VII. CONCLUSION

Agriculture is the most important source of livelihood in India, but problems still prevail in the agricultural field, and there is a need to overcome them. To this end we have tried to optimize agricultural output using technology and to bridge the gap between farmers and experts. Farmers will get solutions to their queries in real time, or within 24 hours at most. This will not only help farmers get the correct solution but will also save the time they would have spent travelling to an expert for a particular query, as well as the money spent communicating with him. Natural language processing is a major part of this project; it converts the farmer's language (e.g. Hindi) into English, which is an official language globally. Since the application of natural language processing removes the language barrier, queries can be put to international experts as well. This is our initiative towards developing interactive software that would help farmers employ the latest technologies and enhance their crops' productivity. Rural agriculture is the most undeveloped part of our country, and until we find ways to improve it, we are hindering the progress of our country. This will affect not only the 'Growth Rate of the Indian Economy' but also the 'Global Growth Rate', for when we grow, we help the world grow.

ACKNOWLEDGEMENT

We are heartily thankful to Dr. T. V. Prasad (HOD, C.S.E Dept, Lingaya's Institute Of Management & Technology), whose encouragement, guidance and support from the initial to the final level enabled us to develop an understanding of the subject.

We are deeply grateful to our Project Guide, Mr. Mohd. Ahsan (CSE Dept), for his detailed and constructive comments and for his important support throughout this work. Special thanks to Mr. Ashutosh Kashyap (CEO, CodeInnovations) for giving us the idea for this project, and for his timely support and help in acquainting us with new technologies.

REFERENCES

[1] Speech Recognition, http://en.wikipedia.org/wiki/Speech_recognition
[2] Speech Synthesis, http://en.wikipedia.org/wiki/Speech_synthesis
[3] aAQUA – A Multilingual, Multimedia Forum for the Community, Krithi Ramamritham, Anil Bahuman, Ruchi Kumar, Aditya Chand, Subhasri Duttagupta, G.V. Raja Kumar, Chaitra Rao, Media Lab Asia, IIT Bombay
[4] E-choupal, http://www.itcportal.com/sets/echoupal_frameset.htm
[5] Gyandoot, http://www.icmrindia.org/casestudies/catalogue/IT%20and%20Systems/ITSY022.htm
[6] Agriculture Information System, http://www.egovcenter.in/webinar/pdf/agriculture.pdf
[7] Speech Recognition, http://electronics.howstuffworks.com/gadgets/high-tech-gadgets/speech-recognition1.htm
[8] An Insight to Natural language Processing, Priyanka Vij, Harsh Chaudhary, Priyatosh Kashyap, Students, Dept. of Computer Science Engg., Lingaya's Institute Of Mgt. & Tech, Faridabad, Haryana, India
[9] Speech Synthesis, http://project.uet.itgo.com/textto1.htm

AUTHORS PROFILE

Priyanka Vij is a final year computer science student at Lingaya's Institute of Mgt. & Tech., Faridabad, Haryana, India. Her areas of interest include Bio Informatics, Natural Language Processing and the software development life cycle.

Harsh Chaudhary is a final year computer science student at Lingaya's Institute of Mgt. & Tech., Faridabad, Haryana, India. His areas of interest include Computer Architecture, Natural Language Processing and project management.

Priyatosh Kashyap is a final year information technology student at Lingaya's Institute of Mgt. & Tech., Faridabad, Haryana, India. His areas of interest include Virtual Reality, Natural Language Processing and Robotics.
