
Human Computer Interaction Where Controlling Computer And Applications Using Image Processing and Voice Recognition

CHAPTER 1
1. INTRODUCTION
One of the important challenges in Human Computer Interaction is to develop more
intuitive and more natural interfaces. Computing environments are presently strongly tied to the
availability of a high-resolution pointing device with a single, discrete two-dimensional cursor.
The modern graphical user interface (GUI), the current standard interface on personal
computers (PCs), is well defined, and it provides an efficient interface for a user to operate various
applications on a computer. GUIs combined with devices such as mice
and track pads are extremely effective at reducing the richness and variety of human
communication down to a single point.
While the utility of such devices in today's interfaces cannot be denied, many
users find the capability of a GUI rather limited when they try to perform tasks
using gestures. There are opportunities to apply other kinds of sensors and techniques to enrich
the user experience of such users. For example, video cameras and computer vision techniques
may be used to capture many details of human shape and movement. The shape of the hand may
be analyzed over time to manipulate an onscreen object in a way analogous to the hand's
manipulation of paper on a desk. Such an approach may lead to a faster, more natural, and more
fluid style of interaction for certain tasks.
Ubiquitous computing is devoted to changing the relationship between humans and the
computers with which we interact, towards allowing computers to become invisible and recede
into the periphery of people’s lives.
Our project, Human Computer Interaction Where Controlling Computer and
Applications Using Image Processing and Voice Recognition, is an attempt at ubiquitous
computing. Here, we will be using colored tapes on our fingers. One of the tapes will be used for
controlling cursor movement, while the relative distance between the two colored tapes will be
used for the click events of the mouse, and the center colored tape will be used for gestures. Also, we will
be enriching our system with voice recognition capability to perform basic actions like
shutdown, search and surfing. Thus, the system will provide a new experience for users in
interacting with the computer.



CHAPTER 2
2. PROBLEM DEFINITION

The project that we are trying to develop will completely change the way people use
the computer system. Presently, we use the camera to detect hand gestures and the
microphone to capture voice commands for controlling the computer and its applications.
This would lead to a new era of Human Computer Interaction
(HCI) in which no physical contact with the device is required. It can be used in many media
applications as well as in new product designs; for example, it can be used in the advertising
industry as a natural user interface, so that users can be connected with advertisers more effectively.



CHAPTER 3
3. LITERATURE SURVEY

3.1 RELATED WORK

A lot of research is being done in the fields of Human Computer Interaction (HCI) and
Robotics. Researchers have tried to control mouse movement using video devices for HCI;
however, they have all used different methods to implement mouse cursor movement and clicking
events.
3.1.1] A Method for Controlling Mouse Movement using a Real-Time Camera [1], Hojoon Park
One approach, by Hojoon Park [1], used the index finger for cursor movement and the angle
between the index finger and thumb for clicking events.
Working:
Hojoon Park [1] used index finger movement to move the mouse cursor on the computer,
with an effective algorithm to detect the fingers, and he showed that the angle between the
index finger and thumb can be used for clicking events.
To recognize whether a finger is inside the palm area or not, he used a convex hull algorithm.
The convex hull algorithm solves the problem of finding the smallest convex polygon that
includes all the vertices; using this feature, the fingertips on the hand can be detected, and the
same algorithm is used to recognize whether a finger is folded or not. To recognize those states,
he multiplied the hand radius value by 2 (a factor obtained through multiple trials) and checked
the distance between the palm center and each pixel in the convex hull set. If the distance is
longer than twice the radius of the hand, then the finger is spread. In addition, if two or more
interesting points exist in the result, the vertex farthest from the center is regarded as the index
finger, and the hand gesture is taken as a click when the number of resulting vertices is two or
more. A minimal sketch of this test follows.
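Below is a minimal Java sketch of the spread-finger test described above; the class and
method names are ours, not Park's [1], and the factor of 2 is the trial-derived constant
mentioned in the text.

import java.awt.Point;
import java.util.ArrayList;
import java.util.List;

/** Sketch of the spread-finger test described above (names are ours, not Park's [1]). */
public class FingerSpreadTest {

    /** A hull vertex counts as a spread fingertip if it lies farther than 2 * handRadius
     *  from the palm center. */
    static List<Point> spreadFingertips(List<Point> hullVertices, Point palmCenter,
                                        double handRadius) {
        List<Point> tips = new ArrayList<Point>();
        for (Point p : hullVertices) {
            if (palmCenter.distance(p) > 2.0 * handRadius) {   // 2x factor found by trial in [1]
                tips.add(p);
            }
        }
        return tips;
    }

    /** Two or more spread fingers are interpreted as a click gesture. */
    static boolean isClickGesture(List<Point> tips) {
        return tips.size() >= 2;
    }
}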

Advantages:
He developed a system to control the mouse cursor using a real-time camera, implementing all
mouse tasks such as left and right clicking, double clicking, and scrolling. The system is based
on computer vision algorithms and can perform all mouse tasks.

The Hojoon Park [1] system is simple and effective, and thus less complicated: the algorithms are simple and take less time. The system is easy to use and can be used to control small applications.

Limitations:
In this project, the problem was that the finger shook a lot. Since real-time video is used, the position of the hand changes every frame, and the fingertip position detected by the convex hull algorithm changes with it; the mouse cursor pointer therefore shakes fast. To fix this problem, code was added so that the cursor does not move if the difference between the previous and the current fingertip positions is within 5 pixels (a sketch of this filter follows). This constraint worked well, but it makes it difficult to control the mouse cursor sensitively. Furthermore, if the hand shape is not well segmented, the algorithm cannot work well, because it assumes the hand shape is well segmented; in that case the length of the radius of the hand cannot be estimated. Another problem is illumination: segmenting the background to extract the hand shape is hard because the hand reflects all light sources, so the hand color changes according to the place and the illumination changes every frame.
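A minimal Java sketch of the 5-pixel dead-zone filter described above; the threshold comes
from the text, while the class and field names are our own illustration.

import java.awt.Point;

/** Sketch of the 5-pixel dead-zone jitter filter described above. */
public class CursorStabilizer {
    private static final int DEAD_ZONE_PX = 5;
    private Point lastTip;

    /** Returns the position the cursor should move to, suppressing sub-threshold jitter. */
    public Point stabilize(Point currentTip) {
        if (lastTip == null || lastTip.distance(currentTip) > DEAD_ZONE_PX) {
            lastTip = currentTip;   // real movement: accept the new fingertip position
        }
        return lastTip;             // jitter: keep the cursor where it was
    }
}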

3.1.2] Virtual Mouse Vision Based Interface [3], Robertson P., Laddaga R., Van Kleek M., January 2004

Working:
Their solution was to develop a virtual mouse that enables users to control a kiosk with hand signs and movements. The user walks up to the kiosk; people approaching the kiosk are tracked by a robotic head called IGOR (Intelligent Gaze Oriented Robot) [3]. The kiosk has a standard visual user interface, with an arrow cursor to indicate pointer movement. When the user makes a recognized hand sign, the kiosk allows movement of the hand to move the mouse pointer on the kiosk display, and separate hand signs allow clicking of the mouse buttons for making selections on the kiosk display. Note that the arrow pointer is the only feedback the user gets as to where the user is pointing, without the distraction of a distinct and different other signal; the user can use that feedback, adjusting to imperfections in tracking.

Advantages:
In the face tracking state, face recognition has a higher priority than sign recognition; in the sign tracking state, the optical flow is given the higher priority. In this way good responsiveness is achieved in both user tracking and gesture tracking. The approach is robust because no recognition is required to achieve mouse motion, and optical flow allows smooth tracking of the hand gestures while also providing smooth motion estimates.

Limitations:
The system was not capable of complex operations and cannot control high-end systems and applications, which lowers its scope.

3.1.3] Real-time Hand Tracking and Finger Tracking for Interaction [4], Shahzad Malik, CSC2503F Project Report, December 18, 2003

Working:
In this system, which is primarily based on the single hand tracker presented in [Segen99], the 3D position and 2D orientation of the thumb and index finger of each hand can be tracked in real time, without the use of special markers or gloves. The system can extract the 3D position and 2D orientation of the index finger of each hand, and when present the pose of the thumb as well, resulting in up to 8 degrees of freedom for each hand. The system uses two low-cost web cameras mounted above the work area and facing downward. In interactive applications a single pointing gesture could then be used for selection operations, while the thumb and index finger could be used together for pinching gestures [4] in order to grasp and manipulate virtual objects.

Advantages:
This project presents the implementation and analysis of a real-time stereo vision hand tracking system that can be used for interaction purposes.

Limitations:
A misclassification problem occurs when two hands appear close together in the captured images. This results in a single large region being segmented by the background subtraction and skin detection phases; the contour detector then interprets the two hands as a single hand, and the fingers are labelled as a right hand.

3.1.4] Portable Vision-Based HCI: A Real-time Hand Mouse System on Handheld Devices [9], Chu-Feng Lien

Working:
They assume the popularity of equipping handheld devices with a low-resolution camera. By adopting these embedded cameras, the system can detect a user's hand motion in real time, resulting in the autonomous manipulation of the corresponding programs on the device. To run a vision-based HCI system on handheld devices, the computing power available for frame processing is a critical concern; by grabbing and processing the image pixels directly, they gain an efficient way of computing. The AdaBoost [9] method proposed by Viola and Jones seemed to be a choice for the project, but it did not give a good result with a low-resolution camera, so instead of using gesture recognition methods they use Motion History Images (a generic sketch of this structure appears at the end of this section).

Advantages:
Their system can find the projected screen, which is very useful for projection purposes. The system is convenient because it can be used on all portable devices, and low-resolution cameras can be supported. If the speaker walks around the projected area continuously, the system can adapt to this behaviour and performs well.

Limitations:
A high error rate on fast-moving motion (an issue that can be improved by increasing the frame rate) results in high false-positive detection. If the speaker suddenly stops in the middle of the screen, the system will raise a false alarm. If the environmental lights change, or a shadow is projected within the scope of the camera, the system will be misled and produce a different result. If the edge color of the projected screen is similar to its neighbouring objects, the screen will not be well detected.
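Lien's system relies on Motion History Images. The following is a generic MHI update in
Java, shown only to illustrate the data structure; it is not his exact implementation [9].

/** Generic Motion History Image update (a sketch; not Lien's exact implementation [9]). */
public class MotionHistoryImage {
    private final float[][] mhi;      // per-pixel timestamp of the last observed motion
    private final float duration;     // how long (seconds) motion persists in the image

    public MotionHistoryImage(int width, int height, float durationSeconds) {
        this.mhi = new float[height][width];
        this.duration = durationSeconds;
    }

    /** motionMask[y][x] is true where the current frame differs from the previous one. */
    public void update(boolean[][] motionMask, float timestamp) {
        for (int y = 0; y < mhi.length; y++) {
            for (int x = 0; x < mhi[y].length; x++) {
                if (motionMask[y][x]) {
                    mhi[y][x] = timestamp;                       // fresh motion: stamp it
                } else if (mhi[y][x] < timestamp - duration) {
                    mhi[y][x] = 0f;                              // stale motion: fade out
                }
            }
        }
    }
}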

CHAPTER 4

4. SOFTWARE REQUIREMENT SPECIFICATION

4.1 INTRODUCTION
As computer technology continues to develop, people have smaller and smaller electronic devices and want to use them ubiquitously. There is a need for new interfaces designed specifically for use with these smaller devices. Increasingly we are recognizing the importance of human computer interaction (HCI), and in particular vision-based gesture and object recognition. Simple interfaces already exist, such as the embedded keyboard, folder-keyboard and mini-keyboard; however, these interfaces need some amount of space to use and cannot be used while moving. Touch screens are also a good control interface, and nowadays they are used globally in many applications. However, touch screens cannot be applied to desktop systems because of cost and other hardware limitations.
In this paper, we propose a novel approach that uses a video device to control the mouse system. By applying vision technology and controlling the mouse with natural hand gestures, we can reduce the work space required. We employ several image processing algorithms to implement this. The mouse system can control all mouse tasks, such as clicking (right and left), double-clicking and scrolling.
Our project, Human Computer Interaction Where Controlling Computer and Applications Using Image Processing and Voice Recognition, is an attempt at ubiquitous computing. Here, we will be using colored tapes on our fingers. One of the tapes will be used for controlling cursor movement, while the relative distance between the two colored tapes will be used for the click events of the mouse (a sketch of this click rule follows), and the center colored tape will be used for gestures. Also, we will be enriching our system with voice recognition capability to perform basic actions like shutdown, search and surfing. Thus, the system will provide a new experience for users in interacting with the computer.
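To make the click rule concrete, here is a minimal Java sketch; the 40-pixel pinch
threshold and the class name are our assumptions, not values from the project.

import java.awt.AWTException;
import java.awt.Point;
import java.awt.Robot;
import java.awt.event.InputEvent;

/** Sketch of the click-by-distance idea above; the threshold is an illustrative assumption. */
public class TapeClicker {
    private static final double CLICK_DISTANCE_PX = 40.0;
    private final Robot robot;
    private boolean pressed;

    public TapeClicker() throws AWTException {
        this.robot = new Robot();
    }

    /** Fire one left click when the two colored tapes are pinched together. */
    public void onTapesDetected(Point tapeA, Point tapeB) {
        boolean close = tapeA.distance(tapeB) < CLICK_DISTANCE_PX;
        if (close && !pressed) {
            robot.mousePress(InputEvent.BUTTON1_MASK);    // JDK 1.6-era button mask
            robot.mouseRelease(InputEvent.BUTTON1_MASK);
            pressed = true;                               // latch so one pinch = one click
        } else if (!close) {
            pressed = false;                              // re-arm once the tapes separate
        }
    }
}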

4.2 SYSTEM FEATURES
We introduce an effective way of Human Computer Interaction where the proposed system has the following modules:
 Controlling the mouse movements through hand movements.
 Controlling media player options through hand gestures.
 Controlling applications like games, Image Viewer, Maps, etc.
 Performing basic operations through voice recognition, like Search, Shutdown and Restart.
Here we take the user's hand gestures through a webcam and process them using image processing techniques, and we use voice recognition to make the interaction with the system more natural.

4.3 EXTERNAL INTERFACE REQUIREMENT
There are many types of interfaces, such as the User Interface, Software Interface and Hardware Interface.

User Interfaces
The user interface for the software shall be compatible with the Windows operating system, where the user interface is developed using the Java Media Framework, and the camera should be compatible with the system.

Software Interfaces
The system utilizes the JDK (1.6) framework, which provides the necessary components to build system components and objects, plus providing the system with the required data access components.

Communication Interfaces
A graphical user interface is the most convenient way to interact with the system, so in our proposed system we have designed and developed a GUI to interact with the system using Java Swing classes.

4.4 FUNCTIONAL REQUIREMENT

Class                  Function           Requirement
Color_Tape_Detection   Color_Extraction   HSV Color Detection Algorithm
                       blob_Detection     Histogram-based Skin Classifier
Cursor_Move            Mapping            Cursor Control Algorithm, Weighted Speed Cursor Control Algorithm
Voice_Recognition      Voice_Recognise    Dynamic Time Warping Algorithm

Table 1

A sketch of the HSV color detection and cursor mapping steps follows.
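Table 1 names an HSV color detection step and a cursor mapping step. The sketch below
illustrates both under stated assumptions: the saturation/value thresholds and the simple
linear camera-to-screen mapping are ours, not the project's tuned Weighted Speed algorithm.

import java.awt.Color;
import java.awt.Dimension;
import java.awt.Point;
import java.awt.Robot;
import java.awt.Toolkit;
import java.awt.image.BufferedImage;

/** Sketch of HSV tape detection plus camera-to-screen cursor mapping. */
public class TapeTracker {

    /** Mean position of pixels whose hue falls in [hueMin, hueMax] with enough saturation/value. */
    static Point detectTape(BufferedImage frame, float hueMin, float hueMax) {
        long sumX = 0, sumY = 0, count = 0;
        float[] hsv = new float[3];
        for (int y = 0; y < frame.getHeight(); y++) {
            for (int x = 0; x < frame.getWidth(); x++) {
                int rgb = frame.getRGB(x, y);
                Color.RGBtoHSB((rgb >> 16) & 0xFF, (rgb >> 8) & 0xFF, rgb & 0xFF, hsv);
                if (hsv[0] >= hueMin && hsv[0] <= hueMax && hsv[1] > 0.5f && hsv[2] > 0.3f) {
                    sumX += x; sumY += y; count++;   // accumulate the tape-colored blob
                }
            }
        }
        return count == 0 ? null : new Point((int) (sumX / count), (int) (sumY / count));
    }

    /** Linearly map a camera coordinate to the full screen and move the cursor there. */
    static void moveCursor(Robot robot, Point camPoint, int camW, int camH) {
        Dimension screen = Toolkit.getDefaultToolkit().getScreenSize();
        robot.mouseMove(camPoint.x * screen.width / camW,
                        camPoint.y * screen.height / camH);
    }
}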

4.5 NON FUNCTIONAL REQUIREMENT

Performance Requirements
 High speed: the system should process voice messages in parallel for various users to give a quick response, rather than making users wait for process completion.
 Better component design: to get better performance at peak time.

Security Requirements
Secure access to confidential data (the user's details) is required. Information security means protecting information and information systems from unauthorized access, use, disclosure, disruption, modification or destruction. In particular:
1. User passwords must be stored in encrypted form for security reasons (a sketch follows at the end of this section).
2. The voice database must be secure.
3. All user details shall be accessible only to highly authorized persons.
4. Access will be controlled with usernames and passwords.

Extensibility
 Extensibility allows adding new components to the system or replacing existing ones, without affecting the components that remain in their original places. A flexible, service-based architecture will be highly desirable for future extension.

Scalability
 The solution should be able to accommodate a high number of customers and brokers, both of which may be geographically distributed.

Serviceability
 In software engineering and hardware engineering, serviceability (also known as supportability) is one of the aspects of IBM's RASU (Reliability, Availability, Serviceability, and Usability). It refers to the ability of technical support personnel to install, configure, and monitor computer products; identify exceptions or faults; debug or isolate faults to root-cause analysis; and provide hardware or software maintenance in pursuit of solving a problem and restoring the product to service.

Compatibility
 Compatibility is the measure with which a user can extend one type of application with another.
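For security requirement 1 above, here is a minimal sketch of storing a salted one-way
hash instead of the raw password; the class is our illustration using the standard
java.security API, not the project's actual storage code.

import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Minimal sketch of the "store passwords in encrypted form" requirement. */
public class PasswordStore {
    static String hash(String password, String salt) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        byte[] digest = md.digest((salt + password).getBytes());
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) {
            hex.append(String.format("%02x", b));   // store this hex string, never the raw password
        }
        return hex.toString();
    }
}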

4.6 ANALYSIS MODEL

4.6.1 DFD Level 0

Figure 1

4.6.2 DFD Level 1

Figure 2

4.6.3 DFD Level 2

Figure 3

4.6.4 Class Diagram

Figure 4

4.6.5 ER Diagram

Figure 5

4.7 SPECIFIC REQUIREMENTS
The system will be Windows based, supporting versions from Windows XP onwards. The minimum configuration required for the system is:

4.7.1 Hardware:
 CPU: 2.4 GHz (Intel or AMD)
 Hard disk: 80 GB
 RAM: 1 GB
 Camera: 2 megapixel (minimum) at 30 FPS (frames per second)

4.7.2 Software:
 JDK 1.6
 NetBeans IDE
 Java Advanced Imaging
 Java Media Framework
 SAPI

4.8 SYSTEM IMPLEMENTATION PLAN

Sr. No.  Planning                                     Start Date            Completion Date
1.       Topic search and Finalization                4th July 2014         25th July 2014
2.       Literature Survey                            8th August 2014       21st August 2014
3.       Objective and Planning                       22nd August 2014      28th August 2014
4.       Software & Hardware Requirements & Budget    29th August 2014      5th September 2014
5.       DFD, UML diagrams                            8th September 2014    10th September 2014
6.       Algorithm Design                             10th September 2014   13th September 2014
7.       Algorithm Analysis                           15th September 2014   21st October 2014
8.       Preliminary Report                           22nd October 2014     29th September 2014
9.       Study of Project related Technology          1st November 2014     3rd January 2014
10.      Coding and Implementation of Project         5th January 2014      20th February 2014
11.      Working Model and Testing                    24th February 2014    15th March 2014
12.      Tested and Executable Project Model          16th March 2014       29th March 2014
13.      Final Report and Deployment of Project       2nd April 2014        20th April 2014

Table 2

4.9 BUDGET

Sr. No.  Product              Quantity  Cost
1.       Computer             1         20000
2.       Web Camera           1         2500
3.       Windows XP           1         3000
4.       JDK 1.6              1         OSS
5.       NetBeans IDE 7.0.1   1         OSS
6.       STAR UML             1         OSS
         Total                          27500

Note: OSS = Open Source Software
Table 3

CHAPTER 5

5. SYSTEM DESIGN

5.1 SYSTEM ARCHITECTURE
In our proposed system we have 3 main blocks:
1. Functional Block
2. Voice Recognition Module
3. Technology Block

The Functional Block takes the camera feed as input and processes it through image processing algorithms, while the Voice Recognition Module takes voice commands as input and processes them. Both the Functional Block and the Voice Recognition Module perform their processing using the Technology Block (a minimal interface skeleton is sketched after Figure 6).

Figure 6
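Purely as an illustration of this decomposition, the three blocks could be expressed as
the following Java interfaces; these names are ours and are not part of the project's code.

/** Skeleton of the three blocks in Figure 6; interface names are illustrative only. */
interface FunctionalBlock {
    void processFrame(java.awt.image.BufferedImage cameraFrame);  // image-processing pipeline
}

interface VoiceRecognitionModule {
    void processCommand(byte[] voiceSamples);                     // voice-command pipeline
}

/** Shared technology layer (JDK, JAI, JMF, SAPI) that both blocks call into. */
interface TechnologyBlock {
    void moveCursor(int x, int y);
    void fireEvent(String eventName);
}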

5.2 UML DIAGRAMS

5.2.1 Use Case Diagram
Use case diagrams are one of the five diagrams in the UML for modelling the dynamic aspects of systems (activity diagrams, state chart diagrams, sequence diagrams, and collaboration diagrams are the four other kinds of diagrams in the UML for modelling the dynamic aspects of systems). Each one shows a set of use cases and actors and their relationships. Use case diagrams are central to modelling the behaviour of a system, a subsystem, or a class. You apply use case diagrams to model the use case view of a system. For the most part, this involves modelling the context of a system, subsystem, or class, or modelling the requirements of the behaviour of these elements. Use case diagrams are important for visualizing, specifying, and documenting the behaviour of an element. They make systems, subsystems, and classes approachable and understandable by presenting an outside view of how those elements may be used in context. Use case diagrams are also important for testing executable systems through forward engineering and for comprehending executable systems through reverse engineering.

Figure 7

5.2.2 Class Diagram
A class diagram shows a set of classes, interfaces, and collaborations and their relationships. These diagrams are the most common diagram found in modelling object-oriented systems. Class diagrams address the static design view of a system; class diagrams that include active classes address the static process view of a system.

Figure 8

5.2.3 Statechart Diagram
A statechart diagram shows a state machine, consisting of states, transitions, events, and activities. You use statechart diagrams to illustrate the dynamic view of a system. They are especially important in modelling the behaviour of an interface, class, or collaboration. Statechart diagrams emphasize the event-ordered behaviour of an object, which is especially useful in modelling reactive systems.

Figure 9

CHAPTER 6

6. TECHNICAL SPECIFICATION

6.1 TECHNOLOGY USED IN PROJECT

6.1.1 JDK 1.6
JDK (Java Development Kit) is a free software development package from Sun Microsystems that implements the basic set of tools needed to write, test and debug Java applications and applets.

6.1.2 Java Advanced Imaging
The Java Advanced Imaging API extends the Java 2 platform by allowing sophisticated, high-performance image processing to be incorporated into Java applets and applications. It is a set of classes providing imaging functionality beyond that of Java 2D and the Java Foundation Classes, though it is designed for compatibility with those APIs. The API implements a set of core image processing capabilities including image tiling, regions of interest, deferred execution and a set of core image processing operators, including many common point, area, and frequency domain operators. The Java Advanced Imaging API is intended to meet the needs of technical imaging (medical, seismological, remote sensing, etc.) as well as commercial imaging (such as document production and photography). The API can benefit all Java developers who want to incorporate imaging into their Java applets and applications.

Features of JAI:
 Rich set of functionality for digital imaging.
 Support for a wide variety of data types.
 Deferred execution.
 Remote imaging and truly distributed imaging.
 High level of extensibility to allow arbitrary processing capabilities.
 Allows multiple implementations with different trade-offs of memory usage, operator optimization, and hardware acceleration.
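As a small example of what using JAI looks like in practice, the following sketch loads an
image and collapses it to a single gray band; the file names are placeholders and the
operator parameters follow the standard JAI operator descriptors.

import java.awt.image.renderable.ParameterBlock;
import javax.media.jai.JAI;
import javax.media.jai.PlanarImage;

/** Minimal JAI sketch: load a frame from disk and reduce it to one luminance band. */
public class JaiExample {
    public static void main(String[] args) {
        PlanarImage img = JAI.create("fileload", "hand.jpg");      // placeholder file name

        ParameterBlock pb = new ParameterBlock();
        pb.addSource(img);
        pb.add(new double[][] { { 0.299, 0.587, 0.114, 0.0 } });   // luminance weights for R, G, B
        PlanarImage gray = JAI.create("bandcombine", pb);

        JAI.create("filestore", gray, "hand-gray.png", "PNG");     // write the result back out
    }
}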

6.1.3 Java Media Framework
JMF is a framework for handling streaming media in Java programs. It is an optional package for the Java 2 standard platform. JMF provides a unified architecture and messaging protocol for managing the acquisition, processing and delivery of time-based media. JMF enables Java programs to:
 Present (play back) multimedia content.
 Capture audio through a microphone and video through a camera.
 Do real-time streaming of media over the Internet.
 Process media (such as changing media format, adding special effects).
 Store media into a file.

Features of JMF:
 JMF supports many popular media formats such as JPEG, MPEG-1, MPEG-2, QuickTime, AVI, WAV, MP3, GSM, G723, H263, and MIDI.
 JMF supports popular media access protocols such as file, HTTP, HTTPS, FTP, RTP, and RTSP.
 JMF supports the reception and transmission of media streams using the Real-time Transport Protocol (RTP), as well as the management of RTP sessions.
 JMF provides a plug-in architecture that allows JMF to be customized and extended; technology providers can extend JMF to support additional media formats, protocols and delivery mechanisms, and codecs, possibly using hardware accelerators, can be defined and integrated with JMF.
 High performance custom implementations of media players.
 JMF scales across different media data types.
 JMF uses the "Factory" design pattern, which simplifies the creation of JMF objects, and a well-defined event reporting mechanism that follows the "Observer" design pattern.
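A minimal sketch of capturing one webcam frame with JMF, the kind of step the camera feed
in this project relies on; the vfw:// locator is a Windows-specific assumption and error
handling is omitted for brevity.

import java.awt.Image;
import javax.media.Buffer;
import javax.media.Manager;
import javax.media.MediaLocator;
import javax.media.Player;
import javax.media.control.FrameGrabbingControl;
import javax.media.format.VideoFormat;
import javax.media.util.BufferToImage;

/** Sketch of grabbing a single webcam frame with JMF. */
public class JmfFrameGrab {
    public static void main(String[] args) throws Exception {
        MediaLocator cam = new MediaLocator("vfw://0");            // first capture device (Windows)
        Player player = Manager.createRealizedPlayer(cam);
        player.start();
        Thread.sleep(2000);                                        // let the camera warm up

        FrameGrabbingControl fgc = (FrameGrabbingControl)
            player.getControl("javax.media.control.FrameGrabbingControl");
        Buffer buf = fgc.grabFrame();
        Image frame = new BufferToImage((VideoFormat) buf.getFormat()).createImage(buf);
        // "frame" can now be handed to the image-processing pipeline
        player.close();
    }
}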

6.2 ADVANTAGES
 The hand becomes an acceptable tool to control a computer cursor.
 It will enable people to interact with computers without physical contact.
 It will give a more natural way to interact with the computer.
 The finger mouse and voice control benefit other applications, such as commercials and interactive advertisements.

6.3 APPLICATIONS
 Our system can be used for human computer interaction.
 It can also be used to control the computer.
 It can be used to control various software applications.
 Our system can be used to control PowerPoint presentations.
 Our system can be used to control or play games.

CHAPTER 7

7. CONCLUSION
The product that we are trying to develop will improve the way people use the computer system. Presently, the webcam, microphone and mouse are integral parts of the computer system. Our product, which uses only two of them, the webcam and the microphone, may eliminate the mouse. This would lead to a new era of Human Computer Interaction (HCI) in which no physical contact with the device is required. The technology can be further enhanced for use in robotics, gaming, and developing systems that can understand human behaviour based on the way people interact.

CHAPTER 8

REFERENCES

1. Hojoon Park. A Method for Controlling Mouse Movement using a Real-Time Camera.
2. A. Erdem, E. Yardimci, Y. Atalay, A. E. Cetin. Computer vision based mouse. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2002.
3. Robertson P., Laddaga R., Van Kleek M. Virtual mouse vision based interface. January 2004.
4. Shahzad Malik. Real-time Hand Tracking and Finger Tracking for Interaction. CSC2503F Project Report, December 18, 2003.
5. Asanterabi Malima, Erol Ozgur, and Mujdat Cetin. A Fast Algorithm for Vision-Based Hand Gesture Recognition for Robot Control. In Proceedings of the IEEE.
6. J. Segen, S. Kumar. Shadow gestures: 3D hand pose estimation using a single camera. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 1999, pp. 479-485.
7. Y. Sato, Y. Kobayashi, H. Koike. Fast tracking of hands and fingertips in infrared images for augmented desk interface. In Proceedings of the IEEE International Conference on Automatic Face and Gesture Recognition (FG), 2000, pp. 462-467.
8. James M. Rehg, Takeo Kanade. DigitEyes: Vision-Based Hand Tracking for Human-Computer Interaction. In Proceedings of the IEEE Workshop on Motion of Non-Rigid and Articulated Objects, Austin, Texas, November 1994, pp. 16-22.
9. Chu-Feng Lien. Portable Vision-Based HCI: A Real-time Hand Mouse System on Handheld Devices.
10. Hart Lambur, Blake Shaw, Stephen Tu. Gesture Recognition. CS4731 Project, White Paper, December 21.
11. HSV Color Detection Algorithm.

ANNEXURE A: Project Analysis of Algorithm Design

Project Analysis:

Figure 10

Let G be a closed graph that represents our system of mouse simulation and application control, such that G = {E, V}, where E represents the set of edges and V the set of vertices. In the graphical representation of the system, the vertices in the set V represent the modules, which are connected through directed edges in the set E representing the inputs/outputs of the modules. Here E = {e0, e1, e2, e3, ..., e14} and V = {v0, v1, v2, v3, ..., v10}.

We define the vertices as:

VERTEX   MODULE
v0       Initialize
v1       Image Capture
v2       Voice Capture
v3       Finger Tape Colour Detect
v4       Voice Recognition
v5       Event Detect
v6       Mouse Operation Events
v7       Operation Gesture Events
v8       Key Strokes
v9       Voice Operations
v10      Aggregation

Table 4

We define the edges as:

EDGE   INPUT/OUTPUT
e0     Call to camera
e1     Call to microphone
e2     Image frames
e3     Voice records
e4     Pixel position
e5     Distance between tapes and wait time
e6     Voice command
e7     Operation gesture detected
e8     Gesture for key stroke
e9     Voice command

e10    Call to aggregation module
e11    Call to aggregation module
e12    Call to aggregation module
e13    Call to aggregation module
e14    Iteration call

Table 5

Let fe be a rule of E into V such that for a given edge it returns vertices, fe(E) |→ V. Thus, for our system:

fe(e0) = {v1} : v1 is called using e0 to capture an image.
fe(e1) = {v2} : v2 is called using e1 to capture voice.
fe(e2) = {v3} : frames are passed to v3 using e2 for detection.
fe(e3) = {v4} : voice data is passed to v4 using e3 for recognition.
fe(e4) = {v6} : the pixel position is passed to v6 using e4 for cursor movement.
fe(e5) = {v5} : the distance between the coloured tapes, or the wait time, is passed to v5 using e5 for event detection.
fe(e6) = {v5} : the voice command is passed to v5 using e6 for event detection.
fe(e7) = {v7} : v7 is called for an operation gesture event using e7.
fe(e8) = {v8} : v8 is called for a key stroke event using e8.
fe(e9) = {v9} : v9 is called for a voice operation event using e9.
fe(e10) = fe(e11) = fe(e12) = fe(e13) = {v10} : e10, e11, e12 and e13 aggregate at v10.
fe(e14) = {v0} : v0 is called again to iterate using e14.

Overloading on {e4, e5, e6}:
The mouse movement and events are overloaded using the pixel position (e4), the distance between the coloured tapes and the wait time (e5), and voice commands (e6).

COMPLEXITY
Our system involves three main modules:
 Image recognition and analysis (v3)
 Voice recognition and analysis (v4)
 Event selection (v5)

For a standard image recognition and analysis module/system, the complexity is given as

O( mn + (mn/k^2) log(mn/k^2) )

where m x n are the width and height of the image and k x k is the segmentation blob, respectively.

For a standard voice recognition module/system, the complexity is quadratic in nature and is given as

O( n^2 v )

where v is the number of words in the dictionary and n is the length of the sequence, respectively.

The complexity of the event selection module depends on the number of events involved in it; thus, for n events, the complexity is

O(n).

Thus, the total complexity of our system is given as

Total complexity = Image recognition complexity + Voice recognition complexity + Event selection complexity
                 = O( mn + (mn/k^2) log(mn/k^2) ) + O( n^2 v ) + O(n)

Hence, the overall complexity of our system comes out to be nearly O(n^2).

P Class Problem
A problem is in the P class if it is solvable in polynomial time by a deterministic algorithm. For our system, the algorithms are deterministic and the overall complexity is O(n^2), which shows that the problem is in the P class.
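The quadratic voice-recognition term above comes from the Dynamic Time Warping algorithm
named in Table 1. Here is a textbook DTW distance in Java; the O(n x m) table it fills is
exactly the quadratic cost incurred per dictionary word. This is a generic sketch, not the
project's implementation.

/** Generic dynamic time warping distance between two feature sequences. */
public class Dtw {
    static double distance(double[] a, double[] b) {
        int n = a.length, m = b.length;
        double[][] d = new double[n + 1][m + 1];
        for (int i = 0; i <= n; i++)
            for (int j = 0; j <= m; j++)
                d[i][j] = Double.POSITIVE_INFINITY;
        d[0][0] = 0.0;
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                double cost = Math.abs(a[i - 1] - b[j - 1]);      // local frame distance
                d[i][j] = cost + Math.min(d[i - 1][j - 1],        // match
                                 Math.min(d[i - 1][j],            // insertion
                                          d[i][j - 1]));          // deletion
            }
        }
        return d[n][m];                                           // total warped alignment cost
    }
}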

ANNEXURE B: Project Quality and Reliability Testing of Project Design

Testing is the process of evaluating a system or its component(s) with the intent to find whether it satisfies the specified requirements or not. In simple words, testing is executing a system in order to identify any gaps, errors or missing requirements contrary to the actual desired requirements; this activity results in the actual results, the expected results, and the difference between them. There are different types of testing which may be used to test software during the SDLC, and different methods which can be used for software testing:

 Manual testing: This type includes testing the software manually, i.e., without using any automated tool or any script.
 Automation testing: Automation testing, which is also known as Test Automation, is when the tester writes scripts and uses other software to test the software.
 White box testing: White box testing is the detailed investigation of the internal logic and structure of the code. White box testing is also called glass testing or open box testing.
 Black box testing: The technique of testing without having any knowledge of the interior workings of the application is black box testing.
 Grey box testing: Grey box testing is a technique to test the application with limited knowledge of the internal workings of the application.

ANNEXURE C: ABBREVIATIONS

A
AMD: Advanced Micro Devices
API: Application Programming Interface
AVI: Audio Video Interleave

C
CPU: Central Processing Unit

D
DFD: Data Flow Diagram

E
ER Diagram: Entity Relationship Diagram

F
FPS: Frames per Second
FTP: File Transfer Protocol

G
GB: Gigabyte
GUI: Graphical User Interface
GSM: Global System for Mobile

H
HCI: Human Computer Interaction
HSV: Hue Saturation Value
HTTP: Hyper Text Transfer Protocol
HTTPS: Hyper Text Transfer Protocol Secure

I
IGOR: Intelligent Gaze Oriented Robot
IBM: International Business Machines
IDE: Integrated Development Environment

J
JDK (1.6): Java Development Kit 1.6
JAI: Java Advanced Imaging
JMF: Java Media Framework
JPEG: Joint Photographic Experts Group

M
MPEG-1: Motion Picture Experts Group 1
MPEG-2: Motion Picture Experts Group 2
MP3: MPEG-2 Audio Layer III
MIDI: Musical Instrument Digital Interface

O
OSS: Open Source Software

P
PC: Personal Computer

R
RASU: Reliability, Availability, Serviceability, and Usability
RAM: Random Access Memory
RTP: Real-Time Transport Protocol
RTSP: Real-Time Streaming Protocol

S
SAPI: Speech Application Programming Interface
SDLC: Software Development Life Cycle

U
UML: Unified Modelling Language

W
WAV: WAVEform Audio Format

Numbers
3D: Three Dimensional
2D: Two Dimensional