 
CMPUT 498 Computer Vision
Vision-based Mouse Interaction Model
Mark McElhinney
April 26, 2005
Overview:
Computers are an integral and essential part of every industry today. However, there remains only one primary method of interaction with most computers, namely the mouse. Although practical and efficient in most circumstances, many situations would be better served by other means such as touch-sensitive monitors. Many companies have created just such devices: desktop monitors with touch sensitivity, or large-scale interactive whiteboards that work in conjunction with projectors (SMART Technologies Inc, GTCO, etc).

Two of the primary set-ups for most interactive monitors are shown below:

Fig 1.1: Rear (left) and Front (right) Projection SMARTBoards
Pictures taken from www.smarttech.com

The technology in both cases limits the degree of interaction, however. In the case of the front-projection setup, a multi-layered proprietary canvas detects contact when its layers are pressed together with enough force. This limits the number of reported contacts to one at a time, and no information about the size of the contact is available. In the case of the rear-projection DViT (Digital Vision Touch) system, 3 or 4 digital cameras are used to triangulate the location, size, and number of contacts. However, these cameras are located in the corners of the screen and look for contact parallel to the screen's surface. As a result, only a small number (2-3) of contacts can be reported, and fewer when one contact occludes another.

It would be more useful for some purposes to be able to determine larger numbers of contacts, the size of individual contacts, and even the shape of contacts. This would allow a much more diverse interaction model with a far greater degree of flexibility (imagine a whole-hand contact acting as a panning tool, and a single finger acting as a mouse).
 
It is the goal of this project to set up the framework and prove the feasibility (or lack of feasibility) of a touch-based interaction system based on a rear-projection and rear-capture camera setup.
Hardware:
Like other rear-projection devices, I used a home-made combination of a projector, mirror, Plexiglas surface, and a plywood cabinet to perform my tests. My original thoughts led me to a design such as the one below:

While building it, however, I realized that a more usable device could be created by effectively tilting the entire device 25-30 degrees backwards. This allows the cabinet to sit on the floor while I perform my tests, and gives the camera a direct line of sight perpendicular to the touching surface. Additionally, a projector was added behind the camera (projecting directly above the camera) to make the set-up complete. The following are some images of the test set-up:
 
Although made very inexpensively, the setup produced quite good results. The front fogged-glass surface used to display the projection was made of a clear sheet of Plexiglas sanded with 120 grit sandpaper. Other surfaces may have performed better, but for my purposes this was adequate. I used a basic FireWire webcam for image capture and a Videoball projector to project the desktop image onto the mirror and ultimately onto the screen.
Software:
In order to track contacts on the Plexiglas surface, I had to design a software component to analyze captured images and interpret them as contacts. To do this, I used Visual C++ on a Windows XP machine. I developed an MFC forms application that displays the raw video captured from the webcam in one frame and displays either the filtered and post-processed image or sequence of images in the other frame.

I implemented two separate methods to attempt to isolate regions of the image that corresponded to screen contacts. For the first, I created a section where I could capture individual frames from the input source and incrementally apply image processing techniques. In this section, I implemented the following image processing algorithms:
- thresholding between two values
- 3x3 median filter
- 3x3 mean filter
- pixel removal based on key-stoned image
- conversion to grayscale

The second and most useful method, seen on the right of the screenshot, performs a sequence of processing techniques in a set order on one or more images. By choosing 'Process Video', a timer is used to process a frame at set intervals, displaying the X and Y coordinates of the centroid of the most probable contact on the screen in the labels below.

As seen on the next page, a contact is detected and a bounding box created around it. Additionally, the centroid is reported at (190, 156).
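The report does not include its source code, so the following is only a minimal sketch of how thresholding between two values and the bounding-box and centroid computation might fit together. The `GrayImage` and `Contact` types and the function names `thresholdBetween` and `findContact` are hypothetical, not taken from the original application. As a simplifying assumption, every pixel that survives the threshold is treated as part of a single contact; the report's way of choosing the "most probable" contact is not described and is not reproduced here.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 8-bit grayscale frame, stored row-major (not the report's actual type).
struct GrayImage {
    int width;
    int height;
    std::vector<std::uint8_t> pixels;  // width * height entries
};

// Hypothetical result record: inclusive bounding box plus centroid in pixel coordinates.
struct Contact {
    int left, top, right, bottom;
    double cx, cy;
    std::size_t area;  // number of foreground pixels
};

// Thresholding between two values: pixels with lo <= v <= hi become 255, all others 0.
GrayImage thresholdBetween(const GrayImage& in, std::uint8_t lo, std::uint8_t hi) {
    GrayImage out{in.width, in.height, std::vector<std::uint8_t>(in.pixels.size(), 0)};
    for (std::size_t i = 0; i < in.pixels.size(); ++i) {
        if (in.pixels[i] >= lo && in.pixels[i] <= hi) out.pixels[i] = 255;
    }
    return out;
}

// Assumption: all non-zero mask pixels belong to one contact.
// Returns false (leaving result untouched) when the mask has no foreground pixels.
bool findContact(const GrayImage& mask, Contact& result) {
    long long sumX = 0, sumY = 0;
    std::size_t count = 0;
    int left = mask.width, top = mask.height, right = -1, bottom = -1;

    for (int y = 0; y < mask.height; ++y) {
        for (int x = 0; x < mask.width; ++x) {
            if (mask.pixels[static_cast<std::size_t>(y) * mask.width + x] == 0) continue;
            sumX += x;
            sumY += y;
            ++count;
            if (x < left) left = x;
            if (x > right) right = x;
            if (y < top) top = y;
            if (y > bottom) bottom = y;
        }
    }
    if (count == 0) return false;

    // The centroid is the mean position of the foreground pixels.
    result = Contact{left, top, right, bottom,
                     static_cast<double>(sumX) / static_cast<double>(count),
                     static_cast<double>(sumY) / static_cast<double>(count),
                     count};
    return true;
}
```

In a timer-driven loop like the one described above, each captured frame would be converted to grayscale, passed through `thresholdBetween`, and then handed to `findContact`; the resulting `cx` and `cy` values are what would be shown in the coordinate labels. A median filter applied before `findContact` would likely help keep isolated noisy pixels from skewing the bounding box.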
