
SixthSense

Manisha Sharma
Computer Science
Dronacharya College of Engineering
manisha.s2212@gmail.com
Abstract

'SixthSense' is a wearable gestural interface that augments the physical world around us with digital information and lets us use natural hand gestures to interact with that information. It was developed by Pranav Mistry, a PhD student in the Fluid Interfaces Group at the MIT Media Lab. Using a camera and a tiny projector mounted in a pendant-like wearable device, SixthSense sees what the user sees and visually augments the surfaces and objects the user is interacting with. It projects information onto surfaces, walls, and physical objects around us, and lets us interact with the projected information through natural hand gestures, arm movements, or our interaction with the object itself. This wearable gestural interface attempts to free information from its confines by seamlessly integrating it with reality, thus making the entire world your computer.

Introduction

'SixthSense' is a wearable gestural interface that augments the physical world around us with digital information and lets us use natural hand gestures to interact with that information. We've evolved over millions of years to sense the world around us. When we encounter something, someone or some place, we use our five natural senses to perceive information about it; that information helps us make decisions and choose the right actions to take. But arguably the most useful information that can help us make the right decision is not naturally perceivable with our five senses, namely the data, information and knowledge that mankind has accumulated about everything, which is increasingly available online. Although the miniaturization of computing devices allows us to carry computers in our pockets, keeping us continually connected to the digital world, there is no link between our digital devices and our interactions with the physical world. Information is traditionally confined to paper or to a screen. SixthSense bridges this gap, bringing intangible digital information out into the tangible world and allowing us to interact with it via natural hand gestures. 'SixthSense' frees information from its confines by seamlessly integrating it with reality, thus making the entire world your computer.

The SixthSense prototype comprises a pocket projector, a mirror and a camera. The hardware components are coupled in a pendant-like mobile wearable device. Both the projector and the camera are connected to the mobile computing device in the user's pocket. The projector projects visual information, enabling surfaces, walls and physical objects around us to be used as interfaces, while the camera recognizes and tracks the user's hand gestures and physical objects using computer-vision based techniques. The software program processes the video stream data captured by the camera and tracks the locations of the colored markers (visual tracking fiducials) at the tips of the user's fingers using simple computer-vision techniques. The movements and arrangements of these fiducials are interpreted into gestures that act as interaction instructions for the projected application interfaces. The maximum number of tracked fingers is constrained only by the number of unique fiducials, so SixthSense also supports multi-touch and multi-user interaction.

The SixthSense prototype implements several applications that demonstrate the usefulness, viability and flexibility of the system. The map application lets the user navigate a map displayed on a nearby surface using hand gestures, similar to gestures supported by multi-touch based systems, letting the user zoom in, zoom out or pan using intuitive hand movements. The drawing application lets the user draw on any surface by tracking the fingertip movements of the user's index finger. SixthSense also recognizes the user's freehand gestures (postures). For example, the SixthSense system implements a gestural camera that takes photos of the scene the user is looking at by detecting the 'framing' gesture. The user can stop by any surface or wall and flick through the photos he or she has taken. SixthSense also lets the user draw icons or symbols in the air using the movement of the index finger and recognizes those symbols as interaction instructions. For example, drawing a magnifying glass symbol takes the user to the map application, while drawing an '@' symbol lets the user check his mail. The SixthSense system also augments the physical objects the user is interacting with by projecting additional information about those objects onto them. For example, a newspaper can show live video news, or dynamic information can be provided on a regular piece of paper. The gesture of drawing a circle on the user's wrist projects an analog watch.
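The colored-marker tracking described above can be illustrated in a few lines. The sketch below is a simplified, hypothetical version (not Mistry's actual implementation): it locates one red fingertip marker in a single RGB frame by thresholding and taking the centroid of the matching pixels. A real tracker would run on a live video stream and follow four marker colors, one per fiducial.

```python
import numpy as np

def find_marker(frame, min_pixels=10):
    """Return the (row, col) centroid of red-ish pixels, or None if absent."""
    r = frame[:, :, 0].astype(int)
    g = frame[:, :, 1].astype(int)
    b = frame[:, :, 2].astype(int)
    # "Red" here means the red channel clearly dominates the other two;
    # the thresholds are illustrative, not tuned values from the prototype.
    mask = (r > 150) & (r - g > 60) & (r - b > 60)
    ys, xs = np.nonzero(mask)
    if len(xs) < min_pixels:
        return None  # marker not visible in this frame
    return (int(ys.mean()), int(xs.mean()))

# Synthetic 100x100 frame with a red 5x5 blob centered near (30, 70)
frame = np.zeros((100, 100, 3), dtype=np.uint8)
frame[28:33, 68:73] = (255, 20, 20)
print(find_marker(frame))  # -> (30, 70)
```

Feeding successive centroids into a gesture interpreter is what turns raw marker positions into the interaction instructions mentioned above.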

Basic Components

SixthSense consists of commonly available components that are intrinsic to its functioning: a camera, a portable battery-powered projection system coupled with a mirror, and a cell phone. All these components communicate with the cell phone, which acts as the communication and computation device. The entire hardware apparatus is encompassed in a pendant-shaped mobile wearable device. Basically, the camera recognises individuals, images, pictures and the gestures one makes with the hands, while the projector projects information onto whatever type of surface is present in front of the person. The mirror is significant because the projector dangles pointing downwards from the neck. In the demo video broadcast to showcase the prototype to the world, Mistry wears coloured caps on his fingers so that it becomes simpler for the software to differentiate between the fingers across various applications. The software program analyses the video data captured by the camera and tracks the locations of the coloured markers using simple computer vision techniques. One can have any number of hand gestures and movements as long as they can all be reasonably identified and differentiated by the system, preferably through unique and varied fiducials. This is possible because the SixthSense device supports multi-touch and multi-user interaction.

Hardware Setup

Camera
Captures an object in view and tracks the user's hand gestures, then sends the data to the smartphone. It acts as a digital eye, connecting the user to the world of digital information.

Projector
The projector projects visual information, enabling surfaces and physical objects to be used as interfaces. The projector contains an internal battery with 3 hours of battery life. A tiny LED projector displays data sent from the smartphone on any surface in view: object, wall, or person. The pocket projector PK101 from Optoma is used; it is suitable for mobile usage.

Mirror
The mirror is significant because the projector dangles pointing downwards from the neck. The mirror is used to focus the projection onto a surface.

Mobile Component
A Web-enabled smartphone in the user's pocket processes the video data. Other software searches the Web and interprets the hand gestures. A Nokia N95 smartphone is used, running Symbian OS (S60 edition). Its multitasking capability and built-in camera support execution of both the gesture-tracking engine and the gesture-enabled application.

Colored Markers
Marking the tips of the user's fingers with red, yellow, green and blue tape helps the webcam recognize gestures. The movements and arrangements of these markers are interpreted into gestures that act as interaction instructions for the projected application interfaces.

Software Setup

Applications are implemented using Java 2 Micro Edition, a Java platform designed for embedded systems whose target devices range from industrial controls to mobile phones. The computer vision library used for gesture tracking is written in Symbian C++. The software for the SixthSense prototype is developed on a Microsoft Windows platform using C#, WPF and OpenCV. The software works on the basis of computer vision: a small camera acts as an eye, connecting the user to the world of digital information, while processing happens on the mobile phone using computer vision algorithms. Approximately 50,000 lines of code are used.

Kinds Of Gestures Recognized

MULTI-TOUCH GESTURES are like the ones we see on the iPhone, where we touch the screen and make the map move by pinching and dragging.
FREEHAND GESTURES are like the one used to take a picture, or a namaste gesture to start the projection on the wall.
ICONIC GESTURES involve drawing an icon in the air: whenever we draw a star, show us the weather details; when we draw a magnifying glass, show us the map. This system is very customizable. We can define our own gestures for the SixthSense device to understand and adapt SixthSense to our needs.

Applications

Make a call
The SixthSense prototype can project a keypad onto your hand; you then use that virtual keypad to make a call.
Call up a map
With the map application we can call up the map of our choice and then use thumbs and index fingers to navigate the map.
Time details
The user can draw a circle on the wrist to get a virtual watch that gives the correct time.
Multimedia reading experiences
SixthSense can enrich a user's multimedia experiences. It can be programmed to project related videos onto newspaper articles you are reading.
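The map navigation described above, where spreading or pinching the thumb and index fingers zooms the projected map, reduces to a simple rule: compare the distance between the two fingertip markers across frames. The function below is a hypothetical sketch of that rule only, not the actual SixthSense code; marker positions are assumed to come from a tracker upstream.

```python
import math

def zoom_factor(prev, curr, sensitivity=1.0):
    """prev/curr: ((x_thumb, y_thumb), (x_index, y_index)) marker positions.

    Returns a multiplicative zoom: >1 means the fingers spread (zoom in),
    <1 means they pinched together (zoom out).
    """
    def dist(pair):
        (x1, y1), (x2, y2) = pair
        return math.hypot(x2 - x1, y2 - y1)

    d0, d1 = dist(prev), dist(curr)
    if d0 == 0:
        return 1.0  # degenerate frame: leave the zoom level unchanged
    return (d1 / d0) ** sensitivity

prev = ((0, 0), (30, 40))   # markers 50 px apart
curr = ((0, 0), (60, 80))   # markers 100 px apart: fingers spread
print(zoom_factor(prev, curr))  # -> 2.0
```

Panning can be handled the same way, by tracking the displacement of the midpoint between the two markers instead of their separation.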
Drawing applications
The drawing application lets the user draw on any surface by tracking the fingertip movements of the user's index finger.
Zooming features
The user can zoom in or zoom out using intuitive hand movements.
Access book information
The system can project Amazon ratings on a book, as well as reviews and other relevant information.
Access product information
SixthSense uses image recognition or marker technology to recognize products we pick up, and then feeds us information on those products.
Flight updates
The system will recognize your boarding pass and let you know whether your flight is on time and whether the gate has changed.
Take pictures
If you fashion your index fingers and thumbs into a square (the "framing" gesture), the system will snap a photo. After taking the desired number of photos, we can project them onto a surface, and use gestures to sort through the photos, and organize and resize them.

Related Technologies

Augmented reality

Augmented reality (AR) is a term for a live direct or indirect view of a physical real-world environment whose elements are augmented by virtual computer-generated sensory input such as sound or graphics. It is related to a more general concept called mediated reality, in which a view of reality is modified (possibly even diminished rather than augmented) by a computer. As a result, the technology functions by enhancing one's current perception of reality. In the case of augmented reality, the augmentation is conventionally in real time and in semantic context with environmental elements, such as sports scores on TV during a match. With the help of advanced AR technology (e.g. adding computer vision and object recognition), the information about the user's surrounding real world becomes interactive and digitally usable. Artificial information about the environment and the objects in it can be stored and retrieved as an information layer on top of the real-world view. The term augmented reality is believed to have been coined in 1990 by Thomas Caudell, an employee of Boeing at the time [1]. Augmented reality research explores the application of computer-generated imagery in live video streams as a way to expand the real world. Advanced research includes the use of head-mounted displays and virtual retinal displays for visualization purposes, and the construction of controlled environments containing any number of sensors and actuators.

Gesture recognition

Gesture recognition is a topic in computer science and language technology with the goal of interpreting human gestures via mathematical algorithms. Gestures can originate from any bodily motion or state but commonly originate from the face or hand. Current focuses in the field include emotion recognition from the face and hand gesture recognition. Many approaches have been made using cameras and computer vision algorithms to interpret sign language. However, the identification and recognition of posture, gait, proxemics, and human behaviors is also the subject of gesture recognition techniques [1].

Gesture recognition can be seen as a way for computers to begin to understand human body language, thus building a richer bridge between machines and humans than primitive text user interfaces or even GUIs (graphical user interfaces), which still limit the majority of input to keyboard and mouse. Gesture recognition enables humans to interface with the machine (HMI) and interact naturally without any mechanical devices. Using the concept of gesture recognition, it is possible to point a finger at the computer screen so that the cursor moves accordingly. This could potentially make conventional input devices such as mice, keyboards and even touchscreens redundant. Gesture recognition can be conducted with techniques from computer vision and image processing. The literature includes ongoing work in the computer vision field on capturing gestures, or more general human pose and movements, with cameras connected to a computer [2][3][4][5]. In some literature, the term gesture recognition has been used to refer more narrowly to non-text-input handwriting symbols, such as inking on a graphics tablet, multi-touch gestures, and mouse gesture recognition. This is computer interaction through the drawing of symbols with a pointing device cursor.

Computer vision

Computer vision is the science and technology of machines that see, where "see" in this case means that the machine is able to extract from an image the information necessary to solve some task. As a scientific discipline, computer vision is concerned with the theory behind artificial systems that extract information from images. The image data can take many forms, such as video sequences, views from multiple cameras, or multi-dimensional data from a medical scanner. As a technological discipline, computer vision seeks to apply its theories and models to the construction of computer vision systems. Examples of applications of computer vision include systems for controlling processes (e.g., an industrial robot or an autonomous vehicle), detecting events (e.g., for visual surveillance or people counting),
organizing information (e.g., indexing databases of images and image sequences), modeling objects or environments (e.g., industrial inspection, medical image analysis) and interaction (e.g., as the input to a device for computer-human interaction).

Computer vision is closely related to the study of biological vision. The field of biological vision studies and models the physiological processes behind visual perception in humans and other animals. Computer vision, on the other hand, studies and describes the processes implemented in software and hardware behind artificial vision systems. Interdisciplinary exchange between biological and computer vision has proven fruitful for both fields. Computer vision is, in some ways, the inverse of computer graphics. While computer graphics produces image data from 3D models, computer vision often produces 3D models from image data. There is also a trend towards a combination of the two disciplines, e.g., as explored in augmented reality. Sub-domains of computer vision include scene reconstruction, event detection, video tracking, object recognition, learning, indexing, motion estimation, and image restoration.

Radio Frequency Identification Technology

Radio frequency identification (RFID) is a generic term used to describe a system that transmits the identity (in the form of a unique serial number) of an object or person wirelessly, using radio waves. It is basically an electronic tagging technology that allows the detection and tracking of tags, and consequently of the objects they are affixed to. It is grouped under the broad category of automatic identification technologies. RFID is in use all around us. If you have ever chipped your pet with an ID tag, used E-ZPass through a toll booth, or paid for gas using Speedpass, you've used RFID. In addition, RFID is increasingly used with biometric technologies for security. Unlike ubiquitous UPC bar-code technology, RFID does not require contact or line of sight for communication. RFID data can be read through the human body, clothing and non-metallic materials.

Conclusion

Information is often confined to paper or computer screens. SixthSense frees data from these confines and seamlessly integrates information and reality. With the miniaturization of computing devices, we are always connected to the digital world, but there is no link between our interactions with these digital devices and our interactions with the physical world. SixthSense bridges this gap by augmenting the physical world with digital information, bringing intangible information into the tangible world. SixthSense recognizes the objects around us, displaying information automatically and letting us access it in any way we need. The prototype implements several applications that demonstrate the usefulness, viability and flexibility of the system, allowing us to interact with this information via natural hand gestures. It has the potential of becoming the ultimate "transparent" user interface for accessing information about everything around us.

References

1. http://www.technologyreview.com/TR35/Profile.aspx?TRID=816
2. PowerPoint presentation on Sixth Sense Technology by Sandeep S., 4pa05ec091. http://www.scribd.com/doc/30495435/Sixth-Sense-Technology#open_download
3. http://www.pranavmistry.com/projects/sixthsense/
4. http://news.softpedia.com/news/Next-Gen-039-Sixth-Sense-039-Device-Created-at-MIT-103879.shtml
5. http://en.wikipedia.org/wiki/Augmented_reality
6. http://en.wikipedia.org/wiki/Gesture_recognition
7. http://en.wikipedia.org/wiki/Computer_vision
8. http://www.aimglobal.org/technologies/RFID/what_is_rfid.asp
9. http://news.bbc.co.uk/2/hi/technology/7997961.stm
10. http://boingboing.net/2009/11/12/sixth-sense-technolo.html
11. http://gizmodo.com/5167790/sixth-sense-technology-may-change-how-we-look-at-the-world-forever
12. http://www.ted.com/talks/pattie_maes_demos_the_sixth_sense.html
13. http://theviewspaper.net/sixth-sense-technology-will-revolutionize-the-world/
14. http://www.freshcreation.com/entry/sixth_sense_technology/
15. Dana H. Ballard and Christopher M. Brown (1982). Computer Vision. Prentice Hall. ISBN 0131653164. http://homepages.inf.ed.ac.uk/rbf/BOOKS/BANDB/bandb.htm
16. David Marr (1982). Vision. W. H. Freeman and Company. ISBN 0-7167-1284-9.
17. Azriel Rosenfeld and Avinash Kak (1982). Digital Picture Processing. Academic Press. ISBN 0-12-597301-2.
18. Berthold Klaus Paul Horn (1986). Robot Vision. MIT Press. ISBN 0-262-08159-8.
19. Olivier Faugeras (1993). Three-Dimensional Computer Vision, A Geometric Viewpoint. MIT Press. ISBN 0-262-06158-9.
20. Tony Lindeberg (1994). Scale-Space Theory in Computer Vision. Springer. ISBN 0-7923-9418-6. http://www.nada.kth.se/~tony/book.html
21. James L. Crowley and Henrik I. Christensen (Eds.) (1995). Vision as Process. Springer-Verlag. ISBN 3-540-58143-X and ISBN 0-387-58143-X.
22. Gösta H. Granlund and Hans Knutsson (1995). Signal Processing for Computer Vision. Kluwer Academic Publisher. ISBN 0-7923-9530-1.
23. Reinhard Klette, Karsten Schluens and Andreas Koschan (1998). Computer Vision - Three-Dimensional Data from Images. Springer, Singapore. ISBN 981-3083-71-9.
24. Emanuele Trucco and Alessandro Verri (1998). Introductory Techniques for 3-D Computer Vision. Prentice Hall. ISBN 0132611082.
25. Milan Sonka, Vaclav Hlavac and Roger Boyle (1999). Image Processing, Analysis, and Machine Vision. PWS Publishing. ISBN 0-534-95393-X.
26. Bernd Jähne and Horst Haußecker (2000). Computer Vision and Applications, A Guide for Students and Practitioners. Academic Press. ISBN 0-12-379777-2.
27. Linda G. Shapiro and George C. Stockman (2001). Computer Vision. Prentice Hall. ISBN 0-13-030796-3.
28. Bernd Jähne (2002). Digital Image Processing. Springer. ISBN 3-540-67754-2.
29. David A. Forsyth and Jean Ponce (2003). Computer Vision, A Modern Approach. Prentice Hall. ISBN 0-13-085198-1.
30. Richard Hartley and Andrew Zisserman (2003). Multiple View Geometry in Computer Vision. Cambridge University Press. ISBN 0-521-54051-8.
31. Gérard Medioni and Sing Bing Kang (2004). Emerging Topics in Computer Vision. Prentice Hall. ISBN 0-13-101366-1.
32. Tim Morris (2004). Computer Vision and Image Processing. Palgrave Macmillan. ISBN 0-333-99451-5.
33. E. Roy Davies (2005). Machine Vision: Theory, Algorithms, Practicalities. Morgan Kaufmann. ISBN 0-12-206093-8.
34. R. Fisher, K. Dawson-Howe, A. Fitzgibbon, C. Robertson, E. Trucco (2005). Dictionary of Computer Vision and Image Processing. John Wiley. ISBN 0-470-01526-8.
35. Nikos Paragios, Yunmei Chen and Olivier Faugeras (2005). Handbook of Mathematical Models in Computer Vision. Springer. ISBN 0-387-26371-3.
36. Wilhelm Burger and Mark J. Burge (2007). Digital Image Processing: An Algorithmic Approach Using Java. Springer. ISBN 1846283795 and ISBN 3540309403. http://www.imagingbook.com/
37. Pedram Azad, Tilo Gockel, Rüdiger Dillmann (2008). Computer Vision - Principles and Practice. Elektor International Media BV. ISBN 0905705718. http://ivt.sourceforge.net/book.html
