
Not so long ago, future-minded people enjoyed making a sharp distinction between "reality" (the often tiresome, problematic world around us) and "cyberspace" (the promising new "virtual" world inside computers). Back in 1996, when the Web was still a novelty and relatively few people were online, John Perry Barlow famously declared that cyberspace is "a civilization of the Mind." Some years down the line, it's clear that black-and-white distinctions between real and virtual were wide of the mark. What most people actually want from computers is much more pragmatic: intuitively easy-to-use technology that enhances their busy, everyday, real lives, whether that means finding a decent coffee shop, making shopping less of a chore, or meeting new friends who share their interests. Now that computers are smaller and more portable than ever, and you can go online almost anywhere on the planet, using online information to enhance ("augment") real life is where the smart money is heading. Augmented reality, as it is known, is something we'll be hearing a great deal more about in the next few years.

Definition

Augmented reality (AR) is a live, direct or indirect view of a physical, real-world environment whose elements are augmented by computer-generated sensory input such as sound, video, graphics or GPS data. The technology thus works by enhancing one's current perception of reality. AR is a variation of Virtual Environments (VE), more commonly called Virtual Reality (VR). VE technologies completely immerse a user inside a synthetic environment; while immersed, the user cannot see the real world around him. In contrast, AR allows the user to see the real world, with virtual objects superimposed upon or composited with it. AR therefore supplements reality rather than completely replacing it. Ideally, it would appear to the user that the virtual and real objects coexist in the same space. Figure 1 shows an example of what this might look like: an augmented reality map on the iPhone. Users navigate through an AR interface that overlays all kinds of data relevant to their immediate location, as well as turn-by-turn directions from one location to another. AR can be thought of as the "middle ground" between VE (completely synthetic) and telepresence (completely real).
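For virtual and real objects to appear to coexist, an AR renderer must project each virtual object's 3D position into the live camera image. The sketch below is a minimal illustration of that idea using a simple pinhole camera model, not the code of any particular AR system; the 525-pixel focal length and the point coordinates are assumptions chosen for the example.

```python
def project_point(point_cam, focal_px, cx, cy):
    """Project a 3D point in camera coordinates (metres) onto the image
    plane of a pinhole camera. Returns pixel coordinates (u, v), or None
    if the point is behind the camera and cannot be drawn."""
    x, y, z = point_cam
    if z <= 0:
        return None
    u = cx + focal_px * x / z
    v = cy + focal_px * y / z
    return (u, v)

# A virtual object 2 m in front of the camera and 0.5 m to the right,
# drawn on a 640 x 480 image whose principal point is (320, 240).
print(project_point((0.5, 0.0, 2.0), 525.0, 320, 240))  # -> (451.25, 240.0)
```

Tracking (discussed below) amounts to keeping this projection consistent as the camera moves, so the virtual object stays pinned to its real-world location.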

Fig. 1

In 1997, Ronald Azuma identified three common characteristics of AR scenes: they combine the real and the virtual, they are interactive in real time, and they are registered in 3D.

Real-life examples of AR:

You're watching a tennis grand slam on TV and there's a controversial call from one of the line judges. Was that serve in or out? The TV station runs an instant replay with a computer animation showing the exact trajectory of the ball and where it landed just outside the line. Then a little table comes up on the screen showing how many serves have been in or out for each player and how the figures have changed over the course of the match.

You're a fighter pilot flying over a warzone with anti-aircraft fire shooting up at you. You really have to concentrate, and looking down at all the gauges on your instrument panel is a distraction you can do without. Fortunately, you're wearing what's called a heads-up display (HUD), a set of goggles with built-in, miniaturized computers that automatically project instrument readings so they "float" in front of your eyes. You can find out everything you need to know without taking your eyes off the sky.

You're walking the streets of London, England and you suddenly come across an amazing bit of architecture. What is this fantastic building? Who was the architect? Is that really titanium? You're dying to find out more, but the building is closed and there's no information about it at all. So you hold your cell phone up and take a quick photo. The phone uses its built-in GPS (satellite navigation) system to figure out roughly where you are, then quickly searches Google Images to find similar photos taken in the same neighborhood. In a couple of seconds, it has identified the building and brought up a Wikipedia page telling you all about it.

Cell phone applications such as Layar, Wikitude, and Yelp bring AR to everyday life. With Layar, you simply look through your cell phone camera at the world in front of you and see layers of extra information, like transparent webpages, added on top. Wikitude offers an application called World Browser, which overlays useful information about landmarks and other points of interest seen through a camera phone. Yelp gives you instant access to reviews of shops, restaurants, hotels, and so on.
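Under the hood, location-based browsers of this kind need at least the distance and compass bearing from the user's GPS fix to each point of interest in order to decide where to draw its marker. A rough sketch of that calculation, assuming a spherical Earth (the coordinates in the example are illustrative):

```python
import math

def distance_and_bearing(lat1, lon1, lat2, lon2):
    """Great-circle distance (metres) and initial bearing (degrees, clockwise
    from north) from the user's GPS fix to a point of interest, using the
    haversine formula on a spherical Earth of mean radius 6371 km."""
    R = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    dist = 2 * R * math.asin(math.sqrt(a))
    y = math.sin(dl) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dl)
    bearing = (math.degrees(math.atan2(y, x)) + 360) % 360
    return dist, bearing

# How far away is the point of interest, and which way should the marker point?
d, b = distance_and_bearing(51.5007, -0.1246, 51.5014, -0.1419)
```

Combined with the phone's compass heading, the bearing tells the app which part of the camera view the marker belongs in, and the distance lets it hide points of interest that are too far away to matter.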

Hardware

The main hardware components for augmented reality are a processor, display, sensors and input devices. These elements, specifically the CPU, display, camera and MEMS sensors such as the accelerometer, GPS and solid-state compass, are often present in modern smartphones, which makes them promising AR platforms.

Display
- Head-mounted (optical, video)
- Handheld or screen-based
- Spatial (projected)
- Retinal display

Tracking

Tracking allows augmented reality applications to properly render the virtual components of a scene as the user's point of view shifts. Virtual objects should not follow the user's gaze around a scene, unless that is the intent of the designers. Likewise, virtual objects should not tilt if the user tilts his or her head. Tracking data is used by the application to make sure that the virtual and real components of a scene align properly regardless of the position of the user's head and the direction of gaze. Tracking can be achieved with a wide variety of technologies based on different physical principles; mechanical, magnetic, acoustic and optical approaches are commonly used.
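As a toy illustration of how tracking systems combine sensors, the complementary filter below fuses a gyroscope reading (accurate over short intervals but prone to drift) with the tilt angle derived from an accelerometer's gravity vector (noisy but drift-free). This is a generic textbook technique, not the algorithm of any specific AR product, and the blend factor `alpha` is an assumed typical value.

```python
def complementary_filter(angle, gyro_rate, accel_angle, dt, alpha=0.98):
    """One update step of a complementary filter for a single tilt angle:
    trust the integrated gyroscope rate in the short term (low drift over
    one step) and the accelerometer-derived angle in the long term."""
    return alpha * (angle + gyro_rate * dt) + (1 - alpha) * accel_angle

# With the gyroscope reporting no rotation and the accelerometer steadily
# reading 10 degrees, the estimate converges toward the accelerometer angle.
angle = 0.0
for _ in range(200):
    angle = complementary_filter(angle, gyro_rate=0.0, accel_angle=10.0, dt=0.01)
```

Real systems track full 3D orientation (often with quaternions and Kalman filtering), but the principle of blending fast, drifting sensors with slow, absolute ones is the same.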

Not all augmented reality applications require precise tracking. Popular modern applications, such as those that run on cell phones, need only be concerned with aligning virtual content with the camera's view of the scene. Modern mobile augmented reality systems use one or more of the following tracking technologies: digital cameras and/or other optical sensors, accelerometers, GPS, gyroscopes, solid-state compasses, RFID and wireless sensors.

Input devices

Techniques include the pinch glove, a wand with a button and a smartphone that signals its position and orientation from camera images.

Computer

The computer analyzes the sensed visual and other data to synthesize and position augmentations.

Software

Early augmented reality researchers had to write all of their software from scratch. In recent years, frameworks for augmented reality software development have emerged, making it possible to create robust applications in a fraction of the time and with far fewer bugs and errors than the early adopters faced. The most popular frameworks in augmented reality software development are:
- ARToolKit
- D'Fusion
- BuildAR

In order for AR to reach its full potential, several technological advancements are required:

1. Markerless object recognition needs to mature: Most AR applications currently use markers that look like small barcodes, requiring users to either print the markers themselves or attach them to objects, or else limiting users to interacting only with objects that already have markers on them. Markerless AR enables applications to recognize objects without markers and thus significantly improves the user experience. Microsoft's Kinect is the best-known and most successful commercial example of markerless AR, and startups like Organic Motion have created very impressive prototypes for creating real-time AR avatars.

2. Image processing needs to become faster and more accurate: Currently, it is fairly easy for a computer to recognize a human face in a frontal portrait shot and distinguish it from a very different object, say a car or an apple. However, it is much harder for a computer to tell the difference between similar-looking people, especially if they aren't facing the camera or are wearing accessories. The same difficulties apply to detecting buildings and other landmarks. To correctly identify objects, computers need to build a 3D model of the world to understand how objects look from every possible angle, a very complex and resource-intensive operation.

3. Mobile hardware needs to become more powerful: Truly immersive AR requires computers to replace objects with matching 3D renderings of virtual objects in real time. Mobile hardware is not powerful enough for this yet, but will be soon. For example, the iPhone 4 runs on an Apple A4 processor at approximately 1 GHz, similar to the speed of common desktop computers from 7 years ago. Mobile devices are likely to reach the processing power of today's desktops, which is sufficient for advanced AR applications, in another 2-3 years.

Applications

Augmented reality has many applications, and many areas can benefit from the use of AR technology. At the beginning AR had a military, industrial and medical focus, but it was soon introduced for commercial and entertainment use.

Personal information system - A wearable gestural interface that presents information by means of a tiny projector and a camera mounted on a collar worn by the user.

Personal assistance - A Personal Awareness Assistant; speech recognition also provides a natural interface to retrieve information that was recorded earlier.
Advertisement - On a larger scale, AR techniques for augmenting deformable surfaces such as cups and shirts, as well as whole environments, present direct marketing agencies with many opportunities: offering coupons to passing pedestrians, placing virtual billboards, showing virtual prototypes, etc.

Navigation - Augmenting a video stream from a handheld camera, using fiducial markers for position tracking, for indoor navigation.

Touring - Using AR to create situated documentaries about historic events, or reconstructing a cultural heritage site so that visitors and students can view and learn about ancient architecture and customs.

Industrial (design) - Designing a workspace that allows, for instance, visualization and modification of car body curvature and engine layout. The Clear and Present Car project created a simulation in which one can open the door of a virtual concept car and experience the interior, dashboard layout and interface design for usability testing.

Industrial (assembly) - BMW used AR to improve welding processes on their cars. Volkswagen uses AR in construction to analyze interfering edges, plan production lines and workshops, compare variance and verify parts of a car.

Industrial (maintenance) - Complex machinery or structures require a lot of skill from maintenance personnel, and AR is proving useful in this area, for instance in providing x-ray vision or automatically probing the environment with extra sensors to direct the user's attention to problem sites. Ford uses AR for electrical troubleshooting of vehicles and to assist its technicians with vehicle history and repair information.

Medical applications - Nurses and doctors could benefit from important information being delivered directly to their glasses, as in laparoscopic surgery, where an overlaid view from the laparoscope inserted through small incisions is simulated. A C-arm x-ray machine can automatically calibrate the cameras with the machine and register the x-ray imagery with the real objects. Video see-through HMDs can overlay MR scans on heads and provide views of tool manipulation hidden beneath tissue and surfaces. Other examples include intraoperative augmented reality applied to laparoscopic right adrenalectomy, and augmented patient dummies with haptic feedback that invoke the same behavior by specialists as real patients.

Advertisement - AR can also serve advertisers to show virtual ads and product placements.
Sports broadcasting - Swimming pools, football fields, race tracks and other sports environments are well known and easily prepared, which makes video see-through augmentation through tracked camera feeds easy. AR is also applied to annotate racing cars, snooker ball trajectories, live swimmer performances, etc.

Office - Public management in crisis situations, urban planning, etc.

Combat and simulation - Satellite navigation, heads-up displays for pilots, navigational support, communications enhancement, repair and maintenance, and emergency medicine. Military training in large-scale combat scenarios and simulating real-time enemy action, as in the Battlefield Augmented Reality System (BARS).

Military (aircraft) - Using AR, it is possible to visualize installations that are vital for airworthiness behind a damaged external skin, enabling an informed decision about safe take-off.

Museums - Using augmented reality to create a virtual view of the real world that can be extended with graphics and other content. When visitors point their smartphones at markers throughout the museum, the dinosaurs come to life. In the Science Museum in London's "Making of the Modern World" exhibit, augmented reality also takes up the mantle of education.

Book stores - Finding books and getting suggestions.

Architecture - AR makes models of buildings come alive; no more miniature models of buildings required.

Fashion and apparel - Like the outfit of the lady walking past you on the street? No longer will you be left wondering for the rest of the day just where she got that stylish trench coat. Using augmented reality, users will hopefully one day be able to scan and purchase anything they see. This concept can similarly be applied to other product categories, including automotive, CPG and more.

Toys and games - Lego's Digital Box display units use augmented reality to illustrate what finished Lego kits will look like. All a customer has to do is hold the product box up in front of an in-store kiosk display, and the screen shows a 360-degree view of the finished product.

Food and drink - Stella Artois' Le Bar Guide is an iPhone app which combines geo-location with augmented reality. Augmented reality markers show up on your iPhone pointing you to the nearest bar that serves Stella Artois beer. This concept of AR map markers could be applied to anything which can be located: restaurants, cafes, retail shops, taxi cabs, etc.
Police patrolling:
- Real-time language translation along with data on cultural customs and traditions
- Real-time intelligence about crimes and criminals in the patrol area
- Facial, voice-print and other biometric recognition data of known criminals to allow instantaneous identification
- Integration of chemical, biological and explosive sensors to immediately notify officers of any local contamination and recommend appropriate protective measures for themselves and the public
- Scalable, three-dimensional maps, complete with building floor plans, sewer system schematics, public utility information and public transportation routes, accessed at will to improve situational awareness and response to problems
- Patrol car operator data and regional traffic management information on a heads-up display to make driving safer and more efficient, especially in pursuit and rapid response situations

SWAT operations:
- Improved situational awareness during dynamic and dangerous incidents, enhanced cohesiveness between team members, and better coordination with command personnel to make SWAT operations safer and more effective
- In tactical situations, modulating the audio effects of gunshots (both hostile and friendly) to enhance concentration while providing the user with superior hearing over long distances
- Advanced optics providing zoom, thermal and infrared imaging for the location and apprehension of fleeing criminals, buried or concealed disaster survivors, or missing persons
- Identification Friend or Foe (IFF) technology, worn by every police officer, to reduce or eliminate friendly-fire casualties by visually, audibly and/or haptically highlighting fellow police officers both on and off duty
- Human-machine interfaces that extend human capabilities and presence to remote locations

Criminal investigation:
- Enhanced ability to gather information, follow leads and visualize large amounts of data in real time to solve crimes and more quickly identify and capture dangerous criminals and terrorists
- Speaker recognition to let investigators accurately match voices against known criminals
- Advanced optics that allow investigators to lip-read from great distances in situations where listening devices are impractical
- AR video, audio and sensing devices used to visualize blood patterns, blood stains and other sensor-detectable forensic data available at crime scenes
- Automatic sensor readings that calculate distance and height and directly create digital and AR maps for court presentation
- The coordinated use of robots, unmanned aerial vehicles (UAVs) and police officers managed through an AR network to enhance surveillance activities

Training:
- Realistic training scenarios that simulate dangerous police environments while blending real-world equipment and fellow trainees into the scenario

Supervision:
- Real-time monitoring of patrol activities through a video/audio feed from the street
- Display of location, activity and status information projected on a three-dimensional map of the community
- Supervision of critical incident response, including monitoring the physiological status of all personnel, permitting the assignment of dangerous tasks to those who are mentally and physically best able to perform them
- Coordination of widely dispersed units through visual, audible and haptic cues from the supervisor
Therapy: The following activities involved in therapy can make use of AR: creative visualisation (also known as guided imagery), art (drawing and painting), therapeutic storytelling, and sand tray/sand worlds.

AR gaming: AR games could be as physical as real-world games, with the added advantage that game content can be injected seamlessly into the real world. Current AR equipment is usually awkward and cabled; as the technology improves and becomes more of a consumer commodity, this will most likely cease to be a problem. Today's AR hardware cannot affect players physically, but there may be solutions in haptic feedback and other emerging technologies.

In November 2010, Microsoft launched worldwide its new hardware device for the Xbox 360: the Kinect. The Kinect is a motion-sensing input device for the Xbox 360 video game console and Windows PCs in which the player acts as the controller; in other words, it is a controller-free gaming device. Built around a webcam-style add-on peripheral for the Xbox 360 console, it enables users to control and interact with the Xbox 360 without touching a game controller, through a natural user interface using gestures and spoken commands. After selling a total of 8 million units in its first 60 days, the Kinect holds the Guinness World Record for being the "fastest-selling consumer electronics device".

Microsoft Xbox 360 Kinect technical specification (2010)

Sensor & Camera

- Color and depth sensing lenses
- Color VGA motion camera: 640 x 480 pixel resolution @ 30 FPS
- Depth camera: 640 x 480 pixel resolution @ 30 FPS
- Voice microphone array supporting single-speaker voice recognition
- Tilt motor for sensor adjustment

Field of view

Horizontal field of view - 57 degrees
Vertical field of view - 43 degrees

Physical tilt range - (-27 degrees to +27 degrees)

Depth sensor range - 1.2 m to 3.5 m
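The field of view and depth range above determine the physical area the sensor can actually see. A quick back-of-the-envelope calculation of the horizontal coverage at the near and far ends of the depth range:

```python
import math

def view_extent(fov_deg, distance_m):
    """Width (or height) of the sensor's view at a given distance, for a
    given angular field of view: extent = 2 * d * tan(fov / 2)."""
    return 2 * distance_m * math.tan(math.radians(fov_deg) / 2)

# Horizontal coverage using the Kinect's 57-degree horizontal field of view.
near = view_extent(57, 1.2)  # about 1.3 m wide at the 1.2 m near limit
far = view_extent(57, 3.5)   # about 3.8 m wide at the 3.5 m far limit
```

This is why a single Kinect can comfortably cover a living room at typical play distances, and why play space recommendations matter: too close, and a player's arms leave the frame.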

Data Streams

320 x 240 16-bit depth @ 30 frames/sec
640 x 480 32-bit colour @ 30 frames/sec
16-bit audio @ 16 kHz
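These stream formats imply a substantial uncompressed data rate, which helps explain the dedicated USB 2.0 bus in the system requirements below. The arithmetic here is computed directly from the listed formats; the actual on-the-wire encoding may pack the data differently.

```python
# Approximate uncompressed bandwidth of the Kinect's data streams.
depth_bps = 320 * 240 * 2 * 30   # 16-bit (2-byte) depth pixels at 30 fps
color_bps = 640 * 480 * 4 * 30   # 32-bit (4-byte) colour pixels at 30 fps
audio_bps = 16000 * 2            # 16-bit samples at 16 kHz

total_mb_per_s = (depth_bps + color_bps + audio_bps) / 1e6
print(total_mb_per_s)  # roughly 41.5 MB/s, most of it from the colour stream
```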

Skeletal tracking system


Tracks up to 6 people, including 2 active players
Tracks 20 joints per active player
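Twenty tracked joints are enough to derive useful body-pose quantities directly. As a small, hypothetical example (the joint positions are invented for illustration), the angle at a joint can be computed from three tracked 3D positions, e.g. shoulder, elbow and wrist for the elbow angle:

```python
import math

def joint_angle(a, b, c):
    """Angle at joint b, in degrees, formed by the segments b->a and b->c.
    Each joint is an (x, y, z) position such as skeletal tracking reports."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(v1[i] * v2[i] for i in range(3))
    n1 = math.sqrt(sum(x * x for x in v1))
    n2 = math.sqrt(sum(x * x for x in v2))
    cos_t = max(-1.0, min(1.0, dot / (n1 * n2)))  # clamp for float safety
    return math.degrees(math.acos(cos_t))

# A straight arm (shoulder, elbow, wrist collinear) gives 180 degrees.
straight = joint_angle((0.0, 0.0, 0.0), (0.3, 0.0, 0.0), (0.6, 0.0, 0.0))
```

Quantities like this are the raw material for gesture recognition and for the exercise and rehabilitation applications built on Kinect.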

This is a PrimeSense diagram explaining how their reference platform works. The Kinect is the first (and only) implementation of this platform. One camera and one IR transmitter provide input for the depth map (rumored to be just 320 x 240), while the third camera detects the human visual spectrum at 640 x 480 resolution.

Motion capture, motion tracking, or mocap are terms used to describe the process of recording the movement of one or more objects or persons. It is used in military, entertainment, sports and medical applications, and for validation of computer vision and robotics. In filmmaking and games, it refers to recording the actions of human actors and using that information to animate digital character models in 2D or 3D computer animation. When it includes face and fingers or captures subtle expressions, it is often referred to as performance capture. The Kinect features an "RGB camera, depth sensor and multi-array microphone running proprietary software", which together provide full-body 3D motion capture, facial recognition and voice recognition capabilities.

Kinect for Windows

On February 21, 2011 Microsoft announced that it would release a non-commercial Kinect software development kit (SDK) for Windows in spring 2011; it was released for Windows 7 on June 16, 2011 in 12 countries. The SDK includes Windows 7-compatible PC drivers for the Kinect device. It lets developers build applications with C++, C#, or Visual Basic by using Microsoft Visual Studio 2010 and includes the following features:

Raw sensor streams - Access to low-level streams from the depth sensor, color camera sensor, and four-element microphone array.

Skeletal tracking - The capability to track the skeleton image of one or two people moving within the Kinect field of view for gesture-driven applications.

Advanced audio capabilities - Audio processing capabilities include sophisticated acoustic noise suppression and echo cancellation, beam formation to identify the current sound source, and integration with the Windows speech recognition API.

Kinect for Windows 1.5

In March 2012, Microsoft announced that the next version of Kinect for Windows would be available in May 2012. Kinect for Windows 1.5 was released on May 21, 2012. It adds new features, support for several new languages, and availability in 19 more countries.

The Kinect for Windows 1.5 SDK includes 'Kinect Studio', a new app that allows developers to record, play back, and debug clips of users interacting with applications. It supports a new "seated" or "10-joint" skeletal system that lets apps track the head, neck, and arms of a Kinect user, whether they are sitting down or standing, and which works in default and near mode. It adds support for four new languages for speech recognition: French, Spanish, Italian, and Japanese, along with regional dialects of these languages and of English. It became available in Hong Kong, South Korea, and Taiwan in May and in Austria, Belgium, Brazil, Denmark, Finland, India, the Netherlands, Norway, Portugal, Russia, Saudi Arabia, Singapore, South Africa, Sweden, Switzerland and the United Arab Emirates in June.

Software

Requiring at least 190 MB of available storage space, the Kinect system software allows users to operate the Xbox 360 Dashboard console user interface through voice commands and hand gestures. Techniques such as voice recognition and facial recognition are employed to automatically identify users. Among the applications for Kinect is Video Kinect, which enables voice chat or video chat with other Xbox 360 users or users of Windows Live Messenger. The application can use Kinect's tracking functionality and the Kinect sensor's motorized pivot to keep users in frame even as they move around. Kinect for Windows consists of the Kinect for Windows hardware and the Kinect for Windows SDK, which supports applications built with C++, C#, or Visual Basic by using Microsoft Visual Studio 2010. The Kinect for Windows SDK version 1.5 offers seated skeletal and facial tracking, new tools, and advanced speech recognition capabilities. Also available is a Kinect for Windows Developer Toolkit to help speed up development of applications using the Kinect natural user interface.
New in the May 2012 SDK release

Seated-mode skeletal tracking - Provides the ability to track a user's upper body (10 joints) and ignore the lower body when it is not visible or not relevant to the application. In addition, it enables identification of a user sitting on a chair, couch, or other inanimate object.

Improved skeletal tracking - In near range, users who are seated or standing can be tracked within 40 cm (16 inches) of the sensor. The skeletal tracking engine is now faster, making better use of the CPU and scaling of computer resources. In addition, newly added joint orientation information for skeletons is ideal for avatar animation scenarios and simple pose detection.

Face tracking capabilities - Makes it possible to fit a 3-D mesh to users' faces and track their facial features and head position in real time by using components from the Developer Toolkit.

Developer toolkit - Provides components, libraries, tools, samples, Kinect Studio, and other resources to make it easier to develop applications using the Kinect for Windows SDK.
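Skeletal joint data makes simple pose detection straightforward. The sketch below is deliberately trivial and uses invented joint names (the real SDK exposes its own joint enumeration through its C++/C# API); it tests whether both hands of a seated-mode skeleton are raised above the head.

```python
def hands_raised(joints):
    """Trivial pose test over a seated-mode (10-joint) skeleton: are both
    hands above the head? `joints` maps joint names to (x, y, z) positions
    in metres with y pointing up; the names here are illustrative only."""
    head_y = joints["head"][1]
    return joints["hand_left"][1] > head_y and joints["hand_right"][1] > head_y

# A hypothetical frame of joint data: left hand raised, right hand lowered.
frame = {
    "head": (0.0, 0.60, 2.0),
    "hand_left": (-0.3, 0.75, 1.9),
    "hand_right": (0.3, 0.40, 1.9),
}
```

Production gesture recognizers add smoothing, per-joint confidence checks and timing constraints, but many shipped Kinect interactions bottom out in threshold tests much like this one.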

Kinect Studio - Simplifies the review and testing of applications by recording, playing back, and debugging Kinect data.

Speech recognition options - In addition to English, the SDK enables French, Spanish, Italian, and Japanese speech recognition and offers language packs to help you develop applications that recognize the way a language is spoken in different regions: English/Great Britain, English/Ireland, English/Australia, English/New Zealand, English/Canada, French/France, French/Canada, Italian/Italy, Japanese/Japan, Spanish/Spain, and Spanish/Mexico.

To use a Kinect for Windows sensor you will need a PC with the following:

Windows 7, Windows 8 Consumer Preview, Windows Embedded Standard 7, or Windows Embedded POSReady 7.

32-bit (x86) or 64-bit (x64) processor

Dual-core 2.66-GHz or faster processor
Dedicated USB 2.0 bus
2 GB RAM
