
A Seminar report

On

ARTIFICIAL PASSENGER

ABSTRACT
An artificial passenger (AP) is a device that would be used in a motor vehicle to make sure that the driver stays awake. IBM has developed a prototype that holds a conversation with a driver, telling jokes and asking questions intended to determine whether the driver can respond alertly enough. Under the IBM approach, an artificial passenger would use a microphone for the driver, a speech generator, and the vehicle's audio speakers to converse with the driver. The conversation would be based on a personalized profile of the driver. A camera could be used to evaluate the driver's "facial state" and a voice analyzer to evaluate whether the driver was becoming drowsy. If a driver seemed to display too much fatigue, the artificial passenger might be programmed to open all the windows, sound a buzzer, increase background music volume, or even spray the driver with ice water.

One of the ways to address driver safety concerns is to develop an efficient system that relies on voice instead of hands to control Telematics devices. One of the ways to reduce a driver's cognitive workload is to allow the driver to speak naturally when interacting with a car system (e.g. when playing voice games or issuing commands via voice). It is difficult for a driver to remember a specific command syntax, such as "What is the distance to JFK?", "How far is JFK?" or "How long to drive to JFK?". This fact led to the development of Conversational Interactivity for Telematics (CIT) speech systems at IBM Research. CIT speech systems can significantly improve the driver-vehicle relationship and contribute to driving safety.

However, the development of full-fledged Natural Language Understanding (NLU) for CIT is a difficult problem that typically requires significant computing resources, which are usually not available in the local processors that car manufacturers provide for their cars. To address this, NLU components should either be located on a server that cars access remotely, or NLU should be downsized to run on local computing devices (which are typically based on embedded chips). Some car manufacturers see advantages in using upgraded NLU and speech processing on the client in the car, since remote connections to servers are not available everywhere, can have delays, and are not robust. A "quasi-NLU" component, a "reduced" variant of NLU that can run on CPU systems with relatively limited resources, is being developed.

INTRODUCTION
Studies of road safety found that human error was the sole cause in more than half of all accidents. One of the reasons why humans commit so many errors lies in the inherent limitation of human information processing. With the increase in popularity of Telematics services in cars (like navigation, cellular telephone and internet access), there is more information that drivers need to process and more devices that drivers need to control, which might contribute to additional driving errors.

ARTIFICIAL PASSENGER OVERVIEW
The AP is an artificial-intelligence-based companion that will be resident in software and chips embedded in the automobile dashboard. The heart of the system is a conversation planner that holds a profile of you, including details of your interests and profession. When activated, the AP uses the profile to cook up provocative questions such as "Who was the first person you dated?" via a speech generator and in-car speakers.

A microphone picks up your answer and breaks it down into separate words with speech-recognition software. A camera built into the dashboard also tracks your lip movements to improve the accuracy of the speech recognition. A voice analyzer then looks for signs of tiredness by checking to see if the answer matches your profile. Slow responses and a lack of intonation are signs of fatigue. If you reply quickly and clearly, the system judges you to be alert and tells the conversation planner to continue the line of questioning. If your response is slow or doesn't make sense, the voice analyzer assumes you are dropping off and acts to get your attention.

The system, according to its inventors, does not go through a suite of rote questions demanding rote answers. Rather, it knows your tastes and will even, if you wish, make certain you never miss Paul Harvey again. This is from the patent application: "An even further object of the present invention is to provide a natural dialog car system that understands content of tapes, books, and radio programs and extracts and reproduces appropriate phrases from those materials while it is talking with a driver. For example, a system can find out if someone is singing on a channel of a radio station. The system will state, 'And now you will hear a wonderful song!' or detect that there is news and state, 'Do you know what happened now - hear the following' and play some news. The system also includes a recognition system to detect who is speaking over the radio and alert the driver if the person speaking is one the driver wishes to hear."

Just because you can express the rules of grammar in software doesn't mean a driver is going to use them. The AP is ready for that possibility: it provides for "a natural dialog car system directed to human factor engineering, for example, people using different strategies to talk (for instance, short vs. elaborate responses). In this manner, the individual is guided to talk in a certain way so as to make the system work, e.g., 'Sorry, I didn't get it. Could you say it briefly?' Here, the system defines a narrow topic of the user reply (answer or question) via an association of classes of relevant words via decision trees. The system builds a reply sentence asking what are most probable word sequences that could follow the user's reply."

WHY ARTIFICIAL PASSENGER?


IBM received a patent in May for a sleep prevention system for use in automobiles that is, according to the patent application, "capable of keeping a driver awake while driving during a long trip or one that extends into the late evening. The system carries on a conversation with the driver on various topics utilizing a natural dialog car system."

Additionally, the application said, "The natural dialog car system analyzes a driver's answer and the contents of the answer together with his voice patterns to determine if he is alert while driving. The system warns the driver or changes the topic of conversation if the system determines that the driver is about to fall asleep. The system may also detect whether a driver is affected by alcohol or drugs."

If the system thinks your attention is flagging, it might try to perk you up with a joke. Alternatively, the system might abruptly change radio stations for you, sound a buzzer, or summarily roll down the window. If those don't do the trick, the Artificial Passenger (AP) is ready with a more drastic measure: a spritz of icy water in your face.

FUNCTIONS OF ARTIFICIAL PASSENGER

 VOICE CONTROL INTERFACE


One of the ways to address driver safety concerns is to develop an efficient system that relies on voice instead of hands to control Telematics devices. It has been shown in various experiments that well-designed voice control interfaces can reduce a driver's distraction compared with manual control situations.

One of the ways to reduce a driver's cognitive workload is to allow the driver to speak naturally when interacting with a car system (e.g. when playing voice games or issuing commands via voice). It is difficult for a driver to remember a complex speech command menu (e.g. recalling a specific syntax, such as "What is the distance to JFK?", "How far is JFK?" or "How long to drive to JFK?").

This fact led to the development of Conversational Interactivity for Telematics (CIT) speech systems at IBM Research. CIT speech systems can significantly improve the driver-vehicle relationship and contribute to driving safety. However, the development of full-fledged Natural Language Understanding (NLU) for CIT is a difficult problem that typically requires significant computing resources, which are usually not available in the local processors that car manufacturers provide for their cars.

To address this, NLU components should either be located on a server that cars access remotely, or NLU should be downsized to run on local computing devices (which are typically based on embedded chips). Some car manufacturers see advantages in using upgraded NLU and speech processing on the client in the car, since remote connections to servers are not available everywhere, can have delays, and are not robust.

A "quasi-NLU" component, a "reduced" variant of NLU that can run on CPU systems with relatively limited resources, is being developed. In this approach, possible variants for speaking commands are kept in special grammar files (one file for each topic or application). When the system gets a voice response, it searches through the files, starting with the most relevant topic. If it finds an appropriate command in some file, it executes the command. Otherwise the system executes other options that are defined by a Dialog Manager (DM). The DM component is a rule-based sub-system that can interact with the car and external systems (such as weather forecast services, e-mail systems, telephone directories, etc.) and with the driver to reduce task complexity for the NLU system. The following are examples of conversations between a driver and the DM that illustrate some of the tasks that an advanced DM should be able to perform:

1. Ask questions (via a text-to-speech module) to resolve ambiguities:
- (Driver) Please, plot a course to Yorktown
- (DM) Within Massachusetts?
- (Driver) No, in New York
2. Fill in missing information and remove ambiguous references from context:
- (Driver) What is the weather forecast for today?
- (DM) Partly cloudy, 50% chance of rain
- (Driver) What about Ossining?
- (DM) Partly sunny, 10% chance of rain
(The DM assumes that the driver means Yorktown, NY, from the earlier conversational context. Also, when the driver asks the inexplicit question "What about Ossining?" it assumes that the driver is still asking about weather.)
3. Manage failure and provide contextual, failure-dependent help and actions:
- (Driver) When will we get there?
- (DM) Sorry, what did you say?
- (Driver) I asked when will we get there.

The problem of instantaneous data collection could be dealt with systematically by creating a learning transformation (LT) system.

Examples of LT tasks are as follows:
• Monitor driver and passenger actions in the car's internal and external environments across a network;
• Extract and record the data relevant to the Safety Driver Manager in databases;
• Generate and learn patterns from stored data;
• Learn from this data how Safety Driver Manager components and driver behavior could be improved and adjusted to improve Safety Driver Manager performance and improve driving safety.
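
The grammar-file lookup with Dialog Manager fallback described in this section can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the GRAMMARS dictionary stands in for the per-topic grammar files, the command strings are invented, and the single "what about ..." rule is only a toy version of the context handling shown in the conversation examples. It is not IBM's implementation.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical in-memory stand-ins for the per-topic "grammar files".
# Each topic maps an accepted phrasing to an (invented) command name.
GRAMMARS: Dict[str, Dict[str, str]] = {
    "navigation": {
        "what is the distance to jfk": "distance_to:JFK",
        "how far is jfk": "distance_to:JFK",
        "how long to drive to jfk": "drive_time_to:JFK",
    },
    "weather": {
        "what is the weather forecast for today": "forecast:today",
    },
}


def normalize(utterance: str) -> str:
    """Lower-case and drop punctuation so phrasing variants compare equal."""
    kept = "".join(ch for ch in utterance.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


@dataclass
class DialogManager:
    """Toy rule-based fallback that remembers the topic of the last command."""
    last_topic: str = ""

    def fallback(self, text: str) -> str:
        # Rule (assumption): "what about X" reuses the previous weather topic.
        prefix = "what about "
        if text.startswith(prefix) and self.last_topic == "weather":
            return "forecast:" + text[len(prefix):]
        # Otherwise ask the driver to repeat, as in the failure example.
        return "ask:Sorry, what did you say?"


def interpret(utterance: str, dm: DialogManager, topic_order: List[str]) -> str:
    """Search the grammar files, most relevant topic first, then fall back."""
    text = normalize(utterance)
    for topic in topic_order:
        command = GRAMMARS.get(topic, {}).get(text)
        if command is not None:
            dm.last_topic = topic
            return command
    return dm.fallback(text)
```

With `dm = DialogManager()` and the topic order `["navigation", "weather"]`, the utterance "How far is JFK?" resolves through the navigation grammar, while "What about Ossining?" asked right after a weather query is handled by the fallback rule. Exact-match lookup keeps the sketch small; a real grammar file would describe many more variants per command.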

 EMBEDDED SPEECH RECOGNITION


Car computers are usually not very powerful due to cost considerations. The growing necessity of the conversational interface demands significant advances in processing power on the one hand, and in speech and natural language technologies on the other. In particular, there is significant need for a low-resource speech recognition system that is robust, accurate, and efficient. One example is a low-resource system that is executed by a 50 DMIPS processor augmented by 1 MB or less of DRAM.

Logically, a speech system is divided into three primary modules: the front-end, the labeler and the decoder. When processing speech, the computational workload is divided approximately equally among these modules. The system uses the familiar phonetically-based, hidden Markov model (HMM) approach. The acoustic model comprises context-dependent sub-phone classes (allophones).

The context for a given phone is composed of only one phone to its left and one phone to its right. The allophones are identified by growing a decision tree using the context-tagged training feature vectors and specifying the terminal nodes of the tree as the relevant instances of these classes. Each allophone is modeled by a single-state Hidden Markov Model with a self loop and a forward transition.

FIG: Embedded speech recognition indicator

The decoder implements a synchronous Viterbi search over its active vocabulary, which may be changed dynamically. Words are represented as sequences of context-dependent phonemes, with each phoneme modeled as a three-state HMM. The observation densities associated with each HMM state are conditioned upon one phone of left context and one phone of right context only.

A discriminative training procedure was applied to estimate the parameters of these phones. MMI training attempts to simultaneously (i) maximize the likelihood of the training data given the sequence of models corresponding to the correct transcription, and (ii) minimize the likelihood of the training data given all possible sequences of models allowed by the grammar describing the task.

In 2001, speech evaluation experiments yielded relative improvements of 20% to 40%, depending on testing conditions (e.g. a 7.6% error rate at 0 mph and 10.1% at 60 mph).
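
The Viterbi search mentioned above can be illustrated with a short Python sketch. This is the textbook algorithm applied to a toy discrete HMM, not the embedded decoder itself. The two states, the symbols "x" and "y", and all probabilities are assumptions chosen for illustration. The loop handles one observation at a time, which is the sense in which the search is time-synchronous.

```python
import math


def lg(p):
    """Natural log that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return (best_log_prob, best_state_path) for a discrete HMM."""
    # best[s] = log-probability of the best path that ends in state s
    best = {s: log_start[s] + log_emit[s][observations[0]] for s in states}
    backpointers = []
    for obs in observations[1:]:
        prev_best, best, pointers = best, {}, {}
        for s in states:
            # Pick the predecessor that maximizes the path score into s.
            p = max(states, key=lambda r: prev_best[r] + log_trans[r][s])
            best[s] = prev_best[p] + log_trans[p][s] + log_emit[s][obs]
            pointers[s] = p
        backpointers.append(pointers)
    # Trace back from the best final state to recover the path.
    last = max(states, key=lambda s: best[s])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return best[last], path


# Toy left-to-right model: each state has a self loop and a forward transition.
STATES = ["a", "b"]
LOG_START = {"a": lg(1.0), "b": lg(0.0)}
LOG_TRANS = {"a": {"a": lg(0.5), "b": lg(0.5)},
             "b": {"a": lg(0.0), "b": lg(1.0)}}
LOG_EMIT = {"a": {"x": lg(0.9), "y": lg(0.1)},
            "b": {"x": lg(0.2), "y": lg(0.8)}}

score, path = viterbi(["x", "x", "y", "y"], STATES, LOG_START, LOG_TRANS, LOG_EMIT)
```

For the observation sequence x, x, y, y the best path stays in state "a" for two frames and then moves to "b". Working in log probabilities avoids numerical underflow on long utterances. A real decoder would also prune unlikely paths and score continuous feature vectors rather than discrete symbols.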

FIG: Embedded speech recognition device

 DRIVER DROWSINESS PREVENTION
Fatigue causes more than 240,000 vehicular accidents every year. Currently, drivers who are alone in a vehicle have access only to media such as music and radio news, which they listen to passively. Often these do not provide sufficient stimulation to assure wakefulness. Ideally, drivers should be presented with external stimuli that are interactive to improve their alertness.

Driving, however, occupies the driver's eyes and hands, thereby limiting most current interactive options. Among the efforts presented in this general direction, the invention suggests fighting drowsiness by detecting it via speech biometrics and, if needed, by increasing arousal via speech interactivity.

It is a common experience for drivers to talk to other people while they are driving to keep themselves awake. The purpose of the Artificial Passenger part of the CIT project at IBM is to provide a higher level of interaction with a driver than current media, such as CD players or radio stations, can offer. This is envisioned as a series of interactive modules within Artificial Passenger that increase driver awareness and help to determine if the driver is losing focus. This can include both conversational dialog and interactive games, using voice only. The scenarios for Artificial Passenger currently include quiz games, reading jokes, asking questions, and interactive books.

In the Artificial Passenger (ArtPas) paradigm, the awareness state of the driver will be monitored, and the content will be modified accordingly. Drivers evidencing fatigue, for example, will be presented with more stimulating content than drivers who appear to be alert. This could enhance the driver experience, and may contribute to safety.

 WORKLOAD MANAGER
In this section, a brief analysis is provided of the design of the workload manager, which is a key component of the Safety Driver Manager. An object of the workload manager is to determine a moment-to-moment analysis of the user's cognitive workload. It accomplishes this by collecting data about user conditions, monitoring local and remote events, and prioritizing message delivery. There is rapid growth in the use of sensory technology in cars. These sensors allow for the monitoring of driver actions (e.g. application of brakes, changing lanes), provide information about local events (e.g. heavy rain), and provide information about driver characteristics (e.g. speaking speed, eyelid status). There is also a growing amount of distracting information that may be presented to the driver (e.g. phone rings, radio, music, e-mail etc.) and a growing number of actions that a driver can perform in cars via voice control.
FIG: Condition sensors device

The relationship between a driver and a car should be consistent with the information from sensors. The workload manager should be designed in such a way that it can integrate sensor information and rules on when and if distracting information is delivered. This can be designed as a "workload representational surface". One axis of the surface would represent stress on the vehicle and another, orthogonally distinct axis would represent stress on the driver. Values on each axis could conceivably run from zero to one. Maximum load would be represented by the position where there is both maximum vehicle stress and maximum driver stress, beyond which there would be "overload".

The workload manager is closely related to the event manager, which detects when to trigger actions and/or make decisions about potential actions. The system uses a set of rules for starting and stopping the interactions (or interventions). It controls interruption of a dialog between the driver and the car dashboard (for example, interrupting a conversation to deliver an urgent message about traffic conditions on an expected driver route). It can use answers from the driver and/or data from the workload manager relating to driver conditions, such as how often the driver answered correctly and the length of delays in answers. It interprets the status of a driver's alertness based on his/her answers as well as on information from the workload manager. It will make decisions on whether the driver needs additional stimuli, on what types of stimuli should be provided (e.g. verbal stimuli via speech applications or physical stimuli such as a bright light or a loud noise), and on whether to suggest that the driver stop for rest.

The system permits the use and testing of different statistical models for interpreting driver answers and information about driver conditions. The driver workload manager is connected to a driving risk evaluator that is an important component of the Safety Driver Manager.
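
The two ideas above, the workload representational surface and the interpretation of driver answers, can be illustrated with a small Python sketch. It is a sketch under stated assumptions rather than the design described by the source: combining the two axes with max(), the 3-second "slow answer" limit, and every threshold and stimulus name are invented for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Answer:
    correct: bool
    delay_s: float  # seconds between the question and the driver's reply


def workload(vehicle_stress: float, driver_stress: float) -> float:
    """Map a point on the surface (each axis clamped to 0..1) to one load value."""
    v = min(max(vehicle_stress, 0.0), 1.0)
    d = min(max(driver_stress, 0.0), 1.0)
    return max(v, d)  # assumption: the more stressed axis dominates


def may_deliver(priority: float, vehicle_stress: float, driver_stress: float) -> bool:
    """Deliver a message only if its priority (0..1) is at least the current load."""
    return priority >= workload(vehicle_stress, driver_stress)


def alertness(answers: List[Answer], slow_after_s: float = 3.0) -> float:
    """Fraction of answers that were both correct and prompt (0..1)."""
    if not answers:
        return 1.0  # assumption: no evidence of fatigue yet
    good = sum(1 for a in answers if a.correct and a.delay_s <= slow_after_s)
    return good / len(answers)


def choose_stimulus(score: float) -> str:
    """Pick an intervention from an alertness score; thresholds are illustrative."""
    if score >= 0.7:
        return "continue_dialog"
    if score >= 0.4:
        return "verbal_stimulus"
    if score >= 0.2:
        return "physical_stimulus"
    return "suggest_rest"
```

For example, a low-priority e-mail notification (priority 0.3) is held back while driver stress is 0.9, whereas an urgent traffic warning (priority 1.0) is always delivered. The source notes that different statistical models can be plugged in for interpreting answers; the fixed thresholds here only mark where such a model would sit.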
FIG: Mobile indicator device

The goal of the Safety Driver Manager is to evaluate the potential risk of a traffic accident by producing measurements related to stresses on the driver and/or vehicle, the driver's cognitive workload, environmental factors, etc. An important input to the workload manager is provided by the situation manager, whose task is to recognize critical situations. It receives as input various media (audio, video, car sensor data, network data, GPS, biometrics) and as output it produces a list of situations. Situations could be simple, complex or abstract.

Learning driver behavioral patterns can be facilitated by a particular driver's repeated routines, which provide a good opportunity for the system to "learn" habitual patterns and goals. So, for instance, the system could assist in determining whether drivers are going to pick up their kids in time by, perhaps, reordering a path from the cleaners, the mall, the grocery store, etc.

WORKING OF ARTIFICIAL PASSENGER
The AP is an artificial-intelligence-based companion that will be resident in software and chips embedded in the automobile dashboard. The heart of the system is a conversation planner that holds a profile of you, including details of your interests and profession. When activated, the AP uses the profile to cook up provocative questions via a speech generator and in-car speakers.

A microphone picks up your answer and breaks it down into separate words with speech-recognition software. A camera built into the dashboard also tracks your lip movements to improve the accuracy of the speech recognition. A voice analyzer then looks for signs of tiredness by checking to see if the answer matches your profile. Slow responses and a lack of intonation are signs of fatigue.

If you reply quickly and clearly, the system judges you to be alert and tells the conversation planner to continue the line of questioning. If your response is slow or doesn't make sense, the voice analyzer assumes you are dropping off and acts to get your attention.

This is from the patent application:
"An even further object of the present invention is to provide a natural dialog car system that understands content of tapes, books, and radio programs and extracts and reproduces appropriate phrases from those materials while it is talking with a driver. For example, a system can find out if someone is singing on a channel of a radio station. The system will state, 'And now you will hear a wonderful song!' or detect that there is news and state, 'Do you know what happened now - hear the following' and play some news. The system also includes a recognition system to detect who is speaking over the radio and alert the driver if the person speaking is one the driver wishes to hear."

Driver fatigue causes at least 100,000 crashes, 1,500 fatalities, and 71,000 injuries annually, according to estimates prepared by the National Highway Traffic Safety Administration. "A majority of the off-road accidents observed during the driving simulations were preceded by eye closures of one-half second to as long as 2 to 3 seconds," Stern said.

FIG: Camera for detection of lip movement

A normal human blink lasts 0.2 to 0.3 second. Stern said he believes that by the time long eye closures are detected, it's too late to prevent danger. "To be of much use," he said, "alert systems must detect early signs of fatigue, since the onset of sleep is too late to take corrective action." Stern and other researchers are attempting to pinpoint various irregularities in eye movements that signal oncoming mental lapses: sudden and unexpected short interruptions in mental performance that usually occur much earlier in the transition to sleep.

FEATURES OF ARTIFICIAL PASSENGER

 CONVERSATIONAL TELEMATICS
IBM's Artificial Passenger is like having a butler in your car: someone who looks after you, takes care of your every need, is bent on providing service, and has enough intelligence to anticipate your needs. This voice-actuated telematics system helps you perform certain actions within your car hands-free: turn on the radio, switch stations, adjust HVAC, make a cell phone call, and more. It provides uniform access to devices and networked services in and outside your car. It reports car conditions and external hazards with minimal distraction. Plus, it helps you stay awake with some form of entertainment when it detects you're getting drowsy.

In time, the Artificial Passenger technology will go beyond simple command-and-control. Interactivity will be key. So will natural-sounding dialog. For starters, it won't be repetitive ("Sorry your door is open, sorry your door is open . . ."). It will ask for corrections if it determines it misunderstood you. The amount of information it provides will be based on its "assessment of the driver's cognitive load" (i.e., the situation). It can learn your habits, such as how you adjust your seat. Parts of this technology are 12 to 18 months away from broad implementation.

 IMPROVING SPEECH RECOGNITION


You're driving at 70 mph, it's raining hard, a truck is passing, the car radio is blasting, and the A/C is on. Such noisy environments are a challenge to speech recognition systems, including the Artificial Passenger. IBM's Audio Visual Speech Recognition (AVSR) cuts through the noise. It reads lips to augment speech recognition. Cameras focused on the driver's mouth do the lip reading; IBM's Embedded ViaVoice does the speech recognition. In places with moderate noise, where conventional speech recognition has a 1% error rate, the error rate of AVSR is less than 1%. In places roughly ten times noisier, speech recognition has about a 2% error rate; AVSR's is still pretty good (1% error rate). When the ambient noise is just as loud as the driver talking, speech recognition loses about 10% of the words; AVSR, 3%. Not great, but certainly usable.
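
The error rates quoted above are easier to compare as relative reductions. The short Python sketch below works through that arithmetic; the helper name is an assumption made for illustration, and the input figures are the ones given in the paragraph.

```python
def relative_error_reduction(baseline_pct: float, improved_pct: float) -> float:
    """Fraction of the baseline errors removed by the improved system."""
    if baseline_pct <= 0:
        raise ValueError("baseline error rate must be positive")
    return (baseline_pct - improved_pct) / baseline_pct


# Error rates (in percent) quoted in the text: audio-only vs. AVSR.
ten_times_noisier = relative_error_reduction(2.0, 1.0)
noise_as_loud_as_speech = relative_error_reduction(10.0, 3.0)
```

In the setting roughly ten times noisier, going from 2% to 1% removes half of the errors. When the noise is as loud as the driver, going from 10% to 3% removes 70% of them, so the benefit of lip reading grows as the audio gets worse.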

 ANALYZING DATA
The sensors and embedded controllers in today's cars collect a wealth of data. The next step is to have them "phone home," transmitting that wealth back to those who can use those data. Making sense of that detailed data is hardly a trivial matter, though, especially when divining transient problems or analyzing data about the vehicle's operation over time.

IBM's Automated Analysis Initiative is a data management system for identifying failure trends and predicting specific vehicle failures before they happen. The system comprises capturing, retrieving, storing, and analyzing vehicle data; exploring data to identify features and trends; developing and testing reusable analytics; and evaluating as well as deriving corrective measures. It involves several reasoning techniques, including filters, transformations, fuzzy logic, and clustering/mining.

Since 1999, this sort of technology has helped Peugeot diagnose and repair 90% of its cars within four hours, and 80% of its cars within a day (versus days). An Internet-based diagnostics server reads the car data to determine the root cause of a problem or lead the technician through a series of tests. The server also takes a "snapshot" of the data and repair steps. Should the problem reappear, the system has the fix readily available.

 RETRIEVING DATA ON DEMAND


"Plumbing": the infrastructure stuff. In time, telematics will be another web service, using sophisticated back-end data processing of "live" and stored data from a variety of distributed, sometimes unconventional, external data sources, such as other cars, sensors, phone directories, e-coupon servers, even wireless PDAs. IBM calls this its "Resource Manager," a software server for retrieving and delivering live data on demand. This server will have to manage a broad range of data that frequently, constantly, and rapidly change. The server must give service providers the ability to declare what data they want, even without knowing exactly where those data reside. Moreover, the server must scale to encompass the increasing numbers of telematics-enabled cars, the huge volumes of data collected, and all the data out on the Internet.

A future application of this technology would provide you with "shortest-time" routing based on road conditions changing because of weather and traffic, remote diagnostics of your car and cars on your route, destination requirements (your flight has been delayed), and nearby incentives ("e-coupons" for restaurants along your way).

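The idea that a service provider declares what data it wants without knowing where the data reside can be illustrated with a minimal Python sketch. The class below is a toy registry written for illustration only. Its method names, the data names such as "traffic.delay_min", and the lambda sources are all assumptions, and it does not reflect the actual interface of IBM's Resource Manager.

```python
from typing import Any, Callable, Dict


class ResourceManager:
    """Toy registry: sources register under a data name; consumers ask by name."""

    def __init__(self) -> None:
        self._sources: Dict[str, Callable[[], Any]] = {}

    def register(self, data_name: str, fetch: Callable[[], Any]) -> None:
        """Attach a fetch function (sensor, web service, ...) to a data name."""
        self._sources[data_name] = fetch

    def request(self, *data_names: str) -> Dict[str, Any]:
        """Return the current value for each declared name; unknown names give None."""
        result: Dict[str, Any] = {}
        for name in data_names:
            fetch = self._sources.get(name)
            # The fetch function is called on every request, so values stay "live".
            result[name] = fetch() if fetch is not None else None
        return result


# Hypothetical sources; real ones would query sensors or remote servers.
rm = ResourceManager()
rm.register("traffic.delay_min", lambda: 12)
rm.register("flight.status", lambda: "delayed")

snapshot = rm.request("traffic.delay_min", "flight.status", "weather.now")
```

The consumer names only the data it needs; where each value comes from is hidden behind the registry, so a source can be replaced without changing the consumer. Scaling, caching and access control, which the text identifies as the hard parts, are outside the scope of this sketch.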
CONCLUSIONS:
Important issues related to driver safety, such as controlling Telematics devices and drowsiness, were presented; these can be addressed by a special speech interface. This interface requires interactions with workload, dialog, event, privacy, situation and other modules. It was shown that basic speech interactions can be done on a low-resource embedded processor, which allows the development of a useful local component of the Safety Driver Manager.

An important application like Artificial Passenger can be sufficiently entertaining for a driver with relatively low dialog complexity requirements: playing simple voice games with a vocabulary containing a few words. Successful implementation of the Safety Driver Manager would allow use of various services in cars (like reading e-mail, navigation, downloading music titles etc.) without compromising driver safety. Providing new services in a car environment is important to make the driver comfortable, and it can be a significant source of revenue for Telematics. The novel ideas in this paper regarding the use of speech and distributive user interfaces in Telematics will have a significant impact on driver safety, and they will be the subject of intensive research and development in forthcoming years at IBM and other laboratories.
