
Mini Wiki Explorer

Proposal

Jan 31, 2010

WASK while you do task


About WASK

WASK is a hand-held device with a touch-screen display that enhances the user's listening and comprehension experience by performing several tasks. For instance, suppose you are attending a seminar. While you listen to the speaker, some words are lost on you because they are unfamiliar, and the only thing you can do at that moment is write them down and hope for a gadget that can look them up and explain them. Moreover, you do not want to type anything. What can you do?

"What an idea, Sirji": a multi-task hand-held device used exclusively for real-time Wiki ASK (WASK) while you are listening to someone. This project aims to deliver offline/online help anytime, wherever you are: just speak into the device and leave the rest to it. Unfamiliar words generally cause "over-head transmission" of the whole lecture: they create misunderstanding about the topic and make it difficult to follow. This device can remove that problem. Instead of typing, or searching a dictionary for minutes to find the meaning of a word, just speak and the device will do the job for you.
Problem Statements:

P: Attending a seminar is good for nothing if you don't remember what you just went through!
S: Our device converts whatever the speaker says into text, which directly enhances the user's interaction with the device and helps keep a record of the lecture.

P: Generally some words cause over-head transmission of the whole lecture (they create misunderstanding about the topic, making it difficult to understand).
S: Using the touchpad, the user can easily extract words and phrases from the converted text and query Wikipedia (the largest online encyclopaedia) for those popped-out texts, making things easier to grasp.

P: While listening, it is hard to make notes on the talk when the user feels the need. The risk is that if he starts making notes, a few parts of the talk can be missed.
S: We provide the user with a very efficient way to take notes. As the real-time speech-to-text conversion of our device is very fast, you can simultaneously highlight sentences just by tapping on them. You can also do some of this after the talk is over. So take notes, ask for images if you want to see them, add tags, save, email, print. Isn't it simple?

P: While listening to the speaker, one may feel the need for eBooks related to a topic discussed during the talk. It would be difficult to search for them at that time.
S: The user can search for eBooks or books available on any particular topic. In the background, our device will search books from Google Books, Amazon, Flipkart, etc., so the search is efficient and reliable.

Extra features which could prove useful for many more people

Tweet with me: If micro-blogging fans only have to say what they want to tweet and it is done, that is awesome. Just speak what you want to say and it will be uploaded to your Twitter account.

Synchronization: The user can easily synchronize this device with a computer, view saved notes on the computer, and transfer them from device to computer and from computer to device.



Proposed Solution

We solve this kind of problem by creating a prototype that can hear the speaker, convert the speech to text, pop out important words, and ask you to choose any word which you find hard to understand. It will search for that word or phrase in Wikipedia and show the result. At your convenience, it translates the result into the language you want, converts it back to speech, and plays it to you.
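The flow described above can be sketched as a small pipeline. This is only an illustrative outline, not the project's actual code: the `Pipeline` class, the `pop_out_words` heuristic (long, uncommon words), and the four pluggable stage functions are all hypothetical names introduced here, and the real recogniser, Wikipedia lookup, translator, and speech synthesiser would be supplied by the components chosen later in this proposal.

```python
from dataclasses import dataclass
from typing import Callable, List

# Tiny illustrative stop-word list (an assumption); a real device would need far more.
COMMON_WORDS = {
    "the", "a", "an", "is", "of", "and", "to", "in", "that", "it",
    "on", "for", "with", "as", "are", "this", "by", "we",
}


def pop_out_words(transcript: str, min_length: int = 6) -> List[str]:
    """Return candidate 'tough' words: long, uncommon, in order of first appearance."""
    seen = set()
    candidates = []
    for raw in transcript.split():
        word = raw.strip(".,;:!?\"'()").lower()
        if len(word) >= min_length and word not in COMMON_WORDS and word not in seen:
            seen.add(word)
            candidates.append(word)
    return candidates


@dataclass
class Pipeline:
    """Hypothetical glue object: each stage is a plain function supplied by the caller."""
    speech_to_text: Callable[[bytes], str]
    lookup: Callable[[str], str]
    translate: Callable[[str, str], str]
    text_to_speech: Callable[[str], bytes]

    def listen(self, audio: bytes) -> List[str]:
        # Hear the speaker, convert to text, and pop out candidate words.
        return pop_out_words(self.speech_to_text(audio))

    def explain(self, word: str, language: str) -> bytes:
        # Look the chosen word up, translate the result, and convert it back to speech.
        summary = self.lookup(word)
        return self.text_to_speech(self.translate(summary, language))
```

Keeping each stage as a plain function means the offline and online lookups, or different speech engines, could be swapped without touching the rest of the flow.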

Block Diagram

[Figure: block diagram of the device]

Component Descriptions:

Intel® System Controller Hub US15W:

It is a highly integrated chipset that addresses key requirements of thermally constrained and fanless embedded applications. It combines the Intel® Graphics Media Accelerator 500 (Intel® GMA 500), memory controller, and I/O controller in a single-chip solution, while featuring advanced 3D graphics and extensive I/O capabilities such as USB 2.0, SDIO and PCI Express. It works as the main control unit of the device and controls features such as graphics and display, audio, video, and interfaces. More information about it can be seen here.

Intel Atom Processor:

A microprocessor incorporates most or all of the functions of a computer's central processing unit (CPU) on a single integrated circuit (IC, or microchip). Intel Atom is a line of ultra-low-voltage x86 and x86-64 CPUs (microprocessors) from Intel. Designed from the ground up, 45nm Intel® Atom™ processors pack an astounding 47 million transistors on a single chip measuring less than 26mm², making them Intel's smallest and lowest power processors, all while delivering the power and performance you need to access all Internet capabilities.
Disk Drives:

Solid-State Drive: A solid-state drive (SSD) is a data storage device that uses solid-state memory to store persistent data. We are using SODIMM DDR2 400/533 MHz as the memory device in this device.

CFD Connectors: CFD cables are coaxial cables used to connect devices.

Hard Disk Drive: A hard disk drive (often shortened to hard disk, hard drive, or HDD) is a non-volatile storage device that stores digitally encoded data on rapidly rotating rigid (i.e. hard) platters with magnetic surfaces.

A dual inline memory module (DIMM) consists of a number of memory components (usually black) that
are attached to a printed circuit board (usually green). The gold pins on the bottom of the DIMM provide a
connection between the module and a socket on a larger printed circuit board. The pins on the front and
back of a DIMM are not connected to each other.

RAM:

DDR2 SDRAM is a double data rate synchronous dynamic random access memory interface. DDR2 stores data in memory cells that are activated with the use of a clock signal to synchronize their operation with an external data bus.

The memory socket in this device will support up to 2 GB of DDR2 400/533 memory. DDR2-533 has a 133 MHz memory clock and a 266 MHz I/O bus clock, giving a data transfer rate of 533 million transfers per second and a peak transfer rate of 4266 MB/s.
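The DDR2 figures above follow from simple arithmetic, sketched below. The helper name `ddr2_rates` is introduced here for illustration, and a standard 64-bit module data bus is assumed: DDR2 runs its I/O bus at twice the memory clock, transfers data on both clock edges, and moves 8 bytes per transfer.

```python
def ddr2_rates(io_bus_clock_mhz: float, bus_width_bits: int = 64):
    """Return (memory clock in MHz, transfers in MT/s, peak rate in MB/s) for DDR2."""
    memory_clock_mhz = io_bus_clock_mhz / 2           # I/O bus runs at 2x the memory clock
    transfers_mt_s = io_bus_clock_mhz * 2             # double data rate: two transfers per cycle
    peak_mb_s = transfers_mt_s * bus_width_bits / 8   # bytes moved per transfer
    return memory_clock_mhz, transfers_mt_s, peak_mb_s


# DDR2-400: 200 MHz I/O bus clock
print(ddr2_rates(200))
# DDR2-533: nominal 266.67 MHz I/O bus clock
print(ddr2_rates(800 / 3))
```

For DDR2-400 this gives a 100 MHz memory clock, 400 MT/s and 3200 MB/s; for DDR2-533 the peak rate comes out at roughly 4266 MB/s.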

Universal Serial Bus:

USB (Universal Serial Bus) is a specification to establish communication between devices and a host controller (usually a personal computer). USB is intended to replace many varieties of serial and parallel ports. USB can connect computer peripherals such as mice, keyboards, digital cameras, printers, personal media players, flash drives, and external hard drives.

This device uses at least 4 USB 2.0 ports (3 host, 1 client). USB 2.0 has a maximum data rate of 480 Mbit/s.

Mini Card: PCI Express Mini Card (also known as Mini PCI Express, Mini PCIe, and Mini PCI-E) is a replacement for the Mini PCI form factor based on PCI Express. The host device supports both PCI Express and USB 2.0 connectivity. The Mini Card edge connector provides multiple connections and buses, such as PCIe ×1 and USB 2.0.

RJ45: RJ45 is one of the many registered jacks. It is often incorrectly used as the name for the 8P8C modular connector used to terminate Ethernet cable. It specifies both the physical connector and the wiring pattern.

Intel Gigabit Ethernet chip: Gigabit Ethernet (GbE or 1 GigE) is a term describing various technologies for transmitting Ethernet frames at a rate of a gigabit per second, as defined by the IEEE 802.3-2008 standard. We will be using this chip to enable wired networking in the device.

Intel High Definition Audio:

Also called HD Audio or Azalia. It is used for delivering high-definition audio and is capable of playing back more channels at higher quality than previous integrated audio codecs such as AC'97.

This device will use two audio jacks: one for microphone input and one for speaker output.

Serial Digital Video Out: SDVO makes it possible to use a 16-lane PCI Express slot to add additional video signalling interfaces such as VGA and DVI monitor outputs, SDTV and HDTV television outputs, or TV tuner inputs to a system board containing an integrated Intel 9xx-series graphics processor. Here it can be placed on a PCI Express card, allowing video connectors to be added or exchanged at low cost.

LCD: A liquid crystal display (LCD) is a thin, flat panel used for electronically displaying information such as text, images, and moving pictures. We will be using a 7-inch or 12-inch LCD as the output display device.

In a nutshell, the hardware specifications are:

N270 Processor and 945GSE Chipset

Product: Clientron E830 Sunshine Valley
Features:

• Intel® Atom™ N270 processor at 1.6 GHz with 512 KB L2 cache and 533 MT/s FSB
• Mobile Intel® 82945GSE Express Chipset (GMCH) and 82801GBM (ICH7M)
• Supports DDR2 frequency of 533 MT/s or 400 MT/s, single channel, 2 GB max
• USB ports, 1 PCIe* 1.1 x1 slot, 1 PCI 2.3 slot
• Integrated graphics with one DVI-I and one VGA connector
• PCI Express* on-board LAN
• Audio: RealTek* ALC268
• ATA/Storage
• IDT* ICS9LPRS501 system clock generator

Software Specifications

Operating System: Moblin OS, short for 'mobile Linux', is an open source operating system and application stack for Mobile Internet Devices (MIDs), netbooks, nettops and embedded devices.

Software Solution: a PyGTK application based on the Julius open source speech recognition engine.
The most important aspect of this device is the software stack we will be using, which includes a voice-to-text converter. After researching, we found that work on this has been done in the OLPC Sugar environment, and that significant research has been done in this field regarding low-power machines and efficient speech conversion.

We quote here text from http://wiki.laptop.org/go/Speech_to_Text, which describes various free and open source solutions available for this, with a nice comparison of them:

Technically, Julius and Sphinx seem to be the best choices. VoxForge supports both of them and they both are widely used. Sphinx comes in different flavours, which is quite confusing given the version numbers. Most notable are Sphinx 3 and 4. Sphinx 3 was written in C and later Sphinx 4 was released as a complete rewrite in Java. Some points in favour of Julius are:
• Julius is better suited for dictation purposes, which is what we are looking for here.
• The Simon project has done some research to rate the speech-to-text engines. Since they have practically tried it, Julius seems to have scored well.
• Testing of Julius on various machines (and different OSes) showed that Julius needs no additional configuration for installation.
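As a rough illustration of how the application could consume the recogniser's output, the sketch below pulls the recognised sentences out of captured console text. It assumes the engine prints each final result on a line starting with `sentence1:` and wraps it in `<s>` and `</s>` markers, which is how Julius's console output looks in the versions we are aware of; the function name `extract_sentences` and the sample text are introduced here purely for illustration.

```python
import re

# Assumed result-line format: "sentence1: <s> recognised words </s>"
SENTENCE_RE = re.compile(r"^sentence\d+:\s*(.*)$")
BOUNDARY_MARKERS = ("<s>", "</s>")


def extract_sentences(engine_output: str):
    """Return the recognised sentences found in the engine's console output."""
    sentences = []
    for line in engine_output.splitlines():
        match = SENTENCE_RE.match(line.strip())
        if not match:
            continue  # skip progress lines, scores and first-pass hypotheses
        words = [w for w in match.group(1).split() if w not in BOUNDARY_MARKERS]
        sentences.append(" ".join(words))
    return sentences


sample = """pass1_best: <s> word sense </s>
sentence1: <s> word sense disambiguation </s>
"""
print(extract_sentences(sample))
```

Only the final `sentence` lines are kept, so intermediate hypotheses such as the first-pass result never reach the display.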
Some references:

• Julius, http://sourceforge.jp/projects/julius
• VoxForge, http://www.voxforge.org/
• CMU Sphinx, http://cmusphinx.sourceforge.net
• Simon, http://sourceforge.net/projects/speech2text/
Offline Wikipedia reader

The Offline Wikipedia reader is a set of scripts and programmes which can be used to display Wikipedia pages without an internet connection. The software provides a custom lightweight web server running locally and uses PHP to present the pages, which are then viewed using any web browser. All Wikipedia pages are contained in the Wikipedia page dump. The file needed is called 'pages-articles.xml.bz2', and the most recent is 4.1GB.

At present, a single tar.bz is downloaded from the site, the pages extract is downloaded and copied to the location, and the indexing process is run.
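To show what working with the page dump involves, here is a minimal sketch that streams a compressed dump and returns the wikitext of one article. It is not how the Offline Wikipedia reader itself works (that tool builds an index and serves pages through its local web server); this is a plain linear scan using only the Python standard library, and the function names are introduced here for illustration. It assumes the usual MediaWiki export layout of `<page>` elements containing `<title>` and `<text>`.

```python
import bz2
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Drop the XML namespace prefix, e.g. '{http://...}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def find_article(dump_file, wanted_title: str):
    """Scan an open XML dump and return the wikitext of the matching page, or None."""
    for _event, elem in ET.iterparse(dump_file, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        title, text = None, None
        for node in elem.iter():
            name = local_name(node.tag)
            if name == "title":
                title = node.text
            elif name == "text":
                text = node.text
        if title is not None and title.lower() == wanted_title.lower():
            return text or ""
        elem.clear()  # release the page's children to keep memory use low
    return None


def find_article_in_dump(path: str, wanted_title: str):
    """Open a .xml.bz2 dump (for example pages-articles.xml.bz2) and search it."""
    with bz2.open(path, "rb") as dump:
        return find_article(dump, wanted_title)
```

A linear scan over a multi-gigabyte dump is slow, which is why an indexing step like the one described above is needed for interactive use.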

Besides this offline reader, online help will also be available in the device for live lookups.
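For the online case, one possible approach (an assumption on our part, not something fixed by this proposal) is to ask the public MediaWiki web API for the plain-text introduction of an article. The sketch below only builds the request URL and parses a response body, so it can be tried without a network connection; the helper names are hypothetical, and the query parameters are those of the `extracts` property as we understand it.

```python
import json
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"


def build_summary_url(term: str) -> str:
    """Build a query URL asking for the plain-text intro of the article on `term`."""
    params = {
        "action": "query",
        "prop": "extracts",
        "exintro": 1,       # only the section before the first heading
        "explaintext": 1,   # plain text instead of HTML
        "redirects": 1,     # follow redirects such as alternate spellings
        "format": "json",
        "titles": term,
    }
    return API_URL + "?" + urlencode(params)


def parse_summary(response_body: str):
    """Return the first non-empty extract in the response, or None if there is none."""
    pages = json.loads(response_body).get("query", {}).get("pages", {})
    for page in pages.values():
        extract = page.get("extract")
        if extract:
            return extract
    return None
```

The URL can be fetched with any HTTP client, for example `urllib.request.urlopen`, and the decoded body passed to `parse_summary`.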

Wikipedia: Tools

Various tools from Wikipedia will be used in this device. They are intended to simplify tasks, make them more efficient, or provide additional functionality to Wikipedians, for example:
• Browser & editing
• Searching
• Downloading
• Google Tools
• Page histories

Drivers/Development Kits
• Intel® System Controller Hub US15W drivers
• Intel® Atom™ processor Z5xx series and Intel® System Controller Hub US15W development kit
• Intel® Atom™ processor Z5xx series and Intel® SCH US15W VirtualLab
• Intel Compilers
• On-chip Debuggers
• Chipset Drivers
• Intel High Definition Audio driver
• BIOS Drivers
• Graphics Drivers
• QNX* Fastboot Initial Program Loader (IPL)


Besides these solutions, we need the Twitter APIs embedded into the application so that users can tweet directly by speaking, and the Google AJAX Language API for translation between various languages so that users can switch between languages as it suits them.

Development and Deployment Environment

We need two Linux environments. We used a single dual-boot machine to host both environments:
1. Development environment: a Linux development machine where we do our coding, compilation, debugging and packaging.
2. Deployment environment: a machine running Moblin. We deploy our project to this machine for testing and debugging.

Development environment

A Linux machine running an up-to-date Linux distribution is suitable for Moblin development. Our development environment is:

Fedora 12 (x86, 64-bit)

1. Installed the standard set of Linux development tools, as described in Installing Linux development tools.
2. Installed the Moblin SDK components to set up tools for use in Moblin development: a toolchain containing the required versions of the Moblin libraries, and the Anjuta IDE with the Moblin plugin.

Installing the Moblin SDK components

The Moblin SDK components provide assistance at each stage of Moblin application development. They comprise:
• Moblin toolchain: a directory which contains the Moblin operating system, including header files, libraries, and package information. It allows you to run Moblin applications on your workstation, build and debug code in an editor of your choice, and do remote debugging. Install it using the Moblin toolchain instructions.
• Linux Project Generator: a graphical application that generates GNU autotools-enabled projects. Install it using the Linux Project Generator instructions.
• Moblin Package Creator: a tool that can be used to generate an RPM and/or DEB package from your project, suitable for installing via the Moblin Application Installer. It has both command line and graphical interfaces. Install it using the Moblin Package Creator instructions.
• Anjuta Moblin SDK plugin: provides Anjuta IDE integration for building Moblin applications. Install Anjuta and the plugin using the Anjuta Moblin SDK instructions.

Deployment environment
Since Moblin simulators are yet to be developed, we need to set up a machine with Moblin to test the application. We dual boot our system for this.

Intel Atom Integration

Testing Plans

Testing is an important part of any project. Testing is an activity that can go on forever, so it is important to test the areas that matter most and the areas that are most likely to reveal bugs. As in any project, exhaustive testing is not possible. Testing is a highly strategic activity and different methodologies are used.

Exploratory Testing

Exploratory testing is one of the newer testing methodologies. It is a type of manual testing where no formal plan is made. It involves 'exploring' the product, targeting areas that are likely to reveal the most bugs, almost like a mission. It is also known as ad hoc testing, but this term is usually viewed with a negative connotation, portraying a sloppy and careless method of testing.

The Test Plan

A complete test plan discussing the schedule of testing and the strategy to take will be developed at the later stages of the project development life cycle. It is important to prioritize the test cases, running the ones that are most likely to produce a bug and the ones that are of highest risk if they were to fail.
The categories of the test cases run are as follows:

Unit Test
The first test in the development process is the unit test. The source code is normally divided into modules, which in turn are divided into smaller pieces called units. These units have specific behaviour. The test done on these units of code is called a unit test. Unit testing depends upon the language in which the project is developed. Here it ensures that each unique path of the project performs accurately to the documented specifications and has clearly defined inputs and expected results.

Functionality and System Tests
The test cases in this category deal with testing the device's functionality and ensuring that all the requirements work correctly. This involves testing every area of functionality that the end user can perform.

Stress Tests
It is important to ensure that wikilistener will always have enough resources when it is stressed. Tests in this category deal with running wikilistener for extended periods of time in an effort to crash it. It is not unusual to see programs crash over time. For example, a memory leak or buffer overflow would become evident in these circumstances.

Beta or Third Party Tests
We will pick a group of people and tell them to use our device as any normal device and report any bug they find. This can be performed using a bug tracking system where the testers report any bugs they find.

Black Box Testing
Testing software without any knowledge of the inner workings, structure or language of the module being tested.
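As an illustration of a unit test with clearly defined inputs and expected results, the sketch below tests one small unit using Python's standard `unittest` module. The function `normalize_query` is a hypothetical helper invented for this example (it tidies a tapped phrase before it is sent to the lookup); it is not taken from the project's code.

```python
import unittest


def normalize_query(phrase: str) -> str:
    """Hypothetical unit under test: trim punctuation and collapse whitespace."""
    return " ".join(phrase.strip(" .,;:!?").split())


class NormalizeQueryTest(unittest.TestCase):
    """Each test fixes one input and the result the specification expects."""

    def test_collapses_whitespace_and_trailing_punctuation(self):
        self.assertEqual(
            normalize_query("  word   sense disambiguation. "),
            "word sense disambiguation",
        )

    def test_empty_input_stays_empty(self):
        self.assertEqual(normalize_query(""), "")

    def test_punctuation_only_becomes_empty(self):
        self.assertEqual(normalize_query("???"), "")
```

Saved in a file, the tests can be run with `python -m unittest <file>`; each failing case then points at one specific path through the unit.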

Work Done

We have studied all the resource material we could get from the internet so far. We know exactly what we are about to make and how we will achieve our goal. We have read the datasheets of the Atom processors available on the internet. We have done the research work related to interfacing, programming, prototyping, software configuration, calibration, and word sense disambiguation algorithms. Our aim is to utilize the Atom's processing power to the maximum for better performance and usability.

Work done so far includes:

• According to the given Application Reference Designs, the proposed device is based on the 'Media Phone Reference Design based on Intel Atom Processor.' This reference design is specifically dedicated to VoIP, video and converged communications, providing a suitable design for our device. The hardware circuit design of the device has been made.
• All the necessary hardware requirements have been listed according to the proposed design.
• Apart from the hardware, the software solution is what we are mainly working on right now, such as choosing the development and deployment environment that suits the requirements stated above. The candidate operating systems are Linux, Moblin, and Windows. The choice depends on which OS the available drivers, BIOS and other development kits support. Linux and Moblin seem to be the best choice as they support Intel Atom based applications.
• Besides these, we have tested the Julius speech-to-text conversion application. The snapshots are given below. We successfully installed it on our system, recorded voice with it, and converted the voice into text.
• Installed the standard set of Linux development tools, as described in Installing Linux development tools.
• We prepared our Fedora machine with the development applications for better use with the Fedora Electronic Lab, as described here.
• Google Labs helps in understanding how translation and image searching will work.
• We worked with existing solutions available in the market today for better understanding, such as:
  o Natural speech recorder and speech-to-text converter: the SONY ICD-SX57 is one of the devices which stores voice and converts it into text. For more reference click here.
  o Wikipedia search (word-wise): this feature is present in the Mobipocket device, which opens eBooks and enables word-wise wiki search. For more info, see here.
• We used the Offline Wikipedia reader, which is a set of scripts and programmes used to display Wikipedia content without an internet connection.
• Using the Twitter APIs we managed to tweet and to get updates from the account. Info about the APIs can be found here. We implemented a test of the Twitter API using the Firefox extension "JETPACK". This only gets updates from the account at a specific time interval. Source code can be found here.
• For the UI part we will use our coding skills; Qt Designer can help in making a better user interface.

Apart from this, all the software can be developed using the various Free and Open Source Software solutions available. Most of the prototyping and modelling work has been done so far; refining the ideas and finding better ways of implementing them is what we are working on now.

Snapshots of work done so far:

/* Adinrec, a component of Julius used to record voice, can be seen working */


/* /home/kunal/sounds & /usr/local/bin */


/* Wiki reader Installation */

/* Moblin Installation */


/* Moblin in running state */
