IST 511 Information Management: Information and Technology Introduction To IST 511
Dr. C. Lee Giles
David Reese Professor, College of Information Sciences
and Technology
Professor of Computer Science and Engineering
Professor of Supply Chain and Information Systems
The Pennsylvania State University, University Park, PA,
USA
giles@ist.psu.edu
http://clgiles.ist.psu.edu
What is IST 511?
• Introduction to algorithmic/computational parts of IST
– There will be some maths
• Guide to research
– In information and related sciences
– In IST
– Illustrate the intellectual diversity of IST
• Methodology
– Read, view, discuss and write about ideas and papers in the field
• When possible, use examples of IST 511 research from IST grad students
– Write a research proposal paper and give a professional
presentation
• Focus on methodologies discussed here
IST 511
• Nearly all course material is at:
http://clgiles.ist.psu.edu/IST511
Read this page and links very carefully at least once a week
• Many people describe the Information Age as the advent of the
Knowledge Age or knowledge society, the information society, the
information revolution, or the age of information technologies.
Yet even though informatics, information science and computer science
are often in the spotlight, the word "information" is often used without
careful consideration of the various meanings it has acquired.
How much information is
there in the world
Informetrics - the measurement of
information
• Stored
– What can we store
– What do we intend to store.
– What is stored.
• How do we use it
– Decision making
Information Age
• We have entered the information age
– What is the information age?
[Figure: Exabytes by year, 2006-2011 (vertical axis 0 to 1,800): 10-fold growth in 5 years. Sources shown include DVD, RFID, digital TV, MP3 players, digital cameras, camera phones, VoIP, medical imaging, laptops, data center applications, games, satellite images, GPS, ATMs, scanners, sensors, digital radio, DLP theaters, telematics, peer-to-peer, email, instant messaging, videoconferencing, CAD/CAM, toys, industrial machines, security systems, and appliances.]
• How big is five exabytes? If digitized with full formatting, the seventeen million
books in the Library of Congress contain about 136 terabytes of information; five
exabytes of information is equivalent in size to the information contained in
37,000 new libraries the size of the Library of Congress book collections.
• Hard disks store most new information. Ninety-two percent of new information is
stored on magnetic media, primarily hard disks. Film represents 7% of the total,
paper 0.01%, and optical media 0.002%.
• The United States produces about 40% of the world's new stored information,
including 33% of the world's new printed information, 30% of the world's new
film titles, 40% of the world's information stored on optical media, and about
50% of the information stored on magnetic media.
• How much new information per person? According to the Population Reference
Bureau, the world population is 6.3 billion, thus almost 800 MB of recorded
information is produced per person each year. It would take about 30 feet of
books to store the equivalent of 800 MB of information on paper.
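The figures above can be checked with a few lines of arithmetic. The sketch below assumes decimal units (1 TB = 10^12 bytes, 1 EB = 10^18 bytes), which the source does not specify; the variable names are illustrative only.

```python
# Assumption: decimal (SI) units; the slide does not state which convention it uses.
MB = 10**6
TB = 10**12
EB = 10**18

library_of_congress_bytes = 136 * TB   # 17 million books with full formatting
new_information_bytes = 5 * EB         # "five exabytes"
world_population = 6.3e9               # Population Reference Bureau figure

# How many Library of Congress book collections fit in five exabytes?
libraries = new_information_bytes / library_of_congress_bytes

# How much new recorded information per person per year?
mb_per_person = new_information_bytes / world_population / MB

print(f"Library of Congress equivalents: {libraries:,.0f}")
print(f"MB per person per year: {mb_per_person:,.0f}")
```

Under these assumptions the results land close to the rounded values on the slide (about 37,000 libraries and almost 800 MB per person).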
Information Census
Lesk; Varian & Lyman
• ~10 Exabytes
• ~90% digital
[Figure: storage scale with axis labels TB, PB, EB]
• 4 MB
• 50x24” disks
• 1200 rpm
• 100 ms access
• $35k/year rent
• Included computer &
accounting software
(tubes not transistors)
1.6 meters
10 years later
30 MB
Now - Terabytes on your desk
Terabyte external drive for $200: 20 cents a gigabyte.
In 5 years, 1 cent/gigabyte, $10 for a terabyte?
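As a rough check of the price figures, the sketch below converts the drive price to cents per gigabyte and works out the constant yearly price decline that the 5-year projection would require. It assumes 1 TB = 1000 GB and a steady year-over-year decline; both are assumptions, not statements from the slide.

```python
price_now_per_tb = 200.0   # dollars, from the slide
gb_per_tb = 1000           # assumption: decimal units

# Today's price in cents per gigabyte.
cents_per_gb_now = price_now_per_tb * 100 / gb_per_tb

# Projection from the slide: 1 cent per gigabyte in 5 years.
target_cents_per_gb = 1.0
years = 5
price_then_per_tb = target_cents_per_gb * gb_per_tb / 100   # dollars

# Constant annual factor f such that now * f**years == target.
annual_factor = (target_cents_per_gb / cents_per_gb_now) ** (1 / years)
annual_decline_pct = (1 - annual_factor) * 100

print(f"Now: {cents_per_gb_now:.0f} cents/GB")
print(f"Projected: ${price_then_per_tb:.0f} per terabyte")
print(f"Implied price decline: about {annual_decline_pct:.0f}% per year")
```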
Moore's Law:
• Improvements: 58.7%/y
• Bandwidth: 40%/y
[Figure: improvement over time on a log scale, 1988-2000]
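Annual growth rates are easier to interpret as doubling times. The sketch below assumes steady compound growth at the quoted rates; the function names are illustrative, and the slide does not say which quantity the 58.7%/y figure measures.

```python
import math

def doubling_time_years(annual_growth_rate):
    """Years needed to double at a constant compound annual growth rate."""
    return math.log(2) / math.log(1 + annual_growth_rate)

def growth_factor(annual_growth_rate, years):
    """Total multiplication after the given number of years."""
    return (1 + annual_growth_rate) ** years

for label, rate in [("58.7%/y", 0.587), ("40%/y", 0.40)]:
    print(f"{label}: doubles every {doubling_time_years(rate):.2f} years, "
          f"x{growth_factor(rate, 12):.0f} over 1988-2000")
```

A rate of 58.7% per year corresponds to doubling roughly every 18 months, while 40% per year doubles in a little over two years.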
• Shrinks time: now or later
• Shrinks space: here or there
• Automate processing: knowbots
• Locate, process, analyze, summarize
Memex
As We May Think, Vannevar Bush, 1945
• Complexity of the World
Capture
Representation
Apply
Representation as Information:
What Makes a Good Representation?
• Modeled by sine wave
Information Processing
• There are many ways to apply the information stored in
representations.
• Retrieval
– Finding useful information
• Recognition
– Identifying an instance
• Inference
– Extend stored information to a new situation
Context
• One of the hardest problems for
information processing is determining the
context in which the information is
applied.
• This may lead to incorrect inferences.
• Some say information is data in context.
People and Information
• People process information based on their
experience and context.
• Human information processing is affected
by emotions and needs.
• Your data may be my information
What is an information system?
• Processes information
• Requires knowledge of what information is
• How much information is available
– Static vs dynamic
– Explicit vs implicit
• How it is used and structured
– information management
• How it’s managed
• Incorporated into personal or social use.
Information Characteristics
Knowledge
Intelligence
Information
Facts
What is knowledge?
• Knowledge - A more complex view considers
knowledge as intrinsically different from
information. Instead of considering knowledge as
a richer or more detailed set of facts, we define
knowledge in an area as justified beliefs about
relationships among concepts relevant to that
particular area.
Is Information
• An aspect of intelligence?
– Derivative to its use
• An aspect of life?
• Innate to physical reality?
– Innate code, e.g., DNA
Characteristics of Information
– Invariant
– Dynamic
– Personal
– Situational
– Cultural
– An act versus a fact
– Additive
– Symbolic
– Others?
Information Theory
• Information theory is a discipline in applied
mathematics involving the quantification of data
with the goal of enabling as much data as possible
to be reliably stored on a medium or
communicated over a channel.
• The measure of information, known as
information entropy, is usually expressed by the
average number of bits needed for storage or
communication.
– The less probable the event, the more information it carries; the less predictable the source, the higher the entropy
http://en.wikipedia.org/wiki/Information_theory
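Entropy can be computed directly from a probability distribution. The sketch below uses Shannon's standard formula, H = -sum(p * log2(p)), measured in bits; the function names and the coin examples are illustrative and not taken from the slides.

```python
import math

def surprisal_bits(p):
    """Information content, in bits, of one event with probability p."""
    return -math.log2(p)

def entropy_bits(probabilities):
    """Shannon entropy: the average surprisal over a distribution, in bits."""
    return sum(-p * math.log2(p) for p in probabilities if p > 0)

fair_coin = [0.5, 0.5]     # maximally unpredictable two-outcome source
biased_coin = [0.9, 0.1]   # mostly predictable source

print(f"Fair coin entropy:   {entropy_bits(fair_coin):.3f} bits")
print(f"Biased coin entropy: {entropy_bits(biased_coin):.3f} bits")
print(f"Surprisal of p=0.9:  {surprisal_bits(0.9):.3f} bits")
print(f"Surprisal of p=0.1:  {surprisal_bits(0.1):.3f} bits")
```

The fair coin needs one full bit per toss on average, while the biased coin needs less than half a bit because its outcome is usually predictable. A rare outcome (p = 0.1) carries more bits than a common one (p = 0.9).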
Claude Shannon
• Claude Shannon is the creator
of “information theory”
• His definition was not a broad
definition of “information”,
nor was it what others meant by
information at that time, or
even now.
• However, the definition can
be quite useful
Models of Information
• Common model: a representation of data
– When possible formalize the information process
– Interoperability
– Standards
• What is formalization?
– Logical or mathematical representation
• Natural language definitions are becoming formal
– Why formal definitions of information?
– Examples?
Formalization/automation/digitization
of Information
Advantages:
• Costs
• Reproducibility
• Scalability
• Automation
• Interpretation
• Others?
Consequences of Information
• Information can lead to
– Decisions
– Actions
– Contemplation
– Laws
– More information
Models of Information Use
• Personal models
– Cognitive
• Social models
– Institutions
– Groups
– Nations
– Commerce
– Etc.
What is Information?
• There is no standard definition
• Context is important; maybe vital
– "Information is produced when data are processed so
that they are placed within some context in order to
convey meaning to a recipient."
• Information causes things to happen
– Permits decisions, actions, predictions, etc.
• An innate aspect of intelligence/universe?
The Philosophy of Information: A Definition
L. Floridi
What is the Philosophy of Information? (2002)
• Artificial Intelligence:
– speech recognition
– Some reasoning; computer beats man in
chess
– Privacy and security problems
– Computers can be a pain in the butt
WRONG!
• L. Floridi, Hertfordshire
• Wikipedia