2003
WWW-Lib-MM
Contents
Database Systems and WWW Applications
Internet DB Architecture Internet Applications
[Figure: Internet DB architecture, top to bottom: Browser (client tier); HTTP; WEB/APP server; HTTP and application messages; middle-tier application; gateways; data sources (ORDBMS, other data sources).]
Nori, A., Databases in Internet Applications: Case Studies, in: Postmodern DBS, UC Berkeley, Spring 1999
Internet Applications
Entertainment: games, music, films, multi-person chat
Public information: maps, tax return helper
Advertisement: interactive catalogues for products and services
Medicine: diagnosis, consultation, remote surgery
Education: learning-on-demand (for a degree), virtual museums, tours of remote spaces
Engineering: collaborative design, remote parallel simulation services
Publishing: submit, review, proof-editing (text and graphics)
Tele-communication: conferencing
...
Contents
Database Systems and WWW Applications Digital Libraries
Definitions Underlying concepts Digital Libraries Initiative Digital Libraries (examples)
Definitions
In the Stanford Digital Library project, we view long-term digital library systems as collections of widely distributed, autonomously maintained services. Of course, a digital library system must include services that allow users to search over collections of information objects. Examples of searchable collections include traditional library collections, digital images, e-mail archives, video, on-line books, and scientific article citation catalogs (containing only metadata about the articles, not the articles themselves). While searching services are valuable, they are not the only kind of service in the digital library of the future. Remotely usable information processing facilities are also important digital library services. These services provide support for activities such as document summarization, indexing, collaborative annotation, format conversion, bibliography maintenance, and copyright clearance.
The Stanford Digital Library Technologies Project
Definitions
Digital libraries are organizations that provide the resources, including the specialized staff, to select, structure, offer intellectual access to, interpret, distribute, preserve the integrity of, and ensure the persistence over time of collections of digital works so that they are readily and economically available for use by a defined community or set of communities.
The Digital Library Federation (DLF)
Note: CS researchers tend to focus on digital libraries as content collected on behalf of user communities, while librarians focus on digital libraries as institutions or services.
Notions
Content: the items in the library collection
Annotation: information added to (associated with) an item
Subject matter: focus of a collection, topics used in classification
Catalog: database (card file) of bibliographic records
Classification: assigning call number, adding keywords
Rights: permissions to use
License agreements: contractual right to use
Copyright
Watermark: a subliminal pixel pattern to identify a digital work
Copy detection: verifying copying, searching for copies
Search: 40% of search queries on the web are reported to be single words
Metasearchers: services that provide unified query interfaces to multiple search engines, so that users have the illusion of a single combined document source. Three main tasks: choosing the best sources to evaluate a query; evaluating the query at these sources; merging the query results from these sources.
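The three metasearcher tasks can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the in-memory `SOURCES`, their `summary` word sets, and the term-overlap scoring are hypothetical stand-ins, not how any real metasearcher selects sources or ranks results.

```python
def choose_sources(query, sources, k=2):
    """Task 1: pick the k sources whose (hypothetical) summary covers most query terms."""
    terms = set(query.lower().split())
    ranked = sorted(sources, key=lambda s: len(terms & s["summary"]), reverse=True)
    return ranked[:k]

def evaluate(query, source):
    """Task 2: evaluate the query at one source; returns (document, score) pairs."""
    terms = set(query.lower().split())
    hits = []
    for doc in source["docs"]:
        score = len(terms & set(doc.lower().split()))  # naive term-overlap score
        if score:
            hits.append((doc, score))
    return hits

def merge(result_lists):
    """Task 3: merge per-source results, drop duplicates, best score first."""
    best = {}
    for hits in result_lists:
        for doc, score in hits:
            best[doc] = max(score, best.get(doc, 0))
    return sorted(best, key=lambda d: (-best[d], d))

def metasearch(query, sources, k=2):
    chosen = choose_sources(query, sources, k)
    return merge(evaluate(query, s) for s in chosen)

# Hypothetical sources, made up for this sketch.
SOURCES = [
    {"name": "library", "summary": {"digital", "library", "catalog"},
     "docs": ["digital library catalog", "library opening hours"]},
    {"name": "media", "summary": {"video", "audio", "digital"},
     "docs": ["digital video archive", "audio samples"]},
    {"name": "cooking", "summary": {"recipes", "bread"},
     "docs": ["bread recipes"]},
]

results = metasearch("digital library", SOURCES)
```

The user sees one merged list in `results`, which is the "illusion of a single combined document source" described above.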
File Formats
Image/graphics formats
TIFF: Tagged Image File Format
GIF: Graphics Interchange Format
JFIF: JPEG File Interchange Format
SPIFF: Still Picture Interchange File Format
PICT: Macintosh Picture
TGA: TrueVision Targa file (bit-mapped images)
EPS: Encapsulated PostScript
CGM: Computer Graphics Metafile
PhotoCD (Kodak)
JPEG: Joint Photographic Experts Group
MPEG: Moving Picture Experts Group
Document formats
PostScript (Adobe)
PDF: Portable Document Format (Adobe)
Compression
Compression ratios
lossless: color 25%-50%-67%; B/W 50%-90%
lossy: up to 95%
Compression formats
CCITT Group III or Group IV
JPEG
JBIG (an international compression standard)
LZW
Subsampling (lossy)
Compression schemes
LZW: Lempel-Ziv-Welch (lossless)
MPEG (Group of Pictures: IBBBPBBBPBI)
QuickTime (Apple)
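To make the lossless LZW scheme concrete, here is a minimal Python sketch of the textbook algorithm. It assumes an unbounded dictionary and returns plain integer codes; real file formats that use LZW add code-width packing and dictionary resets, which are omitted here.

```python
def lzw_compress(data):
    """Encode bytes as a list of integer codes (dictionary grows without bound)."""
    table = {bytes([i]): i for i in range(256)}  # start with all single bytes
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate                  # keep extending the match
        else:
            codes.append(table[current])         # emit longest known prefix
            table[candidate] = len(table)        # learn the new string
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes

def lzw_decompress(codes):
    """Rebuild the original bytes from the code list produced above."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):
            entry = previous + previous[:1]      # code defined by this very step
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        table[len(table)] = previous + entry[:1]
        previous = entry
    return b"".join(out)

sample = b"TOBEORNOTTOBEORTOBEORNOT"
codes = lzw_compress(sample)
```

Decompressing `codes` returns `sample` unchanged, which is what "lossless" means here; repeated substrings are what make the code list shorter than the input.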
Multivalent Documents
Multivalent Annotations
Stored separately from the document they annotate
Appear in situ as if part of the content of the document
Hyperlinks, highlights, notes, copy editor markup (executable)
Combining Annotations
Notemarks
E.g., outlining, man pages
Robust Hyperlinks
URLs are augmented with a content-based lexical signature of about five words to make a robust hyperlink.
If the URL's address-based portion breaks, feed the signature into any web search engine to find the new site of the page.
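One way to picture a lexical signature is as the few words that are frequent in the page but rare elsewhere. The Python sketch below is a guess at such a selection rule, not the published method: the document-frequency ranking, the toy corpus, and the `lexical-signature` query parameter name are all assumptions made for illustration.

```python
import re
from collections import Counter
from urllib.parse import quote_plus

def lexical_signature(page_text, corpus, size=5):
    """Pick `size` words that are rare in the (hypothetical) corpus, frequent in the page."""
    words = re.findall(r"[a-z]+", page_text.lower())
    term_freq = Counter(words)
    doc_sets = [set(re.findall(r"[a-z]+", doc.lower())) for doc in corpus]

    def doc_freq(word):
        return sum(word in s for s in doc_sets)

    # Rarest in corpus first, then most frequent in page, then alphabetical.
    ranked = sorted(term_freq, key=lambda w: (doc_freq(w), -term_freq[w], w))
    return ranked[:size]

def robust_url(url, signature):
    """Append the signature to the URL (parameter name is an assumption)."""
    return url + "?lexical-signature=" + quote_plus(" ".join(signature))

page = "multivalent annotations annotate the document"
corpus = ["the document", "the annotations"]
signature = lexical_signature(page, corpus, size=3)
link = robust_url("http://example.org/page", signature)
```

If the address part of `link` stops resolving, the words in `signature` can be submitted to a search engine as an ordinary query.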
TilePics
A file format designed to store tiled data of arbitrary type in a hierarchical, indexed format in order to provide fast retrieval.
File layout: a fixed-size header, tile index data, an optional gap, contiguous tile data, optional attribute data
Zoom by drawing just the relevant tiles at the next layer down
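The zoom step can be sketched as tile arithmetic in Python. The sketch assumes a pyramid in which each layer doubles the resolution of the one above and tiles are 256 pixels square; neither number comes from the TilePics format itself, and the function names are hypothetical.

```python
def children(level, col, row):
    """Tiles at the next layer down that cover tile (level, col, row).
    Assumes each layer doubles resolution in both directions."""
    return [(level + 1, 2 * col + dc, 2 * row + dr)
            for dr in (0, 1) for dc in (0, 1)]

def tiles_for_viewport(level, x0, y0, x1, y1, tile_size=256):
    """Tiles at `level` that intersect the pixel rectangle [x0, x1) x [y0, y1).
    tile_size=256 is an assumed value."""
    cols = range(x0 // tile_size, (x1 - 1) // tile_size + 1)
    rows = range(y0 // tile_size, (y1 - 1) // tile_size + 1)
    return [(level, c, r) for r in rows for c in cols]

# Zooming into tile (0, 0, 0) needs only its four children...
next_layer = children(0, 0, 0)
# ...and a small viewport at layer 1 touches only two of them.
visible = tiles_for_viewport(1, 200, 0, 300, 100)
```

Only the tiles in `visible` would have to be looked up in the tile index and read, which is the point of the indexed, hierarchical layout.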
Archival Repository
[Figure: archival repository; labeled components: Web Server, Info Monitor, File System.]
notifies the content owners of illegal copies.
Key challenges:
- accuracy, in terms of high precision and recall
- scalability, in terms of coping with several terabytes of data (or several tens of millions of web pages)
- resiliency to attacks
Two prototypes:
- SCAM (Stanford Copy Analysis Mechanism, for text)
- FRAUD (Finding Replicas of AUDio)
[Figure: pages A, B, and C; B and C are backlinks of A.]
Approximation of importance
Citation analysis literature
Citation indexes
Extreme variation in importance
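Idealized Model
Let $N$ be the number of pages and $l_{i,j}$ indicate a link from page $i$ to page $j$.
\[ n_i = \sum_{j=1}^{N} l_{i,j} \]
$n_i$: number of outgoing links on page $i$ (includes multiple links to the same page)
\[ W_j = \sum_{i=1}^{N} \frac{l_{i,j}\, W_i}{n_i} \]
$W_j$: PageRank of page $j$
[Figure: two pages, 1 and 2, with $l_{1,2} = 1$ and $l_{2,1} = 0$.]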
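The idealized model, in which each page passes W_i / n_i to every page it links to, can be evaluated by repeated substitution. The Python sketch below does that on a made-up three-page graph; it assumes every page has at least one outgoing link and that the iteration converges (true for this graph), and it omits the damping used in practical PageRank computations.

```python
def idealized_pagerank(links, iterations=100):
    """links[i] lists the pages that page i links to (duplicates allowed).
    Repeatedly applies W_j = sum over i of l_ij * W_i / n_i."""
    n_pages = len(links)
    w = [1.0 / n_pages] * n_pages          # start from a uniform distribution
    for _ in range(iterations):
        new_w = [0.0] * n_pages
        for i, targets in enumerate(links):
            n_i = len(targets)             # outgoing links on page i
            for j in targets:
                new_w[j] += w[i] / n_i     # page i passes W_i / n_i to page j
        w = new_w
    return w

# Hypothetical graph: 0 -> 1, 2;  1 -> 0, 2;  2 -> 0
links = [[1, 2], [0, 2], [0]]
ranks = idealized_pagerank(links)
```

For this graph the fixed point is W = (4/9, 2/9, 3/9): page 0, which has two backlinks, ends up most important.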
Contents
Database Systems and WWW Applications Digital Libraries Multimedia Database Systems
Definitions Example Application
MM QoS Requirements
Definitions
Multimedia (MM), loosely: any system that can be used to present information in more than one form (text, graphics, still images, animation, sound, video, special computer-generated effects). The system should have user-friendly interactive interfaces that help communicate complexly structured data.
MMDBSs are the DBSs that manage MM data, facilitate the use of MM for presentations, and use specific tools for the storage, management, and retrieval of MM data.
Multimedia Applications
Entertainment, public information, advertisement, education, medicine, engineering, publishing, tele-communication, ...
[Figure: multimedia server and client connected by a network; labeled components: Storage, Buffers, Graphics/video hardware, Network, Buffers, Audio hardware.]
Requirements - 2
MM data storage and retrieval:
- MM & object-oriented data modeling concepts
- management of several kinds of magnetic and optical storage devices appropriate for MM data handling
- uniform management of very large data volumes => management of tertiary storage and multi-level storage hierarchies
- support for realtime data processing => appropriate scheduling and resource allocation techniques
- support for storage and processing parallelism (performance requirements)
- support for distribution => appropriate distributed DBMS concepts
Specifications
1 channel, 8-bit samples at 8 kHz
Equiv. to CD quality: 2 channels, 16-bit samples at 44.1 kHz
MPEG2-encoded video: 640x480 pixels/frame, 24 bits/pixel
NTSC-quality video: 640x480 pixels/frame, 24 bits/pixel
HDTV-quality video: 1280x720 pixels/frame, 24 bits/pixel
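These specifications translate directly into raw (uncompressed) data rates. The short Python calculation below works them out; the frame rate of 30 frames/s is an assumption, since the slide lists only frame size and pixel depth.

```python
def audio_bit_rate(channels, bits_per_sample, sample_rate_hz):
    """Raw audio rate in bits per second."""
    return channels * bits_per_sample * sample_rate_hz

def video_bit_rate(width, height, bits_per_pixel, frames_per_second):
    """Raw (uncompressed) video rate in bits per second."""
    return width * height * bits_per_pixel * frames_per_second

ASSUMED_FPS = 30  # assumption: the slide gives no frame rate

mono_8khz = audio_bit_rate(1, 8, 8_000)            # 64,000 bit/s
cd_quality = audio_bit_rate(2, 16, 44_100)         # 1,411,200 bit/s
vga_video = video_bit_rate(640, 480, 24, ASSUMED_FPS)
hdtv_video = video_bit_rate(1280, 720, 24, ASSUMED_FPS)

for name, rate in [("1 ch, 8 bit, 8 kHz", mono_8khz),
                   ("CD quality", cd_quality),
                   ("640x480 video", vga_video),
                   ("1280x720 video", hdtv_video)]:
    print(f"{name}: {rate / 1e6:.3f} Mbit/s")
```

Under the assumed frame rate, raw 640x480 video needs roughly 150 times the bandwidth of CD-quality audio, which is why compression and careful storage management matter for MM data.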
Requirements - 3
Realtime and synchronization issues:
- soft realtime transfer requirements
- hard transaction deadlines
- synchronization between different data streams (data types)
- user interactions (synchronous and asynchronous)
=> dependent on data distribution, storage devices, compression techniques for the various data types, buffer management techniques, scheduling algorithms, data placement techniques, and communication bandwidth
DBMS Concepts
Data modeling: temporal object-oriented modeling and presenting (HCI) of multimedia data + extra data types & operations
Query processing and optimization: browsing, content addressing
Storage management: optimization techniques
Transaction management: realtime processing for read transactions (presentations); write transactions (authoring) use an advanced transaction model (e.g., checkout-checkin with versioned data)
Temporal relationships: synchronization and realtime processing
Quality-of-Service: to handle average delay, speed ratio, utilization, jitter, skew, and reliability
Concepts of TOOMM
[Figure: TOOMM concepts. Presentation Model: Composite Presentation Object CPO1 with P_Video 13, P_Video 14, P_Video 15, and P_Audio 11. Logical Data Model: multimedia objects Video 1, Video 2, and Audio 1.]
[Figure: Video 1 as a sequence of frames; Frame n has TA n and Timestamp n.]
Type Hierarchy
[Figure: type hierarchy; node labels: MMDT, Stream, CGM, LDU, Event, Video, Audio, Music, Animation, Frame, Sample, Note, Anim.]
Components of a stream multimedia object
[Figure: along the play-time axis, each logical data unit LDU k (k = 0, 1, ..., n) carries a timestamp TS k and a TA k; each event k carries a timestamp TS k.]
EER Diagram
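[Figure: EER diagram relating Temporal_reference, CPO, and MMDT to the presentation types P_MMDT, P_PTD_MMDT, P_PTI_MMDT, P_Stream, and P_CGM; relationship labels include start and stop; individual cardinalities (0:1, 1:1, 0:M, 1:M) are not recoverable from the extraction.]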
[Figure: presentation objects (P_Video, P_Text) laid out along a time axis.]
Example CPO
Type: CPO
Name: Lecture_19_2_1998
MTU_duration: 1/44100

Type: P_Video
Name: P_Video 1
Speed: 1
Start: 0
Stop: 18000
p_start.get_time_point() = 0
p_stop.get_time_point() = 31752000

Type: P_Audio
Name: P_Audio 1
Speed: 1
Start: 0
Stop: 31752000
p_start.get_time_point() = 0
p_stop.get_time_point() = 31752000

Type: P_HTML
Name: P_HTML 1
p_start.get_time_point() = 3987233
p_stop.get_time_point() = 7234443

Type: P_HTML
Name: P_HTML 2
p_start.get_time_point() = 10234234
p_stop.get_time_point() = 16230933

Type: P_Light_Pen
Name: P_Light_Pen 1
p_start.get_time_point() = 4457111
p_stop.get_time_point() = 6283324

Type: Video
Name: PMC_Lecture_hour1_scene1
LDU_duration: 1/25
Duration: 1800
Content description:
- (0, 4988, Lecturer talks about files)
- (4989, 12134, Lecturer talks about directories)

Type: Audio
Name: PMC_Lecture_hour1_clip_1
LDU_duration: 1/44100
Duration: 31752000
Content description:
- (0, 4988, Lecturer talks about files)
- (4989, 12134, Lecturer talks about directories)

Type: HTML
Name: File System

Type: Light_Pen
Name: Drawing_objects
LDU_duration: 200
Content description:
- (0, 100, Draw a bow in File System)
- (101, 200, Draw a dot)
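The time points in this example are counted in units of the CPO's MTU_duration. Assuming that MTU_duration and LDU_duration are fractions of a second (the slide gives no unit), the following Python sketch converts a few of the values to seconds; the function name is hypothetical.

```python
from fractions import Fraction

MTU_DURATION = Fraction(1, 44100)       # from the CPO; unit assumed to be seconds
VIDEO_LDU_DURATION = Fraction(1, 25)    # from the Video object; unit assumed to be seconds

def to_seconds(units, unit_duration):
    """Convert a count of time units into seconds (exact rational arithmetic)."""
    return units * unit_duration

audio_stop_s = to_seconds(31752000, MTU_DURATION)        # P_Audio 1 stop time point
video_stop_s = to_seconds(18000, VIDEO_LDU_DURATION)     # P_Video 1 Stop, counted in frames
html1_start_s = float(to_seconds(3987233, MTU_DURATION)) # P_HTML 1 start time point

print(audio_stop_s, video_stop_s, round(html1_start_s, 2))
```

Under this assumption both the audio and the video presentation objects stop at 720 seconds, which is consistent with the two streams being played in parallel for the same 12 minutes.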
- Content addressing:
efficient location of data with complex data types like images (difficult to access in realtime using pattern-recognition techniques)
comprises: natural language understanding, speech processing, vision, and user modeling
Meta-Data Management
Meta-data needed especially for continuous data to support retrieval
Textual data describing contents of audio and video segments
Content search mostly performed on meta-data
Problems:
- Modeling of meta-data
- Meta-data acquisition
- Association of meta-data to real data
Data Placement
Clustering and partitioning: data striping and data interleaving
[Figure: data placement across disks; labels: Controller, Logical sector 0.]
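Data striping can be illustrated with a simple address mapping. The Python sketch below assumes round-robin striping with a stripe unit of one sector, which is only one of several possible placement policies; the slide does not say which one it depicts, and the function name is hypothetical.

```python
def locate(logical_sector, num_disks):
    """Map a logical sector to (disk, sector on that disk).
    Assumes round-robin striping with a one-sector stripe unit."""
    return logical_sector % num_disks, logical_sector // num_disks

# Logical sectors 0..5 spread over 4 disks.
placement = [locate(s, 4) for s in range(6)]
```

Consecutive logical sectors land on different disks, so a sequential read of a continuous stream can be served by several disks in parallel.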
Disk Scheduling
Traditional algorithms
FIFO (first come, first served)
SSTF (shortest seek time first)
SCAN (elevator algorithm)
First-generation MM algorithms
EDF (earliest deadline first)
SCAN-EDF
GSS (grouped sweeping scheme)
Second-generation MM algorithms
two-phase algorithms
- reduce seek time
- reduce rotational latency
- increase throughput
- fair stream access
- real-time constraints?
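SCAN-EDF combines the two ideas named in the list: requests are served in deadline order, and requests that share a deadline are served in SCAN order to save seek time. The Python sketch below is a simplified illustration; the request format `(deadline, track)` and the rule of sweeping upward from the current head position first are assumptions of this sketch.

```python
from itertools import groupby

def scan_edf(requests, head=0):
    """requests: list of (deadline, track) pairs. Returns the service order.
    Earliest deadline first; ties on deadline are served in one elevator sweep."""
    order = []
    for _, group in groupby(sorted(requests), key=lambda r: r[0]):
        batch = list(group)
        # Sweep upward from the head, then come back down for the rest.
        ahead = sorted((r for r in batch if r[1] >= head), key=lambda r: r[1])
        behind = sorted((r for r in batch if r[1] < head),
                        key=lambda r: r[1], reverse=True)
        sweep = ahead + behind
        order.extend(sweep)
        head = sweep[-1][1]   # the head rests where the sweep ended
    return order

# Hypothetical requests: one urgent request and three with a shared deadline.
requests = [(10, 50), (10, 20), (5, 90), (10, 70)]
schedule = scan_edf(requests, head=60)
```

Plain EDF would serve the three deadline-10 requests in arbitrary (here: track-ascending) order after moving the head to track 90; the SCAN step instead serves them on the way back down, at tracks 70, 50, and 20.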
- storage structures / data placement techniques
- query optimization
- transaction management mechanisms
MMDBS: Conclusions
investigated functionality needed to support MM applications
illustrated how object-oriented and other modern DBMS technologies can be applied to realize MMDBMS
alternative levels of application support by DBMS
open issues:
- effective storage models
- MM query languages and processing techniques (handling of imprecise queries)
- ...
Role of (MM)DBS in distributed MM systems
Conclusions - State-of-the-Art
Multimedia file systems and multimedia storage servers for special multimedia applications exist today
- Implement the presented concepts
- Acceptable performance
Multimedia database systems are still under development; certain aspects are solved
- Retrieval problems not yet solved in a satisfying manner