Video Processing

CSC361/661 -- Digital Media Spring 2004 Burg/Wong

What’s the difference between analog and digital video?
 

The way the signal is sent.
In digital video, every pixel element and audio sample is a stream of 0’s and 1’s.


What’s so good about digital video?
  

 

 

Less subject to noise.
Can be stored and transmitted on a computer.
Digital editing techniques are powerful and also allow the video to be integrated into other multimedia applications.
Editing can be non-linear and non-destructive.
Interactive elements can be added during editing.
Can be reproduced repeatedly without degradation of quality.
Can be easily compressed and encrypted.
Can be replayed non-linearly and in still images.
One digital video file can be replayed with different settings depending on the system.

Organization of Video Signals

 

Component video – 3 wires/connectors connecting the camera or other devices to a TV or monitor; three separate signals for the image components.
YUV or YIQ works well for analog video.
Advantage: The 3 components of the signal are on separate lines, so there isn’t any electromagnetic interference among them.
Disadvantage: Requires more bandwidth and synchronization.

Organization of Video Signals

Composite video – all the image information (e.g., YUV) sent on one line or channel.
Luminance and chrominance components are separated on the receiver end.
Connecting a TV with a VCR can be done this way – one connection (audio signal connected separately).
There can be some interference among YUV components.

Organization of Video Signals
 

 

S-video – a compromise between component and composite video.
Uses two wires or channels – one for luminance and one for chrominance.
Not as expensive as component video.
Less interference between the two signals than in a composite signal.


Scanning Methods
 

Interlaced – odd lines are displayed first, then even lines.
Taken together, all the odd lines are called a field (and similarly for all the even lines).
The original purpose of interlaced display was to avoid flicker.
Standard television uses this method, displaying 60 fields per second (which makes 30 frames per second).

Scanning Methods
 

Progressive – the method used by computer monitors.
An entire screen can be written to a buffer. The buffer is displayed “instantaneously.”
Think about how analog video would be converted to digital – the fields would have to be put back together, which can create interlacing artifacts.
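As a minimal sketch of what "putting the fields back together" means, the following Python models a frame as a list of scan lines, splits it into odd and even fields, and weaves them back into one frame. The function names and the list-of-strings representation are illustrative assumptions, not part of any video library.

```python
def split_fields(frame):
    """Split a frame (a list of scan lines) into its odd and even fields.

    Lines are counted from 1, so the odd field holds lines 1, 3, 5, ...
    and the even field holds lines 2, 4, 6, ...
    """
    odd_field = frame[0::2]
    even_field = frame[1::2]
    return odd_field, even_field


def weave(odd_field, even_field):
    """Rebuild a full frame by alternating lines from the two fields."""
    frame = []
    for i in range(max(len(odd_field), len(even_field))):
        if i < len(odd_field):
            frame.append(odd_field[i])
        if i < len(even_field):
            frame.append(even_field[i])
    return frame


frame = ["line1", "line2", "line3", "line4", "line5"]
odd, even = split_fields(frame)
rebuilt = weave(odd, even)
print(odd, even, rebuilt == frame)
```

In real interlaced video the two fields are captured at different instants, so weaving them into one progressive frame can show a moving object at two positions at once. That is one source of the interlacing artifacts mentioned above.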

Standards Organizations for Video (originally analog, extended to digital)

NTSC (National Television Systems Committee)
Used in: North America, Japan, Taiwan, and parts of the Caribbean and South America
Format: 525 scan lines per frame, 29.97 frames/s, 4:3 aspect ratio, YIQ color model, interlaced fields

PAL (Phase Alternating Line)
Used in: Australia, New Zealand, and most of Western Europe
Format: 625 scan lines per frame, 25 frames/s, 4:3 aspect ratio, YUV color model, interlaced fields

SECAM (Système Electronique Couleur avec Mémoire)
Used in: France, the Soviet Union, and Eastern Europe
Format: 625 scan lines per frame, 25 frames/s, 4:3 aspect ratio, YUV color model, interlaced fields (differs from PAL in the color coding scheme)

Analog Signal for Video Transmission (NTSC)
525 lines/frame * 29.97 frames/s ≈ 15,734 lines/s
Each line must be “swept out” in 1/15,734 s ≈ 63.6 μsec.
The horizontal retrace signal takes 10.9 μsec.
This leaves 52.7 μsec for the active line signal giving image data.
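The line-timing arithmetic above can be checked with a few lines of Python. This is only a sketch of the calculation; the constant names are chosen here for illustration, and the values are the ones given on the slide.

```python
LINES_PER_FRAME = 525          # NTSC scan lines per frame
FRAMES_PER_SECOND = 29.97      # NTSC frame rate
HORIZONTAL_RETRACE_US = 10.9   # horizontal retrace time in microseconds

# Number of scan lines that must be drawn every second.
lines_per_second = LINES_PER_FRAME * FRAMES_PER_SECOND

# Time available for one complete line, in microseconds.
line_time_us = 1_000_000 / lines_per_second

# Time left for picture data once the retrace is subtracted.
active_line_us = line_time_us - HORIZONTAL_RETRACE_US

print(f"{lines_per_second:.0f} lines/s")
print(f"{line_time_us:.1f} microseconds per line")
print(f"{active_line_us:.1f} microseconds of active line signal")
```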

Analog Video Signal


Converting Analog Video to Digital

CCIR 601, one standard for digital video, applies to both NTSC and PAL.
According to the CCIR standard, a frame is sampled to 720 X 480 pixels for NTSC and 720 X 576 for PAL.
But this is misleading. There aren’t really 720 pixels per line. The number of samples taken to digitize the video doesn’t necessarily correspond to the number of pixels on the display device.
You can do digital video in NTSC format at 640 X 480 or 720 X 480 with different pixel aspect ratios.
You can do digital video in PAL format at 720 X


CCIR 601 prescribes 4:2:2 subsampling of the chrominance component. This means that in every 4-pixel-square area, 4 luminance samples are taken and 2 of each of the chrominance samples are taken (4 Y’ samples, 2 CB samples, and 2 CR samples).
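To make the sample counts concrete, here is a small Python sketch that tabulates the samples taken per 2 X 2 block of pixels and the resulting average storage per pixel. It assumes 8-bit (one byte) samples, and it includes 4:4:4 and 4:2:0 only for comparison; the function names are illustrative.

```python
def samples_per_2x2_block(scheme):
    """Return (Y, Cb, Cr) sample counts for a 2 x 2 block of pixels."""
    counts = {
        "4:4:4": (4, 4, 4),  # no subsampling (comparison only)
        "4:2:2": (4, 2, 2),  # chrominance halved horizontally
        "4:2:0": (4, 1, 1),  # chrominance halved in both directions
    }
    return counts[scheme]


def average_bytes_per_pixel(scheme, bytes_per_sample=1):
    """Average storage per pixel, assuming one byte per sample by default."""
    y, cb, cr = samples_per_2x2_block(scheme)
    return (y + cb + cr) * bytes_per_sample / 4


def subsample_422_row(chroma_row):
    """Keep every second chrominance sample along one scan line."""
    return chroma_row[0::2]


for scheme in ("4:4:4", "4:2:2", "4:2:0"):
    print(scheme, average_bytes_per_pixel(scheme), "bytes/pixel")
print(subsample_422_row([10, 20, 30, 40]))
```

Under these assumptions, 4:2:2 stores two bytes per pixel on average instead of three, a one-third reduction before any other compression is applied.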


Digital Television, SDTV, HDTV

 

SDTV – standard definition television
HDTV – high definition television: 16:9 aspect ratio (1280 X 720 or 1920 X 1080) and Dolby Digital surround sound (AC-3)

Is HDTV the same thing as digital TV?

Characteristics of HDTV


 

High definition television is not necessarily digital – that is, it does not have to be digitally transmitted. What characterizes HDTV is the aspect ratio, resolution, and sound quality.
Digital television is not necessarily HDTV. What characterizes DTV is the way in which the data is transmitted – in digital, as opposed to analog, form.
HDTV was not originally DTV, but at present most HDTV is digitally transmitted.

Digital Television
  

 

There are 18 different DTV formats.
Six are also HDTV. Five of these (the DTV formats that are also HDTV) are based on progressive scanning and one on interlaced.
Both HDTV and DTV use MPEG-2.
Three of the 18 formats for DTV that are used frequently are:

480p – 640 X 480 pixels, progressive
720p – 1280 X 720 pixels, progressive
1080i – 1920 X 1080 pixels, interlaced


Digitizing Video

One of the biggest considerations, of course, is file size. Size of file =

frame rate * frame width * frame height * bytes per pixel * number of seconds

For example: 30 frames/s * 640 pixels * 480 pixels * 3 bytes/pixel * 60 s = 1,658,880,000 bytes ≈ 1.66 GB
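The file-size formula translates directly into code. The sketch below is a plain restatement of it in Python; the function name is illustrative, and the result is for uncompressed video with no audio.

```python
def uncompressed_video_bytes(frame_rate, width, height, bytes_per_pixel, seconds):
    """Size in bytes of uncompressed video, following the formula above."""
    return frame_rate * width * height * bytes_per_pixel * seconds


# One minute of 640 x 480 video at 30 frames/s with 3 bytes per pixel.
size_bytes = uncompressed_video_bytes(30, 640, 480, 3, 60)

# The same formula over one second gives the sustained data rate.
bytes_per_second = uncompressed_video_bytes(30, 640, 480, 3, 1)

print(f"{size_bytes:,} bytes")
print(f"{bytes_per_second:,} bytes per second")
```

The per-second figure (about 27.6 million bytes) is the rate that capture hardware, storage, and playback would each have to sustain if the video were not compressed.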


Where does file size create challenges?

In capturing digital video, your hardware and software have to be able to keep up with the data rate.
When the file is stored, you have to have enough room on your hard disk.
When the file is downloaded and then played, your user has to have the patience to wait for the download.
When digital video is played in real time, the data transmission rate has to be fast enough to keep up with the rate at which the video should be played.

Ways to capture digital video

Copy something that is already in video format (either digital or analog) to your computer.
Capture directly, live, from a camera, either analog or digital.
Pick up a live video broadcast signal on your computer.

What equipment do you need to do this?

If the source is already digital video, it can be transmitted directly to the computer.
If the source is recorded analog video, a video capture card must convert analog to digital.
A digital camera may digitize and compress before the data is sent to the computer.
Connect the camera through high-speed FireWire (IEEE 1394 interface) or USB. (USB can handle data transfer rates of 1.5 to 480 Mb/s. FireWire can handle up to 800 Mb/s.)

Advantages and disadvantages of digitizing in the camera

Advantage – less noise from transmission.
 

Noise degrades the quality. Noise makes compression more difficult.

Disadvantage – you have to use some standard format and don’t have as much control over compression and data rate.
 


 

Compressor/decompressor: hardware or software?

Hardware – on a video capture board.
Software – compression done within your video processing program. Decompression done at the user’s computer.


Where and when to compress

In the digital camera, after which the digital stream is passed to the computer through FireWire (IEEE 1394). DV is standard.

DV data rate is ~3.5 MB/s, 720X480 NTSC

 

In the video capture card on the computer, where the video is converted from analog to digital and compressed using the card’s hardware.
In a software codec.
Compression is usually done twice – once during capture, and again as the

Compression Strategies

Spatial compression (intra-frame)

Areas that are alike can be grouped in a more concise representation.
Lookup tables: keep a table of typical patterns, and record a small part of the image as an entry into the table.

Temporal compression (inter-frame)

Record the difference between one frame and another.

Downsampling (luminance/chrominance)
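Temporal compression, recording the difference between one frame and another, can be sketched in a few lines of Python. This is a deliberately simplified illustration: frames are flat lists of pixel values, and a difference frame is a list of (index, new value) pairs. Real codecs work on blocks and use motion compensation; the function names here are hypothetical.

```python
def encode_difference(previous_frame, current_frame):
    """Record only the pixels that changed, as (index, new_value) pairs."""
    return [
        (i, new)
        for i, (old, new) in enumerate(zip(previous_frame, current_frame))
        if old != new
    ]


def decode_difference(previous_frame, changes):
    """Rebuild the current frame from the previous frame plus the changes."""
    frame = list(previous_frame)
    for index, value in changes:
        frame[index] = value
    return frame


key_frame = [5, 5, 5, 5, 9, 9, 5, 5]    # stored in full
next_frame = [5, 5, 5, 5, 5, 9, 9, 5]   # a bright spot moved one pixel right

changes = encode_difference(key_frame, next_frame)
print(changes)
print(decode_difference(key_frame, changes) == next_frame)
```

Only two of the eight pixels changed, so the difference frame is much smaller than the full frame. When little changes between frames, most of the data can be skipped this way.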

Examples of Codecs
    

Cinepak
Intel Indeo
Sorenson
MPEG

Even with these codecs, you generally can’t compress video enough so that it can be played in full screen on a midrange computer. You can get about 320 X 240 at 12 frames per second.

 

Cinepak

Gets a good compression rate
Decompresses a lot faster than it compresses
Uses vector quantization and temporal compression with key frames and difference frames
Good for video with a lot of motion

Intel Indeo

About 30% faster than Cinepak at compressing
Preserves color well on video with a lot of static scenes
Uses vector quantization and temporal compression with key frames and difference frames

 

 

Sorenson

Available as part of QuickTime
Newer than the other two. Good quality, good compression rate
Can compress to a data rate of 50 KB/s
Uses vector quantization and temporal compression with key frames and difference frames
Sorenson’s motion compensation method is similar to the one used in MPEG compression

  

MPEG

An international standard
Moving Picture Experts Group
MPEG-1, 2, 3, and 4. As the numbers get higher, you get more compression, so the compression method is suitable for more “challenging” material.
The standard specifies how the data stream is formatted after compression and how it will be decompressed, but not how the compression has to be implemented.


 

MPEG-1

Released in 1992
Designed for audio/video played mainly from CD-ROMs and hard drives
Compression ratio of about 4:1; depends on application
Typical data rate of 1.86 Mb/s -- for video that can be stored on CD, VHS quality
Progressive scan (can’t handle interlacing or HDTV formats)
Typically 320 X 240 (square pixel format) or 352 X 240 (SIF – Source Input Format), 29.97 frames per second

Terminology for MPEG Compression
  

A block is 8 X 8 pixels.
A macroblock is 16 X 16 pixels.
Progressive scanning displays line after line.
Interlaced scanning displays odd numbered lines, then even.
A field is either all the odd numbered lines or all the even numbered lines.

MPEG Compression
 

I, P, and B frames are designated.
A GOP (group of pictures) size is chosen.
GOP size is usually about 8, 12, or 16 frames.
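The sketch below shows how the GOP size relates to how often a full I frame appears. It assumes each GOP starts with exactly one I frame, and it assumes a common arrangement in which every third frame after the I frame is a P frame and the rest are B frames. That spacing is an illustrative choice, not something the slides specify, and the function names are hypothetical.

```python
def gop_pattern(gop_size, p_spacing=3):
    """Frame types for one GOP: an I frame, then P frames every p_spacing
    frames, with B frames in between (assumed arrangement)."""
    types = []
    for position in range(gop_size):
        if position == 0:
            types.append("I")
        elif position % p_spacing == 0:
            types.append("P")
        else:
            types.append("B")
    return "".join(types)


def i_frames_per_second(frame_rate, gop_size):
    """With one I frame per GOP, I frames occur every gop_size frames."""
    return frame_rate / gop_size


print(gop_pattern(12))
for size in (8, 12, 16):
    print(size, round(i_frames_per_second(29.97, size), 2), "I frames/s")
```

A smaller GOP gives more I frames per second, which makes random access easier but costs more data, since I frames are the least compressed.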


Steps in MPEG compression

I frames are compressed the way static images are with JPEG compression:

4:2:0 or 4:2:2 downsampling
DCT
Quantization
Run-length and Huffman encoding
Need about 2 I frames/s for random access.

Each P frame is encoded with reference to the previous I or P frame.
Each B frame is encoded with reference to previous or subsequent frames.
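The run-length step in the I-frame list can be illustrated with a generic run-length encoder in Python. This is a sketch of the idea only: it is not the exact scheme in the MPEG or JPEG bitstream, and the sample values are made up to resemble quantized coefficients, where long runs of zeros are common.

```python
def run_length_encode(values):
    """Collapse runs of equal values into [value, run_length] pairs."""
    runs = []
    for value in values:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1       # extend the current run
        else:
            runs.append([value, 1])  # start a new run
    return runs


def run_length_decode(runs):
    """Expand [value, run_length] pairs back into the original sequence."""
    values = []
    for value, count in runs:
        values.extend([value] * count)
    return values


quantized = [52, 3, 0, 0, 0, 1, 0, 0, 0, 0]  # hypothetical coefficients
runs = run_length_encode(quantized)
print(runs)
print(run_length_decode(runs) == quantized)
```

Quantization is what creates the long runs of zeros; run-length encoding then stores each run as a single pair, and Huffman encoding assigns short codes to the most frequent pairs.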

Discrete Cosine Transform
For an N X N pixel image $[p_{xy},\ 0 \le x < N,\ 0 \le y < N]$, the DCT is an array of coefficients $[DCT_{uv},\ 0 \le u < N,\ 0 \le v < N]$:

$$DCT_{uv} = \frac{1}{\sqrt{2N}}\, C_u C_v \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} p_{xy} \cos\frac{(2x+1)u\pi}{2N} \cos\frac{(2y+1)v\pi}{2N}$$

where

$$C_u, C_v = \frac{1}{\sqrt{2}} \ \text{for } u, v = 0, \qquad C_u, C_v = 1 \ \text{otherwise}$$
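A direct, unoptimized Python sketch of the two-dimensional DCT follows. It assumes the normalization factor is 1/sqrt(2N) and that C is 1/sqrt(2) for index 0 and 1 otherwise, which for N = 8 gives the familiar factor of 1/4 used in JPEG. Real encoders use fast algorithms; the function name is illustrative.

```python
import math


def dct_2d(pixels):
    """Compute the N x N DCT coefficients of an N x N block of pixel values."""
    n = len(pixels)

    def c(k):
        # Scaling constant: 1/sqrt(2) for the zero index, 1 otherwise.
        return 1 / math.sqrt(2) if k == 0 else 1.0

    scale = 1 / math.sqrt(2 * n)
    coeffs = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (
                        pixels[x][y]
                        * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                        * math.cos((2 * y + 1) * v * math.pi / (2 * n))
                    )
            coeffs[u][v] = scale * c(u) * c(v) * total
    return coeffs


# A completely flat 8 x 8 block: every pixel has the same value.
flat_block = [[1.0] * 8 for _ in range(8)]
coeffs = dct_2d(flat_block)
print(round(coeffs[0][0], 6))
```

For a flat block, all the energy lands in the single coefficient at u = v = 0 and every other coefficient is zero (up to rounding). Smooth image areas therefore produce many near-zero coefficients, which is what makes the later quantization and run-length steps effective.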


So what is MP-3?

MP3 is audio encoded according to the MPEG-1 Audio Layer 3 standard.
There are three audio “layers” possible in MPEG encoding:

Layer 1: 32-448 Kb/s
Layer 2: 8-384 Kb/s
Layer 3: 8-320 Kb/s

All three layers use psychoacoustical encoding methods.

If one frequency component is going to be masked by another one, it doesn’t matter if you drop it.

 

 

    

MPEG-2

Released in 1994
Intended as a coding standard for SDTV and HDTV with data rates of 1.5-60 Mb/s; 15 Mb/s is typical (“Main Profile at Main Level” – MP@ML)
Scalable, as compared to MPEG-1
Profiles describe functionality, levels describe resolution
Broadcast quality
Supports interlaced video
720 X 480 frame
4:2:2 or 4:2:0 downsampling
About 8:1 compression ratio; depends on

   

MPEG-3

Originally intended for HDTV
MPEG-2 was sufficient for HDTV
Projected 12:1 compression ratio
Not really used much



MPEG-4

Standardized in 1998, but still under development
Can mix video with text, graphics, and 2-D and 3-D animation layers
5 Kb/s to 4 MB/s


Maximum Data Transfer Rates
 

POTS: 28.8-56 Kb/s
ISDN (Integrated Services Digital Network): 64-128 Kb/s
ADSL (Asymmetric Digital Subscriber Line): 1.5-8.5 Mb/s (downstream), 16-640 Kb/s (upstream)
12.96-55.2 Mb/s
20-40 Mb/s

Speed of Delivery from Storage Devices

   

CD-ROM 1x: 150 KB/s
CD-ROM 2x: 300 KB/s
CD-ROM 8x: 1200 KB/s
CD-ROM 52x: 7.8 MB/s
DVD 1x: 1.35 MB/s
DVD 16x: 21.6 MB/s

 

SCSI Hard Drive

10-20 MB/s (or even as high as 100 MB/s)
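These delivery rates matter because real-time playback needs the device to supply data at least as fast as the video consumes it. The sketch below compares the 1.86 Mb/s MPEG-1 rate quoted earlier with the 1x (150 KB/s) and 2x (300 KB/s) rates listed above. It assumes 1 Mb = 1000 Kb and 8 bits per byte, ignores audio and overhead, and uses illustrative function names.

```python
def kilobytes_per_second(megabits_per_second):
    """Convert Mb/s to KB/s, assuming 1 Mb = 1000 Kb and 8 bits per byte."""
    return megabits_per_second * 1000 / 8


def can_play_in_real_time(video_rate_kb_s, device_rate_kb_s):
    """True if the device delivers data at least as fast as playback needs it."""
    return device_rate_kb_s >= video_rate_kb_s


MPEG1_RATE_MBPS = 1.86  # typical MPEG-1 data rate from the slides
video_rate = kilobytes_per_second(MPEG1_RATE_MBPS)

print(f"MPEG-1 needs about {video_rate:.1f} KB/s")
print("1x (150 KB/s):", can_play_in_real_time(video_rate, 150))
print("2x (300 KB/s):", can_play_in_real_time(video_rate, 300))
```

Note the unit difference: network and codec rates are usually quoted in bits per second (Kb/s, Mb/s), while storage rates are quoted in bytes per second (KB/s, MB/s). Mixing them up gives an answer that is off by a factor of eight.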

Storage Capacity of Storage Devices


CD: 700 MB
DVD: 4.7 GB