
THE DVCAM FORMAT

Clyde Cunningham

© Copyright 2005 Clyde Cunningham


The Sony DVCAM format is based on the
International Electrotechnical Commission
DV standard

IEC 61834
This document consists of ten parts, of which only Parts 1 and 2
are related to standard definition video recording.

The remaining parts define the format for HDTV, EDTV, DVB
and DTV applications.
The only major difference between
DV and DVCAM is Track Pitch.

DV Track Pitch 10µm

DVCAM Track Pitch 15µm


BASIC DVCAM PARAMETERS
Mechanical

Track pitch             15µm
Track angle             9.1752°
Tape speed              28.247mm/s
Tape width              6.35mm
Drum diameter           21.7mm
Track azimuth           -20°/+20°
Tape type               Metal evaporated

Electrical

Bit rate to tape        41.85 Mb/s
Video bit rate          24.948 Mb/s
Minimum wavelength      0.49µm
Modulation              SNRZI PRIV (24 to 25 bit)
Error correction        Reed-Solomon, cross interleaved
Tracks per TV frame     625/50: 12
                        525/60: 10

Video

Lines recorded          625/50: 23 - 310, 335 - 622
                        525/60: 23 - 262, 285 - 524
Sampling structure      625/50: 4:2:0
                        525/60: 4:1:1
Compression             Intra-frame DCT
                        Adaptive quantization
                        Modified 2-D Huffman

Audio

2 Channel mode          16 bit linear/48kHz
4 Channel mode          12 bit non-linear/32kHz

In the DVCAM standard the audio sampling rate MUST be locked to the video frame rate.
A DVCAM recording where the
audio sampling rate is not locked
to video frame rate is considered
to be NON-STANDARD.

This will be indicated on the front panel of DVCAM units as ‘NS’ or ‘Not editable’.
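A short sketch of what "locked" implies in practice: the number of audio samples per video frame is an exact, repeating quantity. The frame rates used below (25 and 30000/1001 frames per second) are the usual nominal values and are an assumption here, not figures taken from the slides.

```python
from fractions import Fraction

# Nominal frame rates (assumed, not stated in the slides).
FRAME_RATES = {"625/50": Fraction(25), "525/60": Fraction(30000, 1001)}

def samples_per_frame(sample_rate_hz, system):
    """Exact number of audio samples per video frame for locked audio."""
    return Fraction(sample_rate_hz) / FRAME_RATES[system]

for system in FRAME_RATES:
    for rate in (48000, 32000):
        print(system, rate, samples_per_frame(rate, system))
```

In 625/50 the result is a whole number per frame. In 525/60 it is fractional, so a locked recording repeats over a fixed group of frames rather than frame by frame.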
DVCAM VIDEO PROCESSING
4:2:2 Video Sampling

[Figure: analog to 4:2:2 digital conversion. Y is sampled at 13.5 MHz (4); Cb (B-Y) and Cr (R-Y) are each sampled at 6.75 MHz (2). Each Analog to Digital Converter produces 8 bits.]
4:2:2 Sampling Structure
Colour Sample Decimation

[Figure: the 4:2:2 input is split into Y, Cb and Cr. Y passes through a Delay; Cb and Cr each pass through a Decimation Filter. The output is 4:1:1 (525/60) or 4:2:0 (625/50).]
Why is filtering necessary?

Why can’t you just throw away every alternate colour sample?
[Figure: four colour-difference spectra, frequency axis in MHz]

Sampled colour-difference spectrum (4:2:2): axis marks at 3.375, 6.75, 13.5 and 20.25 MHz.

Sample decimation without filtering (4:1:1): ALIASING is marked six times along the axis; axis marks at 3.375, 6.75, 10.125, 13.5, 16.875 and 20.25 MHz.

Filtered colour-difference spectrum, before decimation (4:2:2): axis marks at 1.6875, 6.75, 13.5 and 20.25 MHz.

Filtered colour-difference spectrum, after decimation (4:1:1): axis marks at 3.375, 6.75, 10.125, 13.5, 16.875 and 20.25 MHz.
4:1:1 colour samples decimated HORIZONTALLY.
So Horizontal filtering is necessary.

4:2:0 colour samples decimated VERTICALLY.


So vertical filtering is necessary.
4:2:0 Video Sampling Structure
4:1:1 Video Sampling Structure
4:1:1 4:2:0

Colour Resolution Comparison


Effect of Sample Decimation on Colour Resolution

Horizontal colour resolution

4:2:2     3 MHz     (= 240 Lines resolution)
4:1:1     1.5 MHz   (= 120 Lines resolution)

Vertical colour resolution

4:2:2     400 Lines
4:2:0     200 Lines
Summary

4:1:1 Horizontal colour resolution = 120 Lines


4:1:1 Vertical colour resolution = 400 Lines

4:2:0 Horizontal colour resolution = 240 Lines


4:2:0 Vertical colour resolution = 200 Lines
Tape Format
DVCAM Basic Drum Configuration

Drum Speed = 9000 rpm

1 Revolution = 2 Tracks
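The drum figures can be checked against the tracks-per-frame values in the basic parameters. This is a worked calculation only; the frame rates of 25 and a nominal 30 frames per second are assumptions for the illustration.

```python
DRUM_RPM = 9000
TRACKS_PER_REVOLUTION = 2

# 9000 rpm is 150 revolutions per second, so 300 tracks are written per second.
tracks_per_second = DRUM_RPM / 60 * TRACKS_PER_REVOLUTION

def tracks_per_frame(frames_per_second):
    """Tracks written during one TV frame at the given frame rate."""
    return tracks_per_second / frames_per_second

print("625/50:", tracks_per_frame(25))
print("525/60 (nominal 30 fps):", tracks_per_frame(30))
```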
DVCAM Track Footprint

[Figure: track footprint on the tape. Sector labels: ITI, Audio, Video, Subcode. Arrows: direction of motion of head; direction of tape travel.]

ITI: Insert Tracking Information


DVCAM RF Waveform
Tracking

The data stream recorded on the tape is encoded using a technique called ‘24 to 25 bit modulation’.

An extra bit is added to the beginning of every three randomized (scrambled) bytes.

The value of the extra bit is chosen to shape the frequency spectrum after NRZI encoding.
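The idea of choosing the extra bit can be sketched as follows. This is a simplified illustration, not the DV algorithm: it only keeps the DC content of the NRZI output low by minimising the running digital sum, whereas the real format shapes the spectrum per track type. The function names and the selection rule are assumptions made for the example.

```python
import random

def nrzi(bits, level):
    """NRZI: a 1 toggles the recorded level (+1/-1), a 0 leaves it unchanged."""
    levels = []
    for bit in bits:
        if bit:
            level = -level
        levels.append(level)
    return levels, level

def encode_24_to_25(words):
    """Prefix each 24-bit word with the extra bit that keeps the running
    digital sum (RDS) of the NRZI output closest to zero."""
    level, rds, stream = 1, 0, []
    for word in words:
        candidates = []
        for extra in (0, 1):
            levels, end_level = nrzi([extra] + word, level)
            candidates.append((abs(rds + sum(levels)), extra, levels, end_level))
        _, extra, levels, level = min(candidates, key=lambda c: c[0])
        rds += sum(levels)
        stream.extend(levels)
    return stream, rds

random.seed(1)  # stand-in for three scrambled bytes per word
words = [[random.randint(0, 1) for _ in range(24)] for _ in range(200)]
stream, rds = encode_24_to_25(words)
print(len(stream), rds)
```

Because an extra bit of 1 inverts the whole 25-bit NRZI word, the encoder always has two mirror-image candidates to choose from, which is what makes one added bit enough to steer the spectrum.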
Each track is one of the following types -

F0 F1 F2
F0 Track
Recorded spectrum

[Figure: level (dB) against frequency (MHz), with F1 and F2 marked]

F1: Bit rate/90 (465 kHz)


F2: Bit rate/60 (697.5 kHz)
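The two pilot frequencies follow directly from the bit rate to tape given in the basic parameters, as this short calculation shows.

```python
BIT_RATE_TO_TAPE = 41.85e6  # bits per second, from the basic parameters

f1_hz = BIT_RATE_TO_TAPE / 90
f2_hz = BIT_RATE_TO_TAPE / 60

print(f"F1 = {f1_hz / 1e3:.1f} kHz, F2 = {f2_hz / 1e3:.1f} kHz")
```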
F1 Track
Recorded spectrum

[Figure: level (dB) against frequency (MHz), with F1 and F2 marked]

F1: Bit rate/90 (465 kHz)


F2: Bit rate/60 (697.5 kHz)
F2 Track
Recorded spectrum

[Figure: level (dB) against frequency (MHz), with F1 and F2 marked]

F1: Bit rate/90 (465 kHz)


F2: Bit rate/60 (697.5 kHz)
Tracking Frequency Sequence

F0 F1 F0 F2 F0 F1 F0 F2 F0 F1 F0 F2
Tracking Signal Processing

[Figure: the Playback RF feeds a 465 kHz detector (F1) and a 697.5 kHz detector (F2). Their outputs form the Tracking Signal, shown against the track sequence F0 F1 F0 F2 F0.]

Note that only one head reads the tracking signals.


During normal playback, the servo system uses the tracking signals from the entire track.

During insert editing, the servo system uses only the ITI sector.
Recorded Data Structure
Audio and video data is written to the tape in packets called -

Sync Blocks

Each Sync Block contains -

2 Synchronizing bytes

3 Sync Block identification bytes

77 bytes of data (the payload)

8 bytes of Reed-Solomon Inner Error Correction codes

Total number of bytes in a Sync Block = 90
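The field layout above can be expressed as a small parser. This is a sketch for illustration; the type and function names are hypothetical, and the input is dummy data rather than a real recording.

```python
from collections import namedtuple

SyncBlock = namedtuple("SyncBlock", "sync ident payload inner_ecc")

SYNC_LEN, ID_LEN, PAYLOAD_LEN, ECC_LEN = 2, 3, 77, 8
BLOCK_LEN = SYNC_LEN + ID_LEN + PAYLOAD_LEN + ECC_LEN  # 90 bytes

def split_sync_block(raw):
    """Split one 90-byte Sync Block into its four fields."""
    if len(raw) != BLOCK_LEN:
        raise ValueError(f"expected {BLOCK_LEN} bytes, got {len(raw)}")
    a = SYNC_LEN
    b = a + ID_LEN
    c = b + PAYLOAD_LEN
    return SyncBlock(raw[:a], raw[a:b], raw[b:c], raw[c:])

# Dummy block whose byte values equal their byte numbers.
block = split_sync_block(bytes(range(BLOCK_LEN)))
print(block.payload[0], block.payload[-1], list(block.inner_ecc))
```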


Sync Block Structure

Byte Number    Content
0 - 1          Sync (Synchronizing bytes)
2 - 4          ID (Sync Block identification)
5 - 81         Data bytes (77 bytes)
82 - 89        Inner Error Correction Codes (8 bytes)

One Sync Block = 0.2mm (approx)


Audio Product Block
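[Figure: Audio Product Block, labelled ‘14 Data Sync-blocks’. Byte Number axis: 0, 4, 5, 9, 10, 81, 82, 89. Fields across each Sync Block: Sync and ID (bytes 0 to 4), Audio Auxiliary Data (bytes 5 to 9), Audio Data (bytes 10 to 81), Inner Reed-Solomon Codes (bytes 82 to 89). Outer Reed-Solomon Codes are shown as separate rows.]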
Video Product Block

[Figure: structure of the Sync Blocks in the video sector, labelled ‘149 Data Sync-blocks’. Byte Number axis: 0, 4, 5, 9, 10, 81, 82, 89. Fields across each Sync Block: Sync and ID (bytes 0 to 4), Video Data, Inner Reed-Solomon Codes (bytes 82 to 89). Rows of Video Auxiliary Data and rows of Outer Reed-Solomon Codes are shown separately.]
Video Data Rate Reduction
8-bit serial data rate 216 Mb/s

Colour resolution halved 162 Mb/s

H and V blanking removed 125 Mb/s

DV target data rate 25 Mb/s

Compression ratio 5:1
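The steps in this table can be reproduced with simple arithmetic. The sketch below assumes the 625/50 active picture of 720 x 576 samples at 25 frames per second; the exact active-picture figure comes out slightly under the rounded 125 Mb/s quoted above.

```python
BITS_PER_SAMPLE = 8

# 4:2:2 serial data: 13.5 + 6.75 + 6.75 MHz = 27 Msample/s
serial_rate = 27e6 * BITS_PER_SAMPLE

# Colour resolution halved: each colour-difference signal drops to 3.375 Msample/s
halved_colour_rate = (13.5e6 + 2 * 3.375e6) * BITS_PER_SAMPLE

# Blanking removed: active picture only (625/50 assumed: 720 x 576, 25 frames/s).
# The factor 1.5 is Y plus two quarter-size colour-difference pictures.
active_rate = 720 * 576 * 1.5 * BITS_PER_SAMPLE * 25

DV_TARGET_RATE = 25e6
compression_ratio = active_rate / DV_TARGET_RATE

print(serial_rate / 1e6, halved_colour_rate / 1e6,
      round(active_rate / 1e6, 1), round(compression_ratio, 2))
```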


Video Data Rate
Reduction Processes
First Stage
of
Video Compression

Blocking

Blocking is the partitioning of the Y, Cb and Cr samples of the TV frame into 8x8 pixel blocks.
414720 Y pixels
103680 Cb pixels
103680 Cr pixels

6480 Y blocks
1620 Cb blocks
1620 Cr blocks
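These counts follow from the picture size. The sketch assumes a 625/50 active picture of 720 x 576 luminance samples, with each colour-difference picture at a quarter of that size.

```python
WIDTH, HEIGHT, BLOCK_SIZE = 720, 576, 8  # 625/50 active picture (assumed)

y_pixels = WIDTH * HEIGHT
c_pixels = y_pixels // 4          # each of Cb and Cr after colour decimation

pixels_per_block = BLOCK_SIZE * BLOCK_SIZE
y_blocks = y_pixels // pixels_per_block
c_blocks = c_pixels // pixels_per_block

print(y_pixels, c_pixels, y_blocks, c_blocks)
```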
Each Y, Cb and Cr pixel block is then subjected to a mathematical
process called a Discrete Cosine Transform (DCT).

DCT transforms the 8x8 pixel blocks from time-domain information into space-frequency domain information.

The purpose of this is to make use of the statistical nature of TV images to reduce the amount of data representing the picture.
[Figure: Time Domain (8x8 pixel blocks) -> DCT -> Space-frequency Domain (8x8 coefficient blocks)]

(DCT: Discrete Cosine Transform)


The numbers in the space-frequency block represent the
DCT coefficients (or proportions) of 64 Basis Pictures.

Basis Pictures

The Basis Pictures are all unique - no two make up a third.


DCT Basis Pictures

[Figure: the 64 Basis Pictures, with labels: DC coefficient; AC coefficients; increasing horizontal detail; increasing vertical detail]
Typical DCT Coefficients For One DCT Block

Note that the low frequency coefficients have the largest values and
the high frequency coefficients have the lowest values.

Statistically this is the most common situation in a TV image.


Note also that, whereas all the time-domain values are finite,
many of the DCT coefficient values are zero.

In other words, the picture appears to be more efficiently represented by DCT coefficients than by time-domain values.
However, some of the DCT coefficient values will be large because
the sum of the squares (power) of the time-domain values equals
the sum of the squares of the space-frequency domain values.
If the image to be compressed is a colour bar test waveform, then almost all the DCT
coefficients will be zero.

Using an 8x8 area of the green bar as an example, note that there is no horizontal
detail and no vertical detail. So the DCT Block will have the following values -

149   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0

The indicated area of the colour bar is represented by one finite number only.
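Both observations above, the equality of the sums of squares and the single finite coefficient for a flat area, can be checked numerically. The sketch uses the orthonormal textbook DCT, which is an assumption: with that scaling the DC coefficient is 8 times the block mean, so the value 149 shown above corresponds to a scaling in which DC equals the mean. The test pattern is arbitrary.

```python
import math

N = 8

def dct_2d(block):
    """Orthonormal 8x8 DCT-II (a common textbook normalisation)."""
    scale = [math.sqrt(1 / N)] + [math.sqrt(2 / N)] * (N - 1)
    basis = [[math.cos((2 * x + 1) * u * math.pi / (2 * N)) for x in range(N)]
             for u in range(N)]
    return [[scale[u] * scale[v] * sum(block[y][x] * basis[u][x] * basis[v][y]
                                       for y in range(N) for x in range(N))
             for u in range(N)]
            for v in range(N)]

def energy(block):
    """Sum of the squares of all 64 values."""
    return sum(value * value for row in block for value in row)

# Arbitrary test pattern (illustrative values only).
pixels = [[100 + (3 * x + 5 * y) % 17 for x in range(N)] for y in range(N)]
coeffs = dct_2d(pixels)
print(energy(pixels), round(energy(coeffs), 6))

# Flat area, as in the green bar: only the DC coefficient is non-zero.
flat = dct_2d([[149] * N for _ in range(N)])
print(round(flat[0][0] / N, 6))  # DC divided by 8 is the block mean
```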
But there is a problem!
When there is horizontal movement between the fields
in a frame, large vertical detail can be generated.
Vertical detail

F1 F2

No movement between fields Movement between fields

Large vertical high-frequency DCT coefficients are generated if the image moves horizontally.
When significant movement is detected between the fields,
each 8x8 pixel block is broken into two 8x4 blocks as follows -

Original 8x8 pixel block     Modified 8x8 pixel block
(rows)                       (rows)

A                            A+B
B                            C+D
C                            E+F
D                            G+H
E                            A-B
F                            C-D
G                            E-F
H                            G-H

The DCT transformation is then performed over the entire modified 8x8 pixel block.
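The row rearrangement in the table can be sketched as below, together with its inverse to show that nothing is lost at this stage. The function names are hypothetical and the test block is made up: alternate lines carry different content, as happens with movement between fields.

```python
def to_2_4_8(block):
    """Rows A..H -> sums of adjacent line pairs (top half),
    then their differences (bottom half)."""
    pairs = [(block[i], block[i + 1]) for i in range(0, 8, 2)]
    sums = [[a + b for a, b in zip(top, bottom)] for top, bottom in pairs]
    diffs = [[a - b for a, b in zip(top, bottom)] for top, bottom in pairs]
    return sums + diffs

def from_2_4_8(modified):
    """Recover the original rows: A = (sum + diff) / 2, B = (sum - diff) / 2."""
    rows = []
    for s_row, d_row in zip(modified[:4], modified[4:]):
        rows.append([(s + d) // 2 for s, d in zip(s_row, d_row)])
        rows.append([(s - d) // 2 for s, d in zip(s_row, d_row)])
    return rows

# Made-up interlaced block: even lines flat, odd lines a ramp.
block = [[100] * 8 if y % 2 == 0 else [20 + x for x in range(8)] for y in range(8)]
modified = to_2_4_8(block)
print(modified[0], modified[4])
```

The sums collect what the two fields have in common, and the differences collect the inter-field detail.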
When there is no movement between the fields the processing
mode is called -

8x8 DCT Mode

When there is ‘significant’ movement between the fields the processing mode is called -

2-4-8 DCT Mode

This information is transmitted with the coefficient data so that the pixel values can be reconstructed correctly in the decoder.
Up to this point
all the processes are
completely reversible.
Quantization
(scaling the coefficients)

This is NOT completely reversible.

Causes compression artefacts.


Quantization is the scaling of the AC coefficients in the DCT Blocks.

This is done to reduce the values of the coefficients, because lower values (most probable) are transmitted with a small number of bits and higher values (least probable) are transmitted with a larger number of bits (called Variable Length Coding). So reducing the values of the coefficients reduces the total number of bits representing the data in the DCT block.

Quantization is done using a Quantization Table, which is chosen according to the absolute magnitude of the largest AC coefficient and the visibility of the errors after quantization has been performed.

There are 16 Quantization Tables in the DV format.


The numbers in blue come from the quantization table for this DCT block.
Note that the DC coefficient is not quantized.

Also note that the quantization steps for the high spatial-frequency components are larger than the steps for low spatial-frequency components.
Quantized Coefficients

Note that the values are truncated (e.g. 37/2 = 18)


At the decoder the quantized coefficients in each DCT block are
multiplied by the same quantization steps as used in the encoder.

Because of truncation, some re-quantized coefficients will have errors.

Compression artefacts (or ‘mosquitos’) are the result of these errors.

The high frequency coefficients have the greatest errors because the
quantization steps are more severe for high frequency coefficients
than for low frequency coefficients.
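The truncation error can be seen in a few lines. The step sizes below are made-up values for illustration, not entries from a DV Quantization Table, and truncation toward zero for negative coefficients is an assumption.

```python
def quantize(coefficient, step):
    """Divide by the quantization step and truncate (toward zero, assumed)."""
    return int(coefficient / step)

def requantize(quantized, step):
    """Decoder side: multiply by the same quantization step."""
    return quantized * step

# (coefficient, step) pairs; steps are illustrative only.
for coefficient, step in [(37, 2), (36, 2), (-21, 4), (7, 8)]:
    q = quantize(coefficient, step)
    restored = requantize(q, step)
    print(coefficient, step, q, restored, coefficient - restored)
```

A coefficient smaller than its step is lost entirely, which is why the larger steps applied to high frequencies produce the largest errors.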
Original Coefficient Values Re-quantized Coefficient Values

The numbers in red indicate which of the re-quantized coefficients have errors.
Zig-Zag Scan
Advantage can be taken of the large number of zeros in the
quantized DCT coefficient block.

By scanning the block in a zig-zag pattern the sequence of zero coefficients can be encoded more efficiently.

This is called Run-Length encoding.


Zig-Zag Scan

120   83   59   18    6    2    0    0
 97   75  -21   11    3    2    0    0
 43   15    8    7   -2    0    0    0
 11    5    4    2    1    0    0    0
  3    3    2   -1    0    0    0    0
  1    1    0    0    0    0    0    0
  0    0    0    0    0    0    0    0
  0    0    0    0    0    0    0    0

120, 83, 97, 43, 75, 59, 18, -21, 15, 11, 3, 5, 8, 11, 6, 2, 3, 7, 4, 3, 1, 0, 1, 2, 2,
-2, 2, 0, 0, 0, 0, 1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0
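The scan order can be generated by walking the anti-diagonals of the block in alternating directions. The sketch below applies it to the coefficient block above; the function name is hypothetical.

```python
def zigzag_order(n=8):
    """(row, column) positions of an n x n block in zig-zag order."""
    order = []
    for d in range(2 * n - 1):
        cells = [(r, d - r) for r in range(n) if 0 <= d - r < n]
        # Even diagonals run from bottom-left to top-right, odd ones the other way.
        order.extend(reversed(cells) if d % 2 == 0 else cells)
    return order

BLOCK = [
    [120, 83,  59, 18,  6, 2, 0, 0],
    [ 97, 75, -21, 11,  3, 2, 0, 0],
    [ 43, 15,   8,  7, -2, 0, 0, 0],
    [ 11,  5,   4,  2,  1, 0, 0, 0],
    [  3,  3,   2, -1,  0, 0, 0, 0],
    [  1,  1,   0,  0,  0, 0, 0, 0],
    [  0,  0,   0,  0,  0, 0, 0, 0],
    [  0,  0,   0,  0,  0, 0, 0, 0],
]

scan = [BLOCK[r][c] for r, c in zigzag_order()]
print(scan)
```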
In the DV system, the binary code word representing a ‘run’ is
determined by the number of repeated zero coefficients and the
absolute amplitude of the coefficient immediately following the
‘run’.

Statistically the amplitude of a coefficient at the end of a ‘run’ is more likely to be small rather than large, so the lengths of the binary code words representing the ‘runs’ are chosen to have fewer bits for small amplitudes and more bits for large amplitudes.

This is called Modified 2-D Huffman coding.


In the previous case the 64 DCT coefficients are reduced to 31
codewords.

120, 83, 97, 43, 75, 59, 18, -21, 15, 11, 3, 5, 8, 11, 6, 2, 3, 7, 4, 3, 1,
0, 1,
2, 2, -2, 2,
0, 0, 0, 0, 1,
-1,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0

The numbers in red are represented by one binary code word only.
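The grouping of each run of zeros with the coefficient that follows it can be sketched as below. This only forms the (run, amplitude) pairs. It does not reproduce the DV code word table or its bit lengths, and treating the trailing zeros as a single end-of-block condition is an assumption of the sketch.

```python
def run_length_pairs(scan):
    """Return the DC value and a list of (zero_run, amplitude) pairs for the
    AC coefficients. Trailing zeros are dropped; an end-of-block marker
    would take their place."""
    dc, ac = scan[0], scan[1:]
    pairs, run = [], 0
    for value in ac:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    return dc, pairs

scan = [120, 83, 97, 43, 75, 59, 18, -21, 15, 11, 3, 5, 8, 11, 6, 2, 3, 7, 4, 3, 1,
        0, 1, 2, 2, -2, 2, 0, 0, 0, 0, 1, -1] + [0] * 31

dc, pairs = run_length_pairs(scan)
print(dc)
print(pairs)
```

In the output, the pair (1, 1) stands for "0, 1" and the pair (4, 1) stands for "0, 0, 0, 0, 1", each of which maps to one code word.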
Macro Blocks, Video Segments
and
Super Blocks
Because there is limited data-space on the tape, it is necessary to
make sure that the space is used efficiently.

Some areas of the TV frame will have few high frequency components so the DCT blocks representing those areas will have few finite AC coefficients.

If each DCT block was allocated a fixed number of bytes on the tape then much of the data-space would be wasted.

The following techniques are used to make sure that the space is
used as efficiently as possible -
Macro Blocking

Cb
Cr
Y

625/50 Macro Block

One Macro Block = Four Y DCT Blocks
+ One Cb DCT Block
+ One Cr DCT Block
(6480 Y DCT Blocks / 4 = 1620 Macro Blocks = one TV Frame)
Video Segment

TV Frame
One Video Segment = Five pseudo-randomly selected Macro Blocks

(1620 Macro Blocks / 5 = 324 Video Segments/TV Frame)


One Video Segment is compressed into 5 x 77 = 385 bytes = 3080 bits

[ 77 bytes | 77 bytes | 77 bytes | 77 bytes | 77 bytes ]

77 bytes = One Compressed Macro Block = the data payload of a Video Sync Block

Excess data from one Compressed Macro Block is passed to vacant spaces in other blocks within the same video segment.
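The counts for a 625/50 frame can be cross-checked against the video bit rate in the basic parameters. This is a worked calculation; it starts from the 6480 Y DCT blocks found at the blocking stage and assumes 25 frames per second.

```python
Y_BLOCKS_PER_FRAME = 6480      # 625/50, from the blocking stage
Y_BLOCKS_PER_MACRO_BLOCK = 4
MACRO_BLOCKS_PER_SEGMENT = 5
BYTES_PER_MACRO_BLOCK = 77     # payload of one Video Sync Block
FRAMES_PER_SECOND = 25

macro_blocks = Y_BLOCKS_PER_FRAME // Y_BLOCKS_PER_MACRO_BLOCK
video_segments = macro_blocks // MACRO_BLOCKS_PER_SEGMENT
bits_per_segment = MACRO_BLOCKS_PER_SEGMENT * BYTES_PER_MACRO_BLOCK * 8
video_bit_rate = macro_blocks * BYTES_PER_MACRO_BLOCK * 8 * FRAMES_PER_SECOND

print(macro_blocks, video_segments, bits_per_segment, video_bit_rate / 1e6)
```

The result agrees with the 24.948 Mb/s video bit rate listed under the electrical parameters.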
Super Blocks

[Figure: Compressed Macro Blocks within one Super Block of the TV Frame]

0   5   6  11  12  17  18  23  24
1   4   7  10  13  16  19  22  25
2   3   8   9  14  15  20  21  26

One Super Block = 27 Compressed Macro Blocks

Numbers indicate the order of transmission
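The numbering in the figure runs down the first column, up the second, down the third, and so on. A sketch that generates the same 9 x 3 arrangement is given below; the function name is hypothetical, and only the layout shown in the figure is covered.

```python
def super_block_order(rows=3, columns=9):
    """Grid of transmission order numbers, column by column,
    alternating downwards and upwards."""
    grid = [[None] * columns for _ in range(rows)]
    number = 0
    for column in range(columns):
        row_sequence = range(rows) if column % 2 == 0 else reversed(range(rows))
        for row in row_sequence:
            grid[row][column] = number
            number += 1
    return grid

grid = super_block_order()
for row in grid:
    print(" ".join(f"{n:2d}" for n in row))
```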


Track Distribution of Super Blocks
[Figure: track distribution of Super Blocks. Track Numbers 0 to 11 are listed in order of recording, starting from the top of the picture. Track 0 carries Super Block (0,0), Super Block (0,1), ...; track 1 carries Super Block (1,0), Super Block (1,1), ... Each track runs from the bottom edge of the tape to the top edge of the tape in the direction of head motion.]
Effect of a Head Clog

DVCAM Digital Betacam


Clogged DVCAM Heads
Effect of a Dirty Head
Effect of Poor Tracking
Effect of Low Tape tension
Dirty capstan = poor tape handling = poor tracking
Dirty drum = poor recording and poor playback
Dirty guide = poor tracking
Worn pinch roller
Customer’s DSR-20P deck!
The Cassette Contacts
Some cassettes have contacts on the rear edge of the shell -

Cassettes with contacts may or may not contain a memory chip.


Cassettes With Memory
Function of Contacts

1 2 3 4

Contact 1 VDD (2.7 to 5.5V)


Contact 2 SDA (Serial Data In/Out)
Contact 3 SCK (Serial Data Clock)
Contact 4 GND
Cassettes With Memory
Function of Memory

The memory chip inside a Sony cassette has a capacity of 16kbit (2kbytes).

The memory area from bytes 0 to 15 is called the ‘Main Area’.

The remainder of the memory area is called the ‘Optional Area’.
Cassettes With Memory

[Figure: memory map. Main Area, Byte 0 to Byte 15, with fields labelled APM, BCID, Cassette ID and Tape Length. Optional Area, holding ClipLink and TC data.]

APM: Application of Memory = 111 for a new cassette

BCID: Basic Cassette Identification -
Tape grade (consumer/non-consumer)
Tape type (metal evaporated/metal particle)
Tape thickness (7 micron/other)
VCR/non-VCR
Cassettes Without Memory
Basic cassette information is identified by resistors connected between Pin 4 and Pins 1, 2 and 3 -

Pins         Resistance       Meaning
Pin 4 - 1    Open circuit     7 micron tape
             1.8 kohm         Other
Pin 4 - 2    Open circuit     Metal evaporated tape
             1.8 kohm         Cleaning cassette
             Short circuit    Metal particle tape
Pin 4 - 3    Open circuit     Consumer grade tape
             6.8 kohm         Non-consumer grade tape
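The table can be read as a simple lookup. The sketch below is illustrative only: the key names and the idea of passing in already-classified readings ('open', 'short', '1.8k', '6.8k') are assumptions, and no real measurement thresholds are implied.

```python
# Table transcribed from the slide; keys are the pin pair and the classified reading.
CASSETTE_ID = {
    "4-1": {"open": "7 micron tape", "1.8k": "other thickness"},
    "4-2": {"open": "metal evaporated tape", "1.8k": "cleaning cassette",
            "short": "metal particle tape"},
    "4-3": {"open": "consumer grade tape", "6.8k": "non-consumer grade tape"},
}

def identify_cassette(readings):
    """readings maps a pin pair to 'open', 'short', '1.8k' or '6.8k'.
    A missing pin pair is treated as open circuit."""
    result = {}
    for pins, meanings in CASSETTE_ID.items():
        state = readings.get(pins, "open")
        if state not in meanings:
            raise ValueError(f"unexpected reading {state!r} on pins {pins}")
        result[pins] = meanings[state]
    return result

print(identify_cassette({"4-2": "short", "4-3": "6.8k"}))
print(identify_cassette({}))  # every pin pair open circuit
```

With every pin pair open circuit the lookup returns 7 micron, metal evaporated, consumer grade tape.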
Cassettes Without Contacts

These are read as 7 micron/Metal Evaporated/Consumer tapes.


DVCAM Heads
DVCAM Record Process

[Figure: block diagram of the record path]

Video inputs: Analogue (S Video, Composite, Component) and Digital (SDI, QSDI)

Video path: Input Select -> 4:2:2 -> Decimation Filter -> 4:2:0 -> Video Compress -> Outer ECC -> Inner ECC -> 24-25 Bit Modulator -> SNRZI Encoder -> Tape

Audio inputs: Analogue (Ch-1, Ch-2, Ch-3, Ch-4) and Digital (Ch-1/2, Ch-3/4)

Audio path: Input Select -> Audio Outer ECC

DV i.LINK Interface
DVCAM Playback Process

[Figure: block diagram of the playback path]

Video path: Tape -> Viterbi Decoder -> Inner Error Detect -> Outer Error Detect -> Video De-Compress -> 4:2:0 -> Interpolator -> 4:2:2 -> Video Output Process

Video outputs: Analogue (S Video, Composite, Component) and Digital (SDI, QSDI)

Audio path: Outer Error Detect -> Audio Output Process

Audio outputs: Analogue (Ch-1, Ch-2, Ch-3, Ch-4) and Digital (Ch-1/2, Ch-3/4)

DV i.LINK Interface
THE
XH2-1AST
TRACKING ALIGNMENT TAPE
[Figure: track pattern of the alignment tape. Labels: Tracking information; No tracking information; EVEN HEAD; ODD HEAD; ODD TRACK; EVEN TRACK; EVEN; ODD.]
DVCAM EQUALISATION
ADJUSTMENTS
DVCAM EQUALISATION ADJUSTMENTS

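[Figure: block diagram of the playback equaliser chain. Blocks shown: RF from PB head, AGC, Phase Equaliser, Cosine Equaliser, ADC, Viterbi Decoder, PLL, with Data Out and Clock outputs. Adjustments shown: Phase, Cos, AGC, Delay and PLL Delay. Two inset plots: Phase against Frequency and Amplitude against Frequency.]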
