Initially a PSTN operated with analogue signals throughout, the source speech signal being transmitted and switched
However, today these have been replaced with digital circuits. In order to support interworking of the analogue and digital circuits, the design of the digital equipment is based on the analogue network operating parameters. The bandwidth of a speech circuit was limited to 200 Hz to 3.4 kHz. The digitization procedure is known as pulse code modulation (PCM), a digital scheme for transmitting analogue data.
PCM Speech Compressor/Expander Characteristics
In linear quantization the same level of quantization noise is produced irrespective of the signal amplitude (the noise level is the same for quiet signals and loud signals).
To help reduce the effect of quantization noise with just 8 bits per sample, a PCM system includes two additional circuits: a compressor (at the encoder) and an expander (at the decoder). The quantization intervals are made non-linear, with narrower intervals for small-amplitude signals than for larger-amplitude signals. This is achieved by means of the compressor circuit. The analogue output from the DAC is passed to the expander circuit, which performs the reverse operation of the compressor circuit.
The overall operation is known as companding. The compression and expansion characteristic is known as A-law in Europe.
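The effect of companding can be illustrated with a minimal Python sketch. It assumes the continuous A-law curve with A = 87.6 and a simple uniform 8-bit quantizer; the function names are illustrative and the real PCM codec uses a segmented approximation of this curve rather than the formula directly.

```python
import math

A = 87.6  # assumed A-law parameter; not stated in the notes above


def compress(x: float) -> float:
    """A-law compressor for a normalized sample x in [-1, 1]."""
    ax = abs(x)
    if ax < 1.0 / A:
        y = A * ax / (1.0 + math.log(A))
    else:
        y = (1.0 + math.log(A * ax)) / (1.0 + math.log(A))
    return math.copysign(y, x)


def expand(y: float) -> float:
    """A-law expander: the inverse of compress()."""
    ay = abs(y)
    if ay < 1.0 / (1.0 + math.log(A)):
        x = ay * (1.0 + math.log(A)) / A
    else:
        x = math.exp(ay * (1.0 + math.log(A)) - 1.0) / A
    return math.copysign(x, y)


def quantize_uniform(v: float, bits: int = 8) -> float:
    """Uniform (linear) quantizer on [-1, 1] with 2**(bits-1) - 1 steps per side."""
    steps = 2 ** (bits - 1) - 1
    return round(v * steps) / steps


quiet = 0.01  # a small-amplitude sample
linear_error = abs(quantize_uniform(quiet) - quiet)
companded_error = abs(expand(quantize_uniform(compress(quiet))) - quiet)
print(linear_error, companded_error)
```

For the quiet sample the companded path gives a smaller quantization error than the linear path, which is the point of using narrower intervals at small amplitudes.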
Synthesized audio
Synthesized audio is often used since the amount of memory required can be two to three orders of magnitude less than that required to store the equivalent digitized waveform version.
The three main components of an audio synthesizer are the computer (with various application programs), the keyboard (based on that of a piano) and the set of sound generators. The computer takes input commands from the keyboard and outputs these to the sound generators, which in turn produce the corresponding sound waveform via DACs to drive the speakers.
Pressing a key has a similar effect to pressing a key on a computer keyboard. For each key press a different codeword (a message indicating the key pressed and the pressure applied) is generated.
The control panel contains a range of different switches and sliders that collectively allow the user to indicate to the program information such as the volume of the generated output and the selected sound effects to be associated with each key. To discriminate between the inputs from the different possible sources, a standard set of messages (together with the type of connectors, cables, electrical signals, etc.) has been defined: the Musical Instrument Digital Interface (MIDI).
Status byte - This defines the particular event that has caused the message to be generated.
Data bytes - These collectively define a set of parameters (pressure applied, identity of the key) associated with the event. An example of an event is a key being pressed.
It is important to identify the different types of instruments that generated the events. Each instrument has a MIDI code associated with it, e.g. piano has a code of 0 and violin 40.
Since the music is in the form of MIDI messages, it is vital to have a sound card in the client computer to interpret the sequence.
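The status byte and data byte layout can be sketched in Python as follows. The status values 0x90 (Note On) and 0xC0 (Program Change) are taken from the MIDI 1.0 channel message format; the helper names are hypothetical and the sketch only builds the raw bytes, it does not talk to a sound card.

```python
NOTE_ON = 0x90         # status: a key has been pressed
PROGRAM_CHANGE = 0xC0  # status: select the instrument for a channel

PIANO = 0    # instrument codes quoted in the text above
VIOLIN = 40


def note_on(channel: int, key: int, velocity: int) -> bytes:
    """Status byte followed by two data bytes: key identity and pressure applied."""
    if not (0 <= channel <= 15 and 0 <= key <= 127 and 0 <= velocity <= 127):
        raise ValueError("channel must be 0-15, key and velocity 0-127")
    return bytes([NOTE_ON | channel, key, velocity])


def program_change(channel: int, instrument: int) -> bytes:
    """Status byte followed by one data byte: the instrument code."""
    if not (0 <= channel <= 15 and 0 <= instrument <= 127):
        raise ValueError("channel must be 0-15, instrument 0-127")
    return bytes([PROGRAM_CHANGE | channel, instrument])


# Select a violin on channel 0, then press middle C (key 60) fairly hard.
message = program_change(0, VIOLIN) + note_on(0, 60, 100)
print(message.hex(" "))
```

The low four bits of the status byte carry the channel number, which is how messages from different sources are told apart.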
MIDI
The three main properties of a colour source that the eye makes use of are:
- Brightness: represents the amount of energy that
Colour Signals
Luminance is used to refer to the brightness of a source, and hue and saturation (which are concerned with its colour) are referred to as its chrominance characteristics.
The combination of the three signals Y (amplitude of the luminance signal), Cb (blue chrominance) and Cr (red chrominance) contains all the necessary information to describe a colour signal.
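As an illustration of how the three signals relate to R, G and B, the sketch below assumes the commonly used luminance weights (0.299, 0.587, 0.114) and takes the chrominance signals as the plain colour differences B - Y and R - Y. Broadcast systems apply additional scaling to these differences, which is omitted here.

```python
def rgb_to_ycbcr(r: float, g: float, b: float) -> tuple:
    """Convert normalized R, G, B (0.0 to 1.0) to luminance and two colour differences."""
    y = 0.299 * r + 0.587 * g + 0.114 * b  # assumed standard luminance weights
    cb = b - y                             # blue chrominance (unscaled)
    cr = r - y                             # red chrominance (unscaled)
    return y, cb, cr


def ycbcr_to_rgb(y: float, cb: float, cr: float) -> tuple:
    """Recover R, G, B, showing that Y, Cb and Cr carry all the colour information."""
    r = cr + y
    b = cb + y
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b


print(rgb_to_ycbcr(1.0, 1.0, 1.0))  # white: full luminance, no chrominance
print(rgb_to_ycbcr(0.0, 0.0, 1.0))  # pure blue
```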
In NTSC the eye is more responsive to the I signal than the Q signal. Hence, to make the most of the available bandwidth while minimizing the level of interference with the luminance signal, the I signal is given a bandwidth of 2 MHz and the Q signal a bandwidth of 1 MHz.
In PAL, the larger luminance bandwidth allows both the U and V chrominance signals to have the same modulated bandwidth.
The other two systems, NTSC and PAL, transmit both chrominance components simultaneously using a technique known as quadrature amplitude modulation (QAM).
Digital Video
With digital television it is more usual to digitize the three component signals separately prior to their transmission, to enable editing and other operations to be readily performed. Since the eye is less sensitive to colour than it is to luminance, a significant saving in the resulting bit rate can be achieved by using the luminance and two colour difference signals instead of the RGB signals directly. Digitization formats exploit the fact that the two chrominance signals can tolerate a reduced resolution relative to that used for the luminance signal.
One way is to sample the chrominance components at every other pixel, known as the 4:2:2 sampling structure.
This reduces the chrominance resolution in the horizontal dimension only, leaving the vertical resolution unaffected. The ratio 4:2:2 (Y:Cb:Cr) indicates that both Cb and Cr are sampled at half the rate of the luminance signal.
The bandwidth is up to 6 MHz for the luminance signal and less than half this for the chrominance signals.
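The 4:2:2 structure can be sketched in a few lines of Python. The frame is represented as a list of rows, and the function name and the tiny 4 x 2 frame are purely illustrative.

```python
def subsample_422(chroma_rows):
    """Keep every other chrominance sample along each row (horizontal only)."""
    return [row[::2] for row in chroma_rows]


def sample_count(rows):
    """Total number of samples held in a list of rows."""
    return sum(len(row) for row in rows)


# A tiny 4 x 2 frame: Y keeps full resolution, Cb and Cr are halved horizontally.
y_plane = [[16, 17, 18, 19], [20, 21, 22, 23]]
cb_plane = [[1, 2, 3, 4], [5, 6, 7, 8]]
cr_plane = [[9, 8, 7, 6], [5, 4, 3, 2]]

cb_422 = subsample_422(cb_plane)
cr_422 = subsample_422(cr_plane)

full = 3 * sample_count(y_plane)
reduced = sample_count(y_plane) + sample_count(cb_422) + sample_count(cr_422)
print(full, reduced)
```

The number of rows is unchanged, so the vertical chrominance resolution is unaffected, while the total sample count drops to two thirds of the full-resolution figure.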
4:2:0 Format
It is a derivative of the 4:2:2 format and is used in digital video broadcast applications (achieving good picture quality)
Digital Processing
Logic Gates
- A logic gate is a device whose output depends on the combination of its inputs
- For instance, an AND gate produces a logic 1 (high) output if and only if all its inputs are high
Shift Registers
A shift register is a temporary store of data, which may then be sent out in serial or parallel form.
[Figure: SISO shift register with stages b0 to b7, serial data in and serial data out]
When the register is full, the stored data may then be clocked out serially, bit by bit. This type of register is called a serial-in-serial-out (SISO) shift register. The other types of register are serial-in-parallel-out (SIPO) and parallel-in-serial-out (PISO).
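A SISO register can be modelled in software as a fixed-length queue. The sketch below is a behavioural illustration only, assuming an 8-stage register; the class name is hypothetical.

```python
from collections import deque


class SISOShiftRegister:
    """Behavioural model of a serial-in-serial-out shift register."""

    def __init__(self, size: int = 8):
        # All stages start at 0; maxlen makes the deque drop the oldest bit.
        self.stages = deque([0] * size, maxlen=size)

    def clock(self, bit_in: int) -> int:
        """One clock pulse: shift every stage along, return the bit that falls out."""
        bit_out = self.stages[-1]
        self.stages.appendleft(bit_in)
        return bit_out


register = SISOShiftRegister(8)
data = [1, 0, 1, 1, 0, 0, 1, 0]

# Fill the register serially, one bit per clock pulse.
for bit in data:
    register.clock(bit)

# Clock the stored data out serially (shifting zeros in behind it).
clocked_out = [register.clock(0) for _ in range(8)]
print(clocked_out)
```

The bits come out in the same order in which they were clocked in, eight clock pulses later.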
Multiplexing
Communication invariably involves transmitting several programmes via the same communication media, such as cable, satellite or terrestrial links
In time division multiplexing (TDM) the programmes are transmitted sequentially. Each programme is allocated a time slot during which the whole of the bandwidth of the medium is made available to it. At the receiving end the transmitted data is demultiplexed to obtain the required programme.
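The time slot idea can be sketched in Python as a simple interleaver. This assumes every programme supplies exactly one data unit per frame; the function names are illustrative.

```python
def tdm_multiplex(programmes):
    """Interleave the programmes: each frame holds one slot per programme, in order."""
    return [unit for frame in zip(*programmes) for unit in frame]


def tdm_demultiplex(stream, n_programmes, index):
    """Recover one programme by picking its slot out of every frame."""
    return stream[index::n_programmes]


programmes = [
    ["A1", "A2", "A3"],
    ["B1", "B2", "B3"],
    ["C1", "C2", "C3"],
]

stream = tdm_multiplex(programmes)
print(stream)
print(tdm_demultiplex(stream, 3, 1))
```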
TDM is most efficient if all programmes carry the same amount of data. If they do not, i.e. if the traffic is uneven, some time slots will be underutilized while other time slots may not be able to handle the data stream.
Statistical multiplexing
In this technique the allocation of time slots is based on the amount of traffic each programme generates, so that time slots are allocated according to need. Programmes that generate heavy traffic are allocated more time slots, while those with lighter traffic are allocated fewer time slots.
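One possible way to allocate slots according to need is to share them in proportion to each programme's pending traffic. The sketch below uses a largest-remainder rule, which is an assumption made for illustration and not a method prescribed by any particular standard; the programme names and traffic figures are made up.

```python
def allocate_slots(traffic, slots_per_frame):
    """Share the slots of one frame in proportion to each programme's traffic."""
    total = sum(traffic.values())
    if total == 0:
        return {name: 0 for name in traffic}

    # Ideal (fractional) share of the frame for each programme.
    shares = {name: amount * slots_per_frame / total for name, amount in traffic.items()}

    # Give every programme the whole part of its share first.
    allocation = {name: int(share) for name, share in shares.items()}

    # Hand out any leftover slots to the largest fractional remainders.
    leftover = slots_per_frame - sum(allocation.values())
    by_remainder = sorted(shares, key=lambda n: shares[n] - allocation[n], reverse=True)
    for name in by_remainder[:leftover]:
        allocation[name] += 1
    return allocation


traffic = {"news": 10, "sport": 30, "film": 60}  # hypothetical units of queued data
print(allocate_slots(traffic, 10))
```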
In all types of communication system, errors may be minimized but they cannot be avoided completely, hence the need for error correction techniques. If an error is detected at the receiving end, it can be corrected in two different ways:
- the recipient can ask the original transmitter to repeat the transmission
- or the recipient can attempt to correct the errors without any further information from the transmitter
Whenever possible, communication systems tend to go for retransmission. However, if the distances are large, perhaps to contact a space probe, or if real-time signals are involved, then retransmission is not an option. These cases require error correction techniques.
[Figure: example data words with their parity bits, including an odd parity example]
Even parity is when the complete coded data, including the parity bit, contains an even number of 1s. Odd parity is when the complete coded data contains an odd number of 1s. At the receiver the number of 1s is counted and checked against the parity bit. A difference indicates an error.
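The parity rule can be sketched in Python as follows. The function names are illustrative, and the data word is an arbitrary example.

```python
def parity_bit(data_bits, even=True):
    """Return the parity bit that makes the total number of 1s even (or odd)."""
    bit_for_even = sum(data_bits) % 2
    return bit_for_even if even else 1 - bit_for_even


def parity_ok(coded_bits, even=True):
    """Receiver check: count the 1s in the complete coded data."""
    return sum(coded_bits) % 2 == (0 if even else 1)


data = [1, 0, 1, 1]
sent = data + [parity_bit(data, even=True)]
print(sent, parity_ok(sent))

# Corrupt a single bit in transit: the count of 1s no longer matches.
received = sent.copy()
received[2] ^= 1
print(received, parity_ok(received))
```

A single parity bit detects any odd number of bit errors but cannot tell which bit is wrong, and an even number of errors goes unnoticed.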
Suppose the invalid code word 011 is received; it can be corrected because it was most likely intended to be 010, which differs from it in only one bit. It could instead have been a different valid code word with two bits corrupted, but that is less likely.
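This "most likely" reasoning amounts to choosing the valid code word closest to what was received. The sketch below assumes a code with just two valid words, 010 and 101; that codebook is a hypothetical choice made only so the example above can be run, since the notes do not list the valid code words.

```python
VALID_CODE_WORDS = ["010", "101"]  # assumed codebook, for illustration only


def hamming_distance(a: str, b: str) -> int:
    """Number of bit positions in which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))


def correct(received: str) -> str:
    """Pick the valid code word that needs the fewest bit changes."""
    return min(VALID_CODE_WORDS, key=lambda word: hamming_distance(word, received))


print(correct("011"))
print({word: hamming_distance(word, "011") for word in VALID_CODE_WORDS})
```

With this codebook, 011 is one bit away from 010 and two bits away from 101, so the single-error explanation is preferred.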
The Joint Photographic Experts Group (JPEG) standard forms the basis of most video compression algorithms.
For a monochrome image, a single 2-D matrix is required to store the set of 8-bit grey-level values that represent the image.
For a colour image, if a CLUT is used then a single matrix of values is required. If the image is represented in R, G, B format then three matrices are required. If the Y, Cr, Cb format is used then the matrix size for the chrominance components is smaller than the Y matrix (reduced representation).
Once the source image format has been selected and prepared (from the four alternative forms of representation), the set of values in each matrix is compressed separately using the DCT.
Since the values in all the other locations of the transformed matrix (that is, all except the DC coefficient) have a frequency coefficient associated with them, they are known as AC coefficients.
There is very little loss of information during the DCT phase itself. The losses that do occur are due to the use of fixed-point arithmetic.
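The forward transform can be sketched directly from the standard 2-D DCT definition for an 8 x 8 block. This is a plain, unoptimized Python version using floating-point arithmetic (so it does not show the fixed-point losses mentioned above); the function name is illustrative.

```python
import math

N = 8  # block size


def dct_2d(block):
    """Forward 2-D DCT of an N x N block of (level-shifted) pixel values."""

    def c(k):
        # Scaling term: 1/sqrt(2) for the zero-frequency index, 1 otherwise.
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for i in range(N):
        for j in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * i * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * j * math.pi / (2 * N)))
            out[i][j] = (2.0 / N) * c(i) * c(j) * total
    return out


# A completely flat block has only a DC coefficient; every AC coefficient is zero.
flat = [[10] * N for _ in range(N)]
coeffs = dct_2d(flat)
print(round(coeffs[0][0], 6), round(coeffs[0][1], 6), round(coeffs[3][5], 6))
```

Every output value depends on all 64 input values, which is why the image is processed one small block at a time.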
The main source of information loss is the quantization stage; the quantization and entropy encoding stages are where the compression takes place. The human eye responds primarily to the DC coefficient and the lower frequency coefficients (higher frequency coefficients below a certain threshold will not be detected by the human eye).
This property is exploited by dropping those spatial frequency coefficients in the transformed matrix whose amplitudes are below the threshold (dropped coefficients cannot be retrieved during decoding).
In addition to classifying the spatial frequency components, the quantization process aims to reduce the size of the DC and AC coefficients, by dividing each by a divisor, so that less bandwidth is required for their transmission.
The sensitivity of the eye varies with spatial frequency, and hence the amplitude threshold below which the eye will not detect a particular frequency also varies. The threshold values vary for each of the 64 DCT coefficients. They are held in a 2-D matrix known as the quantization table, with the threshold value to be used with a particular DCT coefficient held in the corresponding position in the matrix.
The choice of threshold value is a compromise between the level of compression that is required and the resulting amount of information loss that is acceptable.
The JPEG standard includes two default quantization tables, one for the luminance coefficients and one for the chrominance coefficients. However, customized tables are allowed and can be sent with the compressed image.
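The division by a table of thresholds can be sketched as below. The 2 x 2 corner of coefficients and the table values are made up for illustration and are not the JPEG default tables; a real block and table are 8 x 8.

```python
def quantize(coeffs, table):
    """Divide each DCT coefficient by its table entry and round to an integer."""
    return [[round(c / q) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coeffs, table)]


def dequantize(levels, table):
    """Decoder side: multiply back by the same table entries."""
    return [[v * q for v, q in zip(v_row, q_row)]
            for v_row, q_row in zip(levels, table)]


coeffs = [[120, 57],
          [-33, 6]]   # hypothetical DCT coefficients
table = [[10, 10],
         [15, 20]]    # hypothetical quantization thresholds

levels = quantize(coeffs, table)
restored = dequantize(levels, table)
print(levels)
print(restored)
```

The coefficient 6 is below its threshold of 20, so it quantizes to zero and cannot be recovered; the others come back only approximately, which is where the information loss arises.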
Vectoring
The entropy encoding stage operates on a one-dimensional string of values (a vector). However, the output of the quantization stage is a 2-D matrix, and hence this has to be represented in 1-D form. This is known as vectoring.
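A common way to carry out the vectoring is a zig-zag scan along the anti-diagonals of the matrix, starting at the DC coefficient, so that low-frequency values come first and the zeros cluster at the end. The sketch below works for any square matrix; a 3 x 3 example is used only to keep the output short.

```python
def zigzag(matrix):
    """Read a square matrix along its anti-diagonals, alternating direction."""
    n = len(matrix)
    positions = [(r, c) for r in range(n) for c in range(n)]

    def order(pos):
        r, c = pos
        diagonal = r + c
        # Odd diagonals are walked downwards (row increasing), even ones upwards.
        return (diagonal, r if diagonal % 2 else -r)

    return [matrix[r][c] for r, c in sorted(positions, key=order)]


quantized = [
    [12, 6, 0],
    [-2, 0, 0],
    [1, 0, 0],
]
print(zigzag(quantized))
```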
The run-length encoded AC coefficients form a string of pairs, in which the first field is the number of zeros in the run and value is the next non-zero coefficient
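The pairing of a zero run with the next non-zero coefficient can be sketched as follows. The (0, 0) pair used to mark the end of the block follows the usual JPEG convention; the limit on run length that a real encoder enforces is ignored here, and the function name is illustrative.

```python
def run_length_encode(ac_coefficients):
    """Encode AC coefficients as (number of zeros in the run, next non-zero value)."""
    pairs = []
    run = 0
    for value in ac_coefficients:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    # Any trailing zeros are replaced by a single end-of-block pair.
    pairs.append((0, 0))
    return pairs


print(run_length_encode([6, 0, 0, 3, 0, 0, 0]))
```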
- the quantization table of values that have been used to encode each component
Each scan comprises one or more segments, each of which can contain a group of (8 x 8) blocks preceded by a header. The header contains the set of Huffman codewords for each block.
The values are first centred around zero by subtracting 128 from each intensity/luminance value.
Block preparation is necessary since computing the transformed value for each position in a matrix requires the values in all the locations to be processed.
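Block preparation and the subtraction of 128 can be sketched together. The sketch assumes the image dimensions are exact multiples of the block size and ignores padding; the function name is illustrative.

```python
def prepare_blocks(image, n=8):
    """Split an image (list of rows of 0-255 values) into n x n blocks centred on zero."""
    blocks = []
    for top in range(0, len(image), n):
        for left in range(0, len(image[0]), n):
            block = [[image[top + r][left + c] - 128 for c in range(n)]
                     for r in range(n)]
            blocks.append(block)
    return blocks


# An 8-row by 16-column image gives two 8 x 8 blocks, taken left to right.
image = [[128] * 8 + [200] * 8 for _ in range(8)]
blocks = prepare_blocks(image)
print(len(blocks), blocks[0][0][0], blocks[1][0][0])
```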
In order to exploit the presence of the large number of zeros in the quantized matrix, a zig-zag scan of the matrix is used.
JPEG decoding
The JPEG decoder is made up of a number of stages which are the corresponding decoder sections of those used in the encoder.
The frame decoder first identifies the encoded bitstream and its associated control information and tables within the various headers. It then loads the contents of each table into the related table and passes the control information to the image builder.
Then the Huffman decoder carries out the decompression operation using either the preloaded or the default tables of codewords.
The two decompressed streams containing the DC and AC coefficients of each block are then passed to the differential and run-length decoders respectively.
The resulting matrix of values is then dequantized using either the default or the preloaded values in the quantization table. Each resulting 8 x 8 block of spatial frequency coefficients is passed in turn to the inverse DCT, which transforms it back to its spatial form.
The image builder then reconstructs the image from these blocks using the control information passed to it by the frame decoder.
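The differential and run-length decoding steps can be sketched in Python. The sketch assumes each DC value was sent as the difference from the previous block's DC value (starting from zero) and that the AC pairs end with a (0, 0) end-of-block pair; both function names are illustrative, and Huffman decoding is taken as already done.

```python
def differential_decode(differences):
    """Rebuild the DC coefficient of each block from block-to-block differences."""
    values = []
    previous = 0
    for diff in differences:
        previous += diff
        values.append(previous)
    return values


def run_length_decode(pairs, length=63):
    """Rebuild the AC coefficients of one block from (zero run, value) pairs."""
    coefficients = []
    for run, value in pairs:
        if (run, value) == (0, 0):  # end of block: the rest are zeros
            break
        coefficients.extend([0] * run)
        coefficients.append(value)
    coefficients.extend([0] * (length - len(coefficients)))
    return coefficients


print(differential_decode([12, 1, -2, 0]))
print(run_length_decode([(0, 6), (2, 3), (0, 0)], length=7))
```

The decoded DC value and the 63 AC values of each block would then be placed back into an 8 x 8 matrix, dequantized and passed to the inverse DCT.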
JPEG Summary
Although JPEG is complex, compression ratios of 20:1 can be obtained while still retaining a good quality image. This level (20:1) applies to images with few colour transitions. For more complicated images, compression ratios of 10:1 are more common. As with GIF images, it is possible to encode and rebuild the image in a progressive manner. This can be achieved in two different modes: progressive mode and hierarchical mode.
Progressive mode: first the DC and low-frequency coefficients of each block are sent, and then the high-frequency coefficients.
Hierarchical mode: in this mode, the total image is first sent at a low resolution, e.g. 320 x 240, and then at a higher resolution, e.g. 640 x 480.