
The basics of the Discrete Cosine Transform are discussed here.

The articles have been collected from various sources.

The Discrete Cosine Transform (DCT)
The discrete cosine transform (DCT) helps separate the image into parts (or spectral sub-bands) of differing importance (with respect to the image's visual quality). The DCT is similar to the discrete Fourier transform: it transforms a signal or image from the spatial domain to the frequency domain (Fig 7.8).

DCT Encoding

The general equation for a 1D (N data items) DCT is defined by the following equation:

$$F(u) = \sqrt{\frac{2}{N}}\,\Lambda(u)\sum_{i=0}^{N-1} f(i)\cos\left[\frac{\pi u}{2N}(2i+1)\right]$$

and the corresponding inverse 1D DCT transform is simply F^{-1}(u), where

$$\Lambda(\xi) = \begin{cases} 1/\sqrt{2} & \text{for } \xi = 0 \\ 1 & \text{otherwise} \end{cases}$$

The general equation for a 2D (N by M image) DCT is defined by the following equation:

$$F(u,v) = \sqrt{\frac{2}{N}}\sqrt{\frac{2}{M}}\,\Lambda(u)\Lambda(v)\sum_{i=0}^{N-1}\sum_{j=0}^{M-1} f(i,j)\cos\left[\frac{\pi u}{2N}(2i+1)\right]\cos\left[\frac{\pi v}{2M}(2j+1)\right]$$

and the corresponding inverse 2D DCT transform is simply F^{-1}(u,v), with \Lambda defined as above.

The basic operation of the DCT is as follows:

● The input image is N by M.
● f(i,j) is the intensity of the pixel in row i and column j.
● F(u,v) is the DCT coefficient in row u and column v of the DCT matrix.
● For most images, much of the signal energy lies at low frequencies; these appear in the upper left corner of the DCT.
● Compression is achieved since the lower right values represent higher frequencies, and are often small: small enough to be neglected with little visible distortion.
● The DCT input is an 8 by 8 array of integers. This array contains each pixel's gray scale level; 8 bit pixels have levels from 0 to 255.
● Therefore an 8 point DCT would be:

$$F(u,v) = \frac{1}{4}\,\Lambda(u)\Lambda(v)\sum_{i=0}^{7}\sum_{j=0}^{7} f(i,j)\cos\left[\frac{\pi u}{16}(2i+1)\right]\cos\left[\frac{\pi v}{16}(2j+1)\right]$$

  Question: What is F[0,0]? Answer: F[0,0] is the DC component, proportional to the average of the block; all other coefficients are the AC components.
● The output array of DCT coefficients contains integers; these can range from -1024 to 1023.
● It is computationally easier to implement and more efficient to regard the DCT as a set of basis functions which, given a known input array size (8 x 8), can be precomputed and stored. This involves simply computing values for a convolution mask (8 x 8 window) that gets applied: sum the products of mask values and pixel values where the window overlaps the image, and apply the window across all rows and columns of the image. The values are simply calculated from the DCT formula. The 64 (8 x 8) DCT basis functions are illustrated in Fig 7.9.
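The idea of precomputing the basis functions for a fixed 8 x 8 block can be sketched as follows. This is a minimal illustration rather than an optimized implementation; it assumes the orthonormal scaling sqrt(2/N) with the 1/sqrt(2) factor on the zero-frequency term, and the names `BASIS`, `lam` and `dct2_block` are chosen here for illustration.

```python
import math

N = 8  # block size assumed throughout this sketch

def lam(k):
    """Scaling factor: 1/sqrt(2) for the zero-frequency term, 1 otherwise."""
    return 1 / math.sqrt(2) if k == 0 else 1.0

# Precomputed 1D basis: BASIS[u][i] = sqrt(2/N) * lam(u) * cos(pi*u*(2i+1)/(2N))
BASIS = [[math.sqrt(2 / N) * lam(u) * math.cos(math.pi * u * (2 * i + 1) / (2 * N))
          for i in range(N)] for u in range(N)]

def dct2_block(block):
    """Direct 2D DCT of an 8x8 block: a weighted sum against each 2D basis function."""
    return [[sum(BASIS[u][i] * BASIS[v][j] * block[i][j]
                 for i in range(N) for j in range(N))
             for v in range(N)] for u in range(N)]

# A flat block has only a DC component: F[0][0] = 8 * mean, every AC term is ~0.
flat = [[100] * N for _ in range(N)]
coeffs = dct2_block(flat)
largest_ac = max(abs(coeffs[u][v]) for u in range(N) for v in range(N) if (u, v) != (0, 0))
print(round(coeffs[0][0], 6), largest_ac < 1e-9)
```

For a constant block every AC coefficient vanishes, which is one way to see why F[0,0] is called the DC term.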

[Fig 7.9: DCT basis functions]

Why DCT not FFT?

DCT is similar to the Fast Fourier Transform (FFT), but can approximate lines well with fewer coefficients (Fig 7.10).

[Fig 7.10: DCT/FFT Comparison]

Computing the 2D DCT

● Factoring reduces the problem to a series of 1D DCTs (Fig 7.11):
  ○ apply the 1D DCT (vertically) to the columns;
  ○ apply the 1D DCT (horizontally) to the result of the vertical DCT above;
  ○ or alternatively horizontal first, then vertical.
  The equations are given by:

$$G(u,j) = \sqrt{\frac{2}{N}}\,\Lambda(u)\sum_{i=0}^{N-1} f(i,j)\cos\left[\frac{\pi u}{2N}(2i+1)\right]$$

$$F(u,v) = \sqrt{\frac{2}{M}}\,\Lambda(v)\sum_{j=0}^{M-1} G(u,j)\cos\left[\frac{\pi v}{2M}(2j+1)\right]$$

  where \Lambda(0) = 1/\sqrt{2} and \Lambda(\xi) = 1 otherwise.
● Most software implementations use fixed point arithmetic. Some fast implementations approximate coefficients so all multiplies are shifts and adds.
● World record is 11 multiplies and 29 adds. (C. Loeffler, A. Ligtenberg and G. Moschytz, "Practical Fast 1-D DCT Algorithms with 11 Multiplications", Proc. Int'l. Conf. on Acoustics, Speech, and Signal Processing 1989 (ICASSP `89), pp. 988-991)

DISCRETE COSINE TRANSFORM

Out of the image compression techniques available, transform coding is the preferred method. Since energy distribution varies with each image, compression in the spatial domain is not an easy task. Images do however tend to compact their energy in the frequency domain, making compression in the frequency domain much more effective. Transform coding is simply the compression of the images in the frequency domain. Transform coefficients are used to maximize compression. For lossless compression, the coefficients must not allow for the loss of any information.

The Discrete Cosine Transform (DCT) is an example of transform coding. The current JPEG standard uses the DCT as its basis. The DCT relocates the highest energies to the upper left corner of the image. The lesser energy or information is relocated into other areas. The DCT is fast. It can be quickly calculated and is best for images with smooth edges like photos with human subjects. The DCT coefficients are all real numbers, unlike those of the Fourier Transform. The Inverse Discrete Cosine Transform (IDCT) can be used to retrieve the image from its transform representation.
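The factoring into 1D passes can be checked numerically. The sketch below is only an illustration under assumed orthonormal scaling; `dct_1d`, `dct_rows` and `dct_cols` are hypothetical helper names, and the 4 x 4 test block is arbitrary. It applies the 1D transform to columns then rows, and to rows then columns, and compares the two results.

```python
import math

def dct_1d(x):
    """Orthonormal 1D DCT of a list of numbers (direct O(n^2) evaluation)."""
    n = len(x)
    out = []
    for u in range(n):
        scale = math.sqrt(2 / n) * (1 / math.sqrt(2) if u == 0 else 1.0)
        out.append(scale * sum(x[i] * math.cos(math.pi * u * (2 * i + 1) / (2 * n))
                               for i in range(n)))
    return out

def transpose(m):
    return [list(col) for col in zip(*m)]

def dct_rows(m):
    """Horizontal pass: 1D DCT of every row."""
    return [dct_1d(row) for row in m]

def dct_cols(m):
    """Vertical pass: 1D DCT of every column."""
    return transpose(dct_rows(transpose(m)))

# Arbitrary 4x4 test block (values chosen only for illustration).
block = [[(3 * i + 5 * j * j) % 17 for j in range(4)] for i in range(4)]

cols_then_rows = dct_rows(dct_cols(block))
rows_then_cols = dct_cols(dct_rows(block))
max_diff = max(abs(a - b)
               for ra, rb in zip(cols_then_rows, rows_then_cols)
               for a, b in zip(ra, rb))
print("largest difference between the two orderings:", max_diff)
```

Both orderings agree up to floating point rounding, which is why an implementation is free to pick whichever pass order suits its memory layout.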

HISTORY OF JPEG

"Joint Photographic Experts Group" is the original name of the committee that created the JPEG format. JPEG has been in existence for nearly a decade. The JPEG project began back in 1982. The project began under ISO as Working Group 8 but later merged with CCITT. The standard was a joint effort by three of the world's largest standards organizations: the International Organization for Standardization (ISO), the International Telegraph and Telephone Consultative Committee (CCITT), and the International Electrotechnical Commission (IEC). The Joint Photographic Experts Group, actually a subcommittee of ISO, was then formed in 1986 in order to avoid competing standards among the three standards organizations.

The goal was to create a data compression standard that would display an image within one second down a 64 Kbits/sec ISDN line. Eventually, the format would be able to send loss-less images. After testing of numerous schemes, the Adaptive Discrete Cosine Transform (DCT) was chosen to be the core of the JPEG format. Three years later, the merged ISO/IEC committee gave their approval to make JPEG the standard. It was drafted as the ISO Committee Draft 10918, or Digital Compression and Coding of Continuous-Tone Still Images. It was officially standardized as the International Standard ISO 10918-1. The standard was intended for natural, real world scenes. It was designed to compress natural pictures that are smooth and curved and have no jagged edges. The DCT is the transform used in JPEG compression.

Revisions updating JPEG to make use of our current text-based technologies are in progress. This project has been in progress since August 1998. The release date has been set for January 2000, but implementation will probably take some time. The project team is developing a JPEG format that provides more compression options and better images which take up the same amount of space. It is said that the core of JPEG 2000 is Wavelet technology.

DCT
● A technique for converting a signal into elementary frequency components

Why do we need compression?
● The need for sufficient storage space, large transmission bandwidth, and long transmission time for image, audio, and video data

Principles behind compression
● Redundancy reduction: aims at removing duplication from the signal source
● Irrelevancy reduction: omits parts of the signal that will not be noticed by the signal receiver

One-Dimensional Discrete Cosine Transform

The general equation for a 1D (N data items) DCT is defined by the following equation:

$$F(u) = \sqrt{\frac{2}{N}}\,\Lambda(u)\sum_{i=0}^{N-1} f(i)\cos\left[\frac{\pi u}{2N}(2i+1)\right], \qquad \Lambda(0) = \frac{1}{\sqrt{2}},\ \Lambda(\xi) = 1 \text{ otherwise}$$

The DCT can be written as the product of a vector (the input list) and the n x n orthogonal matrix whose rows are the basis vectors. We can find that the matrix is orthogonal, and each basis vector corresponds to a sinusoid of a certain frequency.

The one-dimensional DCT is useful in processing one-dimensional signals such as speech waveforms. For analysis of two-dimensional (2D) signals such as images, we need a 2D version of the DCT.
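The matrix view can be illustrated with a short check. The sketch below assumes the orthonormal scaling (sqrt(2/n), with an extra 1/sqrt(2) on the first row); `dct_matrix` and `matvec` are names chosen here. It builds the n x n matrix, verifies that its rows are orthonormal, and inverts the transform with the transpose.

```python
import math

def dct_matrix(n):
    """n x n DCT matrix whose rows are the cosine basis vectors."""
    return [[math.sqrt(2 / n) * (1 / math.sqrt(2) if u == 0 else 1.0)
             * math.cos(math.pi * u * (2 * i + 1) / (2 * n))
             for i in range(n)] for u in range(n)]

def matvec(m, v):
    """Matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def transpose(m):
    return [list(col) for col in zip(*m)]

n = 8
C = dct_matrix(n)

# Orthogonality: the dot product of rows r and s is 1 if r == s, else 0.
gram = [[sum(C[r][k] * C[s][k] for k in range(n)) for s in range(n)] for r in range(n)]
is_orthogonal = all(abs(gram[r][s] - (1.0 if r == s else 0.0)) < 1e-12
                    for r in range(n) for s in range(n))

# Forward transform is C @ x; because C is orthogonal, the inverse is C^T @ X.
signal = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
coeffs = matvec(C, signal)
restored = matvec(transpose(C), coeffs)
print(is_orthogonal, [round(v, 6) for v in restored])
```

Because the matrix is orthogonal, no separate inverse matrix needs to be derived or stored.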

In DCT-based coding, image components are either compressed entirely one at a time, or are compressed by alternately interleaving 8x8 sample blocks from each in turn. For a typical 8x8 sample block from a typical source image, most of the spatial frequencies have zero or near-zero amplitude and need not be encoded. The DCT introduces no loss to the source image samples; it merely transforms them to a domain in which they can be more efficiently encoded. The DC coefficient, which contains a significant fraction of the total image energy, is differentially encoded. Entropy Coding (EC) achieves additional compression losslessly by encoding the quantized DCT coefficients more compactly based on their statistical characteristics.

Advantages and Disadvantages
● The DCT does a better job of concentrating energy into lower order coefficients than does the DFT for image data.
● The DCT is purely real; the DFT is complex.
● Assuming a periodic input, the magnitude of the DFT coefficients is spatially invariant. This is not true for the DCT.
● While DCT-based image coders perform very well at moderate bit rates, at higher compression ratios image quality degrades because of the artifacts resulting from the block-based DCT scheme.

REFERENCES:
1. http://www.cs.cf.ac.uk/Dave/Multimedia/node231.html
2. http://www.youtube.com/watch?v=hgr5O0du-sg
3. http://wisnet.seecs.edu.pk/publications/tech_reports/DCT_TR802.pdf
4. Directional Discrete Cosine Transforms: A New Framework for Image Coding, by Bing Zeng, Member, IEEE, and Jingjing Fu. http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=4449470&isnumber=4479597

Discrete Cosine Transform

The DCT transforms a signal from a spatial representation into a frequency representation. Lower frequencies are more obvious in an image than higher frequencies, so if we transform an image into its frequency components and throw away a lot of higher frequency coefficients, we can reduce the amount of data needed to describe the image without sacrificing too much image quality.

An oversimplified JPEG compressor:
● Cut an image up into chunks of 8x8 pixels
● Run each chunk through an 8x8 2D DCT
● Quantize the resulting coefficients (i.e. throw away unimportant information to reduce the filesize; JPEG does this by dividing the coefficients by a quantization matrix in order to get long runs of zeros)
● Compress the quantized coefficients using a lossless method (RLE, Huffman, Arithmetic coding, etc)

The formulae for a 2D DCT:

$$F(u,v) = \frac{1}{4}\,\Lambda(u)\Lambda(v)\sum_{i=0}^{7}\sum_{j=0}^{7} f(i,j)\cos\left[\frac{\pi u}{16}(2i+1)\right]\cos\left[\frac{\pi v}{16}(2j+1)\right], \qquad \Lambda(0) = \frac{1}{\sqrt{2}},\ \Lambda(\xi) = 1 \text{ otherwise}$$
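The first three steps of the oversimplified compressor can be sketched in a few lines. Everything below is an illustrative assumption rather than JPEG itself: the quantization matrix `Q` is a made-up ramp (not one of the standard JPEG tables), the 16 x 16 test image is synthetic, and the final lossless coding step is left out.

```python
import math

N = 8
# Orthonormal 8-point DCT matrix: C[u][i] = sqrt(2/N) * Lambda(u) * cos(pi*u*(2i+1)/(2N))
C = [[math.sqrt(2 / N) * (1 / math.sqrt(2) if u == 0 else 1.0)
      * math.cos(math.pi * u * (2 * i + 1) / (2 * N))
      for i in range(N)] for u in range(N)]

def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(len(b))) for c in range(len(b[0]))]
            for r in range(len(a))]

def transpose(m):
    return [list(col) for col in zip(*m)]

def dct2(block):
    """2D DCT of one 8x8 block, computed as C * block * C^T."""
    return matmul(matmul(C, block), transpose(C))

def split_blocks(image):
    """Step 1: cut an image (sides are multiples of 8) into 8x8 chunks."""
    return [[row[x:x + N] for row in image[y:y + N]]
            for y in range(0, len(image), N)
            for x in range(0, len(image[0]), N)]

# Hypothetical quantization matrix: the divisor grows with frequency (u + v).
# This is NOT one of the JPEG standard tables.
Q = [[1 + 4 * (u + v) for v in range(N)] for u in range(N)]

def quantize(coeffs):
    """Step 3: divide each coefficient by its quantizer and round to an integer."""
    return [[round(coeffs[u][v] / Q[u][v]) for v in range(N)] for u in range(N)]

# Synthetic 16x16 test image: a smooth diagonal gradient.
image = [[100 + y + x for x in range(16)] for y in range(16)]
blocks = split_blocks(image)
quantized = [quantize(dct2(b)) for b in blocks]  # step 2, then step 3

zeros = sum(v == 0 for q in quantized for row in q for v in row)
zero_fraction = zeros / (len(quantized) * N * N)
print(len(blocks), "blocks;", f"{zero_fraction:.0%} of quantized coefficients are zero")
```

On a smooth image almost every quantized coefficient is zero, which is what makes the later run-length and entropy coding step effective.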

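[Figure: four example blocks, each shown in the spatial domain and in the frequency domain]

Inverse Discrete Cosine Transform

To rebuild an image in the spatial domain from the frequencies obtained above, we use the IDCT formulae:

$$f(i,j) = \frac{1}{4}\sum_{u=0}^{7}\sum_{v=0}^{7} \Lambda(u)\Lambda(v)\,F(u,v)\cos\left[\frac{\pi u}{16}(2i+1)\right]\cos\left[\frac{\pi v}{16}(2j+1)\right], \qquad \Lambda(0) = \frac{1}{\sqrt{2}},\ \Lambda(\xi) = 1 \text{ otherwise}$$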

Mathematically, the DCT is perfectly reversible and we do not lose any image definition until we start quantizing coefficients. In my simulation, I simply threw most of them out. A better quantizer would decrease precision gradually instead of simply zeroing out components. Below is the original image and reconstructions of it using only the most significant n x n coefficients.

[Figure panels: Original image; 4x4 (25%); 3x3 (14%); 2x2 (6%)]

Do those artifacts look familiar?
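The experiment of keeping only the top-left n x n coefficients can be mimicked on a single block. The sketch below is an assumption-laden stand-in for the simulation described above: it uses a synthetic smooth 8 x 8 block instead of a real image, orthonormal scaling, and hypothetical helper names (`keep_low`, `rms_error`).

```python
import math

N = 8
C = [[math.sqrt(2 / N) * (1 / math.sqrt(2) if u == 0 else 1.0)
      * math.cos(math.pi * u * (2 * i + 1) / (2 * N))
      for i in range(N)] for u in range(N)]

def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(N)) for c in range(N)] for r in range(N)]

def transpose(m):
    return [list(col) for col in zip(*m)]

def dct2(block):
    return matmul(matmul(C, block), transpose(C))    # F = C f C^T

def idct2(coeffs):
    return matmul(matmul(transpose(C), coeffs), C)   # f = C^T F C

def keep_low(coeffs, n):
    """Zero every coefficient outside the top-left n x n corner."""
    return [[coeffs[u][v] if u < n and v < n else 0.0 for v in range(N)] for u in range(N)]

def rms_error(a, b):
    return math.sqrt(sum((a[i][j] - b[i][j]) ** 2 for i in range(N) for j in range(N)) / (N * N))

# Synthetic smooth block (illustrative data, not taken from the original images).
block = [[128 + 60 * math.sin(0.4 * i) * math.cos(0.3 * j) for j in range(N)] for i in range(N)]
coeffs = dct2(block)

errors = {}
for n in (2, 3, 4, 8):
    errors[n] = rms_error(block, idct2(keep_low(coeffs, n)))
    print(f"{n}x{n} kept ({n * n / 64:.0%} of coefficients): RMS error {errors[n]:.4f}")
```

Keeping all 64 coefficients reproduces the block up to rounding, and the error can only grow as the kept corner shrinks, since the transform is orthonormal.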

Optimization

A 2D DCT can be evaluated more quickly using a series of 1D DCTs. The formula:

$$F(u,v) = \frac{1}{2}\Lambda(u)\sum_{i=0}^{7}\cos\left[\frac{\pi u}{16}(2i+1)\right]\left\{\frac{1}{2}\Lambda(v)\sum_{j=0}^{7} f(i,j)\cos\left[\frac{\pi v}{16}(2j+1)\right]\right\}, \qquad \Lambda(0) = \frac{1}{\sqrt{2}},\ \Lambda(\xi) = 1 \text{ otherwise}$$

This means that instead of performing a 2D DCT, we can perform a 1D DCT on each row and then on each column of the block we're processing. Even a naïve implementation of the separated DCT, such as the one in listing2.c, will run faster than a single-pass 2D DCT. The AAN (Arai/Agui/Nakajima) algorithm is one of the fastest known 1D DCTs. listing3.c shows a still faster 2D DCT built from 1D AAN DCTs.

AIC - Discrete Cosine Transform and Quantisation

The Advanced Image Coding codec uses the Discrete Cosine Transform (DCT) to transform an 8x8 residual block into a set of coefficients to cosine functions with increasing frequencies. Below you will find a short introduction to the DCT. For more detailed information there are plenty of resources available on the internet; see the Resources and links section.

1-Dimensional DCT

The 1-dimensional Forward DCT transforms a row of 8 residual values (V) into a row of 8 coefficients (C):

$$C(u) = \frac{\alpha(u)}{2}\sum_{x=0}^{7} V(x)\cos\left[\frac{(2x+1)u\pi}{16}\right], \qquad \alpha(0) = \frac{1}{\sqrt{2}},\ \alpha(u) = 1 \text{ for } u > 0$$

The following figure shows the result of the FDCT, applied to a sample row. The residual values are represented using shades of gray (for clarity, only positive values are used here).

Sample row
Residual values:   121    58    61   113   171   200   226   246
DCT coefficients:   61  -175    43    49    37    10     5     5

The resulting DCT coefficients don't look better compressible at all. The clue to compression, however, can be seen in the decreasing magnitude of the coefficients. The later coefficients are much smaller than the first ones, which might indicate that these do not contribute much to the image quality. The following graphs clarify this:
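The sample row can be reproduced numerically, which also gives a text-only stand-in for the reconstruction graphs. One assumption is needed: the listed DC coefficient of 61 only comes out if the displayed values are treated as residuals offset by 128 (an inference from the "only positive values are used" remark, not something the text states). The scaling is the common 8-point form with a factor of 1/2, and `fdct_8` / `idct_8` are names chosen here.

```python
import math

def alpha(u):
    return 1 / math.sqrt(2) if u == 0 else 1.0

def fdct_8(values):
    """Forward 8-point DCT: C(u) = alpha(u)/2 * sum V(x) cos((2x+1) u pi / 16)."""
    return [0.5 * alpha(u) * sum(v * math.cos((2 * x + 1) * u * math.pi / 16)
                                 for x, v in enumerate(values))
            for u in range(8)]

def idct_8(coeffs, keep=8):
    """Inverse 8-point DCT using only the first `keep` coefficients."""
    return [0.5 * sum(alpha(u) * coeffs[u] * math.cos((2 * x + 1) * u * math.pi / 16)
                      for u in range(keep))
            for x in range(8)]

shown = [121, 58, 61, 113, 171, 200, 226, 246]
residuals = [v - 128 for v in shown]   # ASSUMPTION: displayed values are offset by 128

coeffs = fdct_8(residuals)
rounded = [round(c) for c in coeffs]
print(rounded)   # [61, -175, 43, 49, 37, 10, 5, 5]

# Reconstruct with 1..8 coefficients and report the RMS error of each attempt.
errors = []
for keep in range(1, 9):
    approx = idct_8(coeffs, keep)
    rms = math.sqrt(sum((a - r) ** 2 for a, r in zip(approx, residuals)) / 8)
    errors.append(rms)
    print(f"{keep} coefficient(s): RMS error {rms:.2f}")
```

With a single coefficient every reconstructed value equals the row average, and each additional coefficient can only reduce the error.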

The red dots represent the original residual values. The blue line represents the reconstructed residual values after performing an Inverse DCT on several DCT coefficients, also shown at the bottom of each graph.

In the first graph, only the first DCT coefficient (the DC coefficient) is used to reconstruct the values. The first graph shows a straight line through the average residual values, which means that the DC coefficient represents the average of an entire row of values. In each following graph, an additional coefficient (AC coefficient) is added to the reconstruction. By adding AC coefficients, detail is added to the image and the blue line moves closer to the original red dots. By using only 4 of the 8 coefficients, the reconstructed line is already close to the original. From the fifth graph onward, the subsequent improvements become smaller and less noticeable.

This is the key to compression. Since the higher-order (high-frequency) DCT coefficients contribute less information to the image, they can be discarded while still producing a close approximation of the original.

2-Dimensional DCT

The 1-dimensional DCT discussed above only takes advantage of correlation between residual values in a row. Better compression can be achieved when we take both the horizontal and vertical correlation between residual values into account. This is done by performing a 2-dimensional DCT on a block of 8x8 residual values:

$$C(u,v) = \frac{\alpha(u)\,\alpha(v)}{4}\sum_{x=0}^{7}\sum_{y=0}^{7} V(x,y)\cos\left[\frac{(2x+1)u\pi}{16}\right]\cos\left[\frac{(2y+1)v\pi}{16}\right], \qquad \alpha(0) = \frac{1}{\sqrt{2}},\ \alpha(u) = 1 \text{ for } u > 0$$

However, a 2D DCT can also be implemented by first applying a 1D DCT on all rows, followed by a 1D DCT on all columns from the result of the first step. This is much faster than implementing a 2D DCT directly.

Quantisation

But completely discarding coefficients is not always desirable. In certain types of images with high contrast, like textual images or cartoons, the high frequency coefficients are important to the image detail and cannot be discarded. This is why JPEG and AIC use quantisation, which is just a fancy word for dividing in this context. Each coefficient gets divided by a certain value. The higher this value, the smaller the results will be. This will make the coefficients better compressible, but also reduces image quality because the coefficients cannot be reconstructed faithfully. When you choose a quality level in JPEG or AIC, you actually set the amount of quantisation used.

JPEG uses the DCT to transform pixel values instead of residual values. It also uses a nonuniform quantisation method by which high frequency coefficients (the later coefficients) are quantised with higher values than low frequency ones. AIC performs the DCT on residual values. Several tests have shown that uniform quantisation is more appropriate in this case. In AIC, all coefficients are quantised by the same value.
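The difference between uniform and nonuniform quantisation can be made concrete with the sample-row coefficients. The divisors below are illustrative assumptions only: 16 is an arbitrary uniform step, and `nonuniform_steps` is a made-up increasing sequence, not a table from JPEG or AIC.

```python
def quantise(coeffs, steps):
    """Divide each coefficient by its step and round to the nearest integer."""
    return [round(c / s) for c, s in zip(coeffs, steps)]

def dequantise(levels, steps):
    """Multiply back; the rounding loss cannot be undone."""
    return [q * s for q, s in zip(levels, steps)]

coeffs = [61, -175, 43, 49, 37, 10, 5, 5]          # sample-row DCT coefficients

uniform_steps = [16] * 8                            # same divisor everywhere (AIC-style)
nonuniform_steps = [8, 8, 12, 16, 24, 32, 48, 64]   # hypothetical: coarser at high frequency

uniform_levels = quantise(coeffs, uniform_steps)
nonuniform_levels = quantise(coeffs, nonuniform_steps)

print(uniform_levels)                              # [4, -11, 3, 3, 2, 1, 0, 0]
print(dequantise(uniform_levels, uniform_steps))   # [64, -176, 48, 48, 32, 16, 0, 0]
print(nonuniform_levels)                           # [8, -22, 4, 3, 2, 0, 0, 0]
```

The dequantised values only approximate the originals, which is the quality loss described above; larger steps give smaller integers and more zeros at the cost of a coarser approximation.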

The DCT code used in the AIC codec is based on the code in the JPEG reference software from the Independent JPEG Group. This reference software supports different algorithms. AIC only uses the floating point algorithm since it produces the highest quality images. It's a bit slower than the other algorithms, but on modern computers, floating point calculations are performed much faster than in the old days. To speed up the calculations, the AAN (Arai, Agui and Nakajima) algorithm is used to calculate the DCT.

In JPEG the DCT coefficients are transmitted in zigzag order to form runs of zeros which can be encoded using run length encoding. The CABAC codec does not use run length encoding, so there is no need to reorder the DCT coefficients. So in AIC, the coefficients are transmitted in scan line order.

Finally, in the last step, the DCT coefficients and prediction modes are encoded to the stream using Context Adaptive Binary Arithmetic Coding (CABAC).
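Why zigzag ordering helps a run-length coder can be seen on a small example. The sketch below generates the usual 8 x 8 zigzag sequence by walking the anti-diagonals in alternating directions and compares the trailing run of zeros against plain scan line order; the quantised block is synthetic (non-zero only where row + column is less than 3) and the helper names are chosen here.

```python
N = 8

def zigzag_order(n=N):
    """(row, col) pairs along anti-diagonals, alternating direction on each one."""
    cells = [(r, c) for r in range(n) for c in range(n)]
    # Odd diagonals run top-right to bottom-left (row ascending), even ones the reverse.
    return sorted(cells, key=lambda rc: (rc[0] + rc[1],
                                         rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]))

def trailing_zeros(values):
    """Length of the final run of zeros in a sequence."""
    count = 0
    for v in reversed(values):
        if v != 0:
            break
        count += 1
    return count

# Synthetic quantised block: energy concentrated in the low-frequency corner.
block = [[9 - 3 * (r + c) if r + c < 3 else 0 for c in range(N)] for r in range(N)]

zz = zigzag_order()
zigzag_seq = [block[r][c] for r, c in zz]
scanline_seq = [v for row in block for v in row]

print(zz[:6])                          # [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]
print(trailing_zeros(zigzag_seq))      # 58
print(trailing_zeros(scanline_seq))    # 47
```

In zigzag order all non-zero values come first and the rest is one long run of zeros, whereas scan line order interleaves short zero runs between the non-zero values of the first three rows.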
