Lossy Algorithms

Methods that achieve high compression by discarding less important information: Transform (DCT), Predictive, and Fractal coding.

The Philosophy of Imperfection: The Power of Lossy Compression

While lossless compression stands as a monument to perfect data fidelity, its sibling, lossy compression, champions a different philosophy: pragmatism. Lossless algorithms treat every single bit of data as sacred, ensuring a perfect reconstruction at the cost of modest compression ratios. Lossy compression, in contrast, makes a bold assumption: not all data is created equal. It posits that for certain types of information, especially multimedia meant for human consumption, we can discard vast quantities of data with little to no perceptible impact on quality.

This deliberate sacrifice of perfect accuracy is what enables the modern digital world. It is the magic behind streaming a 4K movie over your internet connection, storing thousands of high-quality photos on your phone, and participating in a real-time video conference with someone across the globe. Lossy algorithms are not crude tools that randomly delete data; they are incredibly sophisticated systems, built on a deep understanding of human perception. They meticulously analyze data to determine what our eyes cannot see and what our ears cannot hear, and then they discard precisely that information. In this section, we will explore the three principal families of lossy compression techniques: transform coding, the engine of JPEG and MPEG; predictive coding, which guesses the future to save space; and fractal compression, a unique approach that finds self-similarity within data.

Transform Coding: Seeing Data in a New Light

Transform coding is arguably the most important and successful lossy compression technique ever devised. It forms the core of virtually all modern image, video, and audio compression standards, including JPEG, MPEG, and MP3.

The Core Idea: Changing the Domain

The fundamental insight behind transform coding is that representing data in terms of its raw values (e.g., pixel brightness, audio sample amplitude) is often inefficient. Instead, it is often more useful to represent the data in a different mathematical domain, such as a frequency domain.

A transform is a mathematical operation that converts a set of data from one representation to another without losing information in the process itself. The goal is to reach a new representation where the information is structured in a way that is much easier to compress. Specifically, transform coding aims for two goals:

  • Decorrelation: In most real-world data, such as images, adjacent pixels are highly correlated; a red pixel is likely to sit next to another red pixel. The transform removes these statistical dependencies so that the new values (coefficients) are largely independent of each other.
  • Energy Compaction: This is the key to its success. A good transform concentrates most of the important information or "energy" of the signal into just a few values, called coefficients; the remaining majority of coefficients have very small, often near-zero values, representing less significant details. The short sketch after this list shows the effect on a simple signal.
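Here is a short sketch of energy compaction in one dimension (an illustration of the principle, assuming NumPy and SciPy are available, not code from any particular codec). A smooth, correlated eight-sample signal is transformed with a DCT, and almost all of its energy lands in the first one or two coefficients:

```python
import numpy as np
from scipy.fft import dct

# A smooth, highly correlated signal: each sample is close to its neighbours.
signal = np.array([100, 104, 109, 113, 118, 122, 127, 131], dtype=float)

coeffs = dct(signal, norm="ortho")             # orthonormal DCT-II
energy = coeffs ** 2
cumulative = np.cumsum(energy) / energy.sum()  # energy fraction in the first k coefficients

for k, frac in enumerate(cumulative, start=1):
    print(f"first {k} coefficient(s): {frac:.4%} of the energy")
```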

The Main Player: Discrete Cosine Transform (DCT)

The most popular transform used in lossy compression is the Discrete Cosine Transform (DCT). In practice, algorithms like JPEG and MPEG apply the DCT not to the entire image at once, but to small, manageable blocks of pixels, typically 8x8 blocks.

The 2D DCT takes an 8x8 block of pixel values (a spatial domain representation) and converts it into an 8x8 block of frequency coefficients (a frequency domain representation).

  • The top-left coefficient, the DC Coefficient, represents the average brightness or color of the entire 8x8 block. It contains the most energy.
  • The other 63 coefficients are the AC Coefficients. They represent the details and textures within the block, arranged from the lowest frequencies (coarse details) near the DC coefficient to the highest frequencies (fine, sharp details) in the bottom-right corner.

8x8 DCT explorer

The interactive explorer displays an original 8x8 block of pixels, its DCT coefficients in zig-zag order (low frequencies first), and the block rebuilt from only the coefficients you choose to keep, alongside three quality metrics: energy captured, mean squared error, and PSNR. In the state captured here, 16 of the 64 coefficients are non-zero; the reconstruction retains 98.0% of the energy, with a mean squared error of 61.77 and a PSNR of 30.2 dB. Keep more coefficients to see the reconstructed block converge back to the original pixels. High-frequency values disappear fastest when the slider is low, so keeping only a few zig-zag coefficients blurs fine detail but preserves most of the low-frequency energy.
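To reproduce the explorer's behaviour outside the widget, the following Python sketch (assuming NumPy and SciPy; an illustration of the same idea rather than the explorer's actual code) applies a level-shifted 2D DCT to the explorer's original 8x8 block, keeps only the low-frequency corner of the coefficient matrix, and inverts the transform:

```python
import numpy as np
from scipy.fft import dct, idct

# The original 8x8 block shown in the explorer (pixel values 0-255).
block = np.array([
    [52, 55, 61,  66,  70,  61, 64, 73],
    [63, 59, 66,  90, 109,  85, 69, 72],
    [62, 59, 68, 113, 144, 104, 66, 73],
    [63, 58, 71, 122, 154, 106, 70, 69],
    [67, 61, 68, 104, 126,  88, 68, 70],
    [79, 65, 60,  70,  77,  68, 58, 75],
    [85, 71, 64,  59,  55,  61, 65, 83],
    [87, 79, 69,  68,  65,  76, 78, 94],
], dtype=float)

def dct2(x):
    """Orthonormal 2D DCT-II, applied along rows and then columns."""
    return dct(dct(x, axis=0, norm="ortho"), axis=1, norm="ortho")

def idct2(x):
    """Inverse 2D DCT, undoing dct2."""
    return idct(idct(x, axis=0, norm="ortho"), axis=1, norm="ortho")

# JPEG-style level shift: centre the pixels around zero before transforming.
coeffs = dct2(block - 128)
print("DC coefficient:", int(round(coeffs[0, 0])))   # -414 for this block, as in the explorer

# Energy compaction in action: keep only the low-frequency 4x4 corner.
kept = np.zeros_like(coeffs)
kept[:4, :4] = coeffs[:4, :4]
rebuilt = idct2(kept) + 128

print("energy kept: %.1f%%" % (100 * (kept ** 2).sum() / (coeffs ** 2).sum()))
print("max pixel error:", round(float(np.abs(rebuilt - block).max()), 1))
```

The DC value of -414 matches the explorer's display; how many coefficients you keep, and which ones, determines how much energy, and therefore how much visible detail, survives the truncation.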

Case Study: The JPEG Compression Pipeline

The JPEG standard for still image compression is the perfect practical example of the transform coding process.

  1. Step 1: Color Space Transformation & Chroma Subsampling: An image is first converted from the RGB color model (Red, Green, Blue) to YCbCr. The Y component represents luminance (brightness), and Cb and Cr represent chrominance (color difference). Because the human eye is far less sensitive to fine details in color than in brightness, the Cb and Cr components are often downsampled, for example, by storing only one color value for every 2x2 block of pixels. This is the first, highly effective lossy step.
  2. Step 2: Blocking and DCT: Each component (Y, Cb, Cr) is divided into 8x8 pixel blocks. The DCT is then applied to each block individually, transforming the 64 pixel values into 64 DCT coefficients.
  3. Step 3: Quantization - The Heart of the Process: This is where the magic and the "loss" truly happen. Each of the 64 DCT coefficients is divided by a corresponding value from a 64-element quantization table, and the result is rounded to the nearest integer.
    $\text{Quantized Coefficient} = \left\lfloor \frac{\text{DCT Coefficient}}{\text{Quantization Value}} + 0.5 \right\rfloor$
    The quantization values are small for the important low-frequency coefficients (preserving them) and much larger for the less important high-frequency coefficients (aggressively reducing them). This crucial step turns many small AC coefficients into zero and reduces the precision of the others, permanently discarding information. The "quality" setting in an image editor (e.g., from 1 to 100) directly controls the values in this table. A lower quality setting uses larger divisors, resulting in more data loss and a smaller file.
  4. Step 4: Entropy Coding: The block of 64 quantized coefficients is now highly compressible because it contains many zeros. This block is then compressed using a lossless stage, typically combining Run-Length Encoding to handle the long strings of zeros, and Huffman coding to assign short codes to the most frequent remaining coefficient values.

The decoder simply reverses this process: it reads the losslessly compressed data, performs de-quantization (multiplying by the same table values, though the original precision is lost), applies the Inverse DCT to get back a block of pixels, and reassembles the image.
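The sketch below ties the quantization step and the decoder's reversal together (an illustration under my own assumptions: SciPy's DCT routines, a synthetic gradient block standing in for real pixels, and the example luminance quantization table quoted in Annex K of the JPEG specification; a real encoder scales its tables with the quality setting and reads them back from the file header):

```python
import numpy as np
from scipy.fft import dct, idct

# Example luminance quantization table (the values commonly quoted from
# Annex K of the JPEG specification); treated here purely as an illustration.
Q = np.array([
    [16, 11, 10, 16,  24,  40,  51,  61],
    [12, 12, 14, 19,  26,  58,  60,  55],
    [14, 13, 16, 24,  40,  57,  69,  56],
    [14, 17, 22, 29,  51,  87,  80,  62],
    [18, 22, 37, 56,  68, 109, 103,  77],
    [24, 35, 55, 64,  81, 104, 113,  92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103,  99],
])

def dct2(x):
    return dct(dct(x, axis=0, norm="ortho"), axis=1, norm="ortho")

def idct2(x):
    return idct(idct(x, axis=0, norm="ortho"), axis=1, norm="ortho")

# A synthetic smooth 8x8 block (a diagonal gradient) standing in for real pixels.
block = np.fromfunction(lambda r, c: 80 + 6 * r + 4 * c, (8, 8))

coeffs = dct2(block - 128)                     # Step 2: level shift + DCT

# Step 3: quantization -- divide by the table and round; this is the lossy step.
quantized = np.floor(coeffs / Q + 0.5).astype(int)
print("non-zero coefficients left:", np.count_nonzero(quantized), "of 64")

# Step 4 flavour: run-length encode the zeros as (zero_run, value) pairs.
# Real JPEG walks the block in zig-zag order and adds an end-of-block marker.
pairs, run = [], 0
for v in quantized.ravel():
    if v == 0:
        run += 1
    else:
        pairs.append((run, int(v)))
        run = 0
print("run-length pairs:", pairs)

# Decoder side: de-quantize (multiply back), inverse DCT, undo the level shift.
rebuilt = idct2(quantized * Q) + 128
print("max pixel error after the round trip:", round(float(np.abs(rebuilt - block).max()), 1))
```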

Predictive Coding: Guessing What's Next

Predictive coding is another powerful lossy technique, particularly effective for data where consecutive values are strongly correlated, such as audio waveforms.

The Core Idea: Transmitting Only the Difference

Instead of encoding the full value of every sample in an audio signal, a predictive coder makes an educated guess about what the next sample's value will be, based on the preceding samples. It then calculates the difference, or prediction error, between its guess and the actual value. This error is usually a very small number. The clever part is that the coder then transmits only this small error value.

This process becomes lossy when the prediction error is quantized before being transmitted. By using a limited number of bits to represent the error, its precision is reduced. For example, very small errors might all be rounded to zero, effectively stating that the prediction was "good enough." The decoder follows the same prediction process, receives the quantized error, and adds it to its own prediction to reconstruct an approximation of the original sample.

Predictive coding (DPCM) explorer

The interactive explorer lets you adjust the quantiser step and choose a predictor to see how the feedback loop tracks a slowly changing signal. In the state captured here it reports approximately 7 bits per sample, a maximum absolute error of 2.0, and a final drift of -1.0. One run is recorded below:

 n   Original   Predictor   Error   Quantised   Reconstructed
 0        128           0     128         128             128
 1        130         128       2           4             132
 2        135         132       3           4             136
 3        140         136       4           4             140
 4        142         140       2           4             144
 5        147         144       3           4             148
 6        146         148      -2           0             148
 7        150         148       2           4             152
 8        154         152       2           4             156
 9        158         156       2           4             160
10        160         160       0           0             160
11        161         160       1           0             160

In DPCM the encoder and decoder share the same predictor; only the quantised error is transmitted. Explore how the predictor choice and quantiser step trade accuracy for rate inside the classic DPCM feedback loop.

This technique, known as Differential Pulse-Code Modulation (DPCM), is extremely common in speech and audio compression codecs. It is effective because adjacent audio samples are almost always very similar in value, making the prediction error small and highly compressible.
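Here is a minimal Python sketch of this DPCM loop (my own illustration, assuming a previous-sample predictor and a uniform quantiser with step 4, using the same round-half-up rule as the JPEG formula above). Run on the explorer's signal, it reproduces the reconstructed values in the table:

```python
import math

def quantise(error, step=4):
    """Uniform quantiser with round-half-up, mirroring the floor(x/q + 0.5) rule."""
    return math.floor(error / step + 0.5) * step

def dpcm_encode(samples, step=4):
    """Predict each sample from the previous *reconstructed* sample and
    transmit only the quantised prediction error."""
    reconstructed = 0           # encoder and decoder start from the same state
    errors = []
    for x in samples:
        prediction = reconstructed
        q_error = quantise(x - prediction, step)
        errors.append(q_error)
        reconstructed = prediction + q_error   # track what the decoder will see
    return errors

def dpcm_decode(errors):
    """Run the same predictor and add the received quantised errors."""
    reconstructed, out = 0, []
    for q_error in errors:
        reconstructed += q_error
        out.append(reconstructed)
    return out

signal = [128, 130, 135, 140, 142, 147, 146, 150, 154, 158, 160, 161]
errors = dpcm_encode(signal)
decoded = dpcm_decode(errors)
for n, (x, e, y) in enumerate(zip(signal, errors, decoded)):
    print(f"n={n:2d}  original {x:3d}  quantised error {e:4d}  reconstructed {y:3d}")
```

Because the encoder predicts from its own reconstructed samples rather than the originals, the per-sample reconstruction error is bounded by the quantiser's rounding error, which is why the output stays within a couple of units of the input.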

Fractal Compression: The Search for Self-Similarity

Fractal compression is a fascinating and mathematically beautiful, albeit less common, approach to lossy image compression. It is built upon the observation that many natural images contain self-similarity: small parts of the image resemble transformed copies of larger parts.

The Core Idea: Describing an Image with Itself

A fractal compression algorithm analyzes an image to find parts that look like scaled-down, transformed versions of other, larger parts of the same image. For example, a small cloud might look like a scaled-down version of a larger cloud elsewhere in the sky. A small patch of fern might resemble the overall structure of the entire frond.

The encoding process does not store pixel data. Instead, it creates a list of mathematical transformations. For each small region of the image (a "range block"), it finds a larger region (a "domain block") and a set of instructions (an affine transformation involving scaling, rotation, and brightness/contrast adjustments) that makes the domain block look like the range block. The compressed file is simply this collection of "copying instructions."

Decoding is an iterative process. It begins with an arbitrary image and repeatedly applies the stored transformations. Because the transformations are contractive, the theory of Iterated Function Systems (IFS) guarantees that this process converges, and after a handful of iterations the target image emerges.
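Below is a deliberately toy Python sketch of this encode/decode structure (my own simplification, assuming NumPy: greyscale only, fixed 4x4 range and 8x8 domain blocks aligned to a grid, contrast and brightness adjustment but no rotations or flips):

```python
import numpy as np

BLOCK = 4  # range blocks are 4x4; domain blocks are 8x8, shrunk to 4x4

def shrink(domain):
    """Average 2x2 neighbourhoods so an 8x8 domain block matches a 4x4 range block."""
    return domain.reshape(BLOCK, 2, BLOCK, 2).mean(axis=(1, 3))

def encode(image):
    """For every range block, find the domain block and affine map (scale s, offset o)
    that approximates it best in the least-squares sense."""
    h, w = image.shape
    transforms = []
    for ry in range(0, h, BLOCK):
        for rx in range(0, w, BLOCK):
            target = image[ry:ry + BLOCK, rx:rx + BLOCK]
            best = None
            for dy in range(0, h - 2 * BLOCK + 1, BLOCK):
                for dx in range(0, w - 2 * BLOCK + 1, BLOCK):
                    d = shrink(image[dy:dy + 2 * BLOCK, dx:dx + 2 * BLOCK])
                    dc, tc = d - d.mean(), target - target.mean()
                    s = (dc * tc).sum() / ((dc * dc).sum() + 1e-9)  # least-squares contrast
                    s = float(np.clip(s, -0.9, 0.9))                # keep the map contractive
                    o = target.mean() - s * d.mean()                # brightness offset
                    err = ((s * d + o - target) ** 2).sum()
                    if best is None or err < best[0]:
                        best = (err, dy, dx, s, o)
            transforms.append((ry, rx) + best[1:])
    return transforms  # the "compressed file": copying instructions, no pixel data

def decode(transforms, shape, iterations=8):
    """Start from an arbitrary image and repeatedly apply the stored maps;
    the contractive transforms pull it towards the encoded image."""
    img = np.full(shape, 128.0)
    for _ in range(iterations):
        nxt = np.empty_like(img)
        for ry, rx, dy, dx, s, o in transforms:
            d = shrink(img[dy:dy + 2 * BLOCK, dx:dx + 2 * BLOCK])
            nxt[ry:ry + BLOCK, rx:rx + BLOCK] = s * d + o
        img = nxt
    return img

# Tiny demo on a synthetic 16x16 gradient standing in for a natural image.
image = np.fromfunction(lambda y, x: 64 + 8 * y + 4 * x, (16, 16))
codebook = encode(image)
restored = decode(codebook, image.shape)
print("mean absolute error:", round(float(np.abs(restored - image).mean()), 2))
```

Even in this stripped-down form the encoder compares every range block against every candidate domain block, which gives a small taste of why full fractal encoders are so much slower than their decoders.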

Strengths and Weaknesses

Fractal compression has some unique properties:

  • Very High Compression Ratios: It can achieve very high size reductions, especially on highly textured, natural images.
  • Resolution Independence: Because the file stores rules instead of pixels, the image can be decoded at any resolution, often generating plausible-looking details when zoomed in, avoiding the blockiness of an upscaled JPEG.
  • Extremely Slow Encoding: The search for the best-matching domain block for every range block is computationally intensive, making encoding dramatically slower than decoding.

Due to the very long encoding times and historical patent complexities, fractal compression has remained a niche technology and has not achieved the widespread adoption of transform-based methods like JPEG.

Conclusion: A Synergy of Techniques

Transform coding, predictive coding, and fractal compression represent the major philosophical approaches to lossy data reduction. Modern systems often combine these ideas. The most advanced video codecs, for example, are hybrids. They use transform coding (DCT) to compress the spatial information within a frame, but they also use a highly sophisticated form of predictive coding called motion compensation to exploit the temporal similarities between frames. It is this powerful synergy of techniques that allows for the incredible compression efficiencies we see today, enabling high-quality multimedia to be stored and streamed using a fraction of the data it would otherwise require. The choice and tuning of these algorithms always involves a delicate balance, trading bits for a level of quality that is ultimately judged not by a computer, but by our own eyes and ears.
