Chroma subsampling algorithm for jpeg

I am trying to write a JPEG encoder and have stumbled on the algorithm that gathers the corresponding Y, Cb and Cr color components so that they can be passed on to the method that performs the conversion.

As I understand it, the samples for the most common sampling options are laid out as follows (I could be wrong here):

  • 4:4:4 - an 8x8 pixel MCU with Y, Cb and Cr represented at every pixel.
  • 4:2:2 - a 16x8 pixel MCU with Y at every pixel and Cb, Cr at every second pixel.
  • 4:2:0 - a 16x16 pixel MCU with Y at every second pixel and Cb, Cr at every fourth.

The clearest description of the process I have found so far is described here.

I don’t understand how to assemble these components in the correct order to pass as an 8x8 block for conversion and quantization.

Can someone write an example (pseudo-code is fine, though I'm sure C# would be even better) of how to group the bytes for conversion?

I will include the current, incorrect code that I am running.

/// <summary>
/// Writes the Scan header structure
/// </summary>
/// <param name="image">The image to encode from.</param>
/// <param name="writer">The writer to write to the stream.</param>
private void WriteStartOfScan(ImageBase image, EndianBinaryWriter writer)
{
    // Marker
    writer.Write(new[] { JpegConstants.Markers.XFF, JpegConstants.Markers.SOS });

    // Length (high byte, low byte), must be 6 + 2 * (number of components in scan)
    writer.Write((short)0xc); // 12

    byte[] sos =
    {
        3,    // Number of components in a scan, usually 1 or 3
        1,    // Component Id Y
        0,    // DC/AC Huffman table
        2,    // Component Id Cb
        0x11, // DC/AC Huffman table
        3,    // Component Id Cr
        0x11, // DC/AC Huffman table
        0,    // Ss - Start of spectral selection.
        0x3f, // Se - End of spectral selection.
        0     // Ah + Al (Successive approximation bit position high + low)
    };

    writer.Write(sos);

    // Compress and write the pixels
    // Buffers for each Y'Cb Cr component
    float[] yU = new float[64];
    float[] cbU = new float[64];
    float[] crU = new float[64];

    // The DC values for each component.
    int[] dcValues = new int[3];

    // TODO: Why null?
    this.huffmanTable = new HuffmanTable(null);

    // TODO: Color output is incorrect after this point.
    // I think I've got my looping all wrong.
    // For each row
    for (int y = 0; y < image.Height; y += 8)
    {
        // For each column
        for (int x = 0; x < image.Width; x += 8)
        {
            // Convert the 8x8 array to YCbCr
            this.RgbToYcbCr(image, yU, cbU, crU, x, y);

            // For each component
            this.CompressPixels(yU, 0, writer, dcValues);
            this.CompressPixels(cbU, 1, writer, dcValues);
            this.CompressPixels(crU, 2, writer, dcValues);
        }
    }

    this.huffmanTable.FlushBuffer(writer);
}

/// <summary>
/// Converts the pixel block from the RGBA colorspace to YCbCr.
/// </summary>
/// <param name="image">The image to encode from.</param>
/// <param name="yComponant">The container to house the Y' luma component within the block.</param>
/// <param name="cbComponant">The container to house the Cb chroma component within the block.</param>
/// <param name="crComponant">The container to house the Cr chroma component within the block.</param>
/// <param name="x">The x-position within the image.</param>
/// <param name="y">The y-position within the image.</param>
private void RgbToYcbCr(ImageBase image, float[] yComponant, float[] cbComponant, float[] crComponant, int x, int y)
{
    int height = image.Height;
    int width = image.Width;

    for (int a = 0; a < 8; a++)
    {
        // Complete with the remaining right and bottom edge pixels.
        int py = y + a;
        if (py >= height)
        {
            py = height - 1;
        }

        for (int b = 0; b < 8; b++)
        {
            int px = x + b;
            if (px >= width)
            {
                px = width - 1;
            }

            YCbCr color = image[px, py];
            int index = a * 8 + b;

            yComponant[index] = color.Y;
            cbComponant[index] = color.Cb;
            crComponant[index] = color.Cr;
        }
    }
}

/// <summary>
/// Compresses and encodes the pixels.
/// </summary>
/// <param name="componantValues">The current color component values within the image block.</param>
/// <param name="componantIndex">The component index.</param>
/// <param name="writer">The writer.</param>
/// <param name="dcValues">The DC values for each component.</param>
private void CompressPixels(float[] componantValues, int componantIndex, EndianBinaryWriter writer, int[] dcValues)
{
    // TODO: This should be an option.
    byte[] horizontalFactors = JpegConstants.ChromaFourTwoZeroHorizontal;
    byte[] verticalFactors = JpegConstants.ChromaFourTwoZeroVertical;

    byte[] quantizationTableNumber = { 0, 1, 1 };
    int[] dcTableNumber = { 0, 1, 1 };
    int[] acTableNumber = { 0, 1, 1 };

    for (int y = 0; y < verticalFactors[componantIndex]; y++)
    {
        for (int x = 0; x < horizontalFactors[componantIndex]; x++)
        {
            // TODO: This can probably be combined reducing the array allocation.
            float[] dct = this.fdct.FastFDCT(componantValues);
            int[] quantizedDct = this.fdct.QuantizeBlock(dct, quantizationTableNumber[componantIndex]);

            this.huffmanTable.HuffmanBlockEncoder(writer, quantizedDct, dcValues[componantIndex], dcTableNumber[componantIndex], acTableNumber[componantIndex]);

            dcValues[componantIndex] = quantizedDct[0];
        }
    }
}

This code is part of an open-source library I am writing on GitHub.

1 answer

JPEG chroma subsampling can be implemented in a simple but functional way without much code. The basic idea is that your eyes are less sensitive to changes in color than to changes in brightness, so a JPEG file can be made much smaller by discarding some of the color information. There are many ways to subsample the color information, but JPEG images tend to use four options: none, 1/2 horizontal, 1/2 vertical, and 1/2 horizontal + vertical. There are additional TIFF/EXIF options, such as how the subsampled color is positioned ("centered"), but for simplicity we will stick to plain averaging.
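To make those options concrete in C# terms (the question's language), the per-component sampling factors for these four choices would look roughly like the sketch below. The class and field names are placeholders of my own, not the question's JpegConstants fields or any particular library's API:

// Per-component sampling factors (Y, Cb, Cr) for the four common options.
// Placeholder names; a rough sketch only.
internal static class CommonSamplingFactors
{
    // None (4:4:4): 8x8 MCU, full-resolution chroma.
    public static readonly byte[] NoneHorizontal = { 1, 1, 1 };
    public static readonly byte[] NoneVertical = { 1, 1, 1 };

    // 1/2 horizontal (4:2:2): 16x8 MCU.
    public static readonly byte[] HalfHorizontalH = { 2, 1, 1 };
    public static readonly byte[] HalfHorizontalV = { 1, 1, 1 };

    // 1/2 vertical (4:4:0): 8x16 MCU.
    public static readonly byte[] HalfVerticalH = { 1, 1, 1 };
    public static readonly byte[] HalfVerticalV = { 2, 1, 1 };

    // 1/2 horizontal + vertical (4:2:0): 16x16 MCU.
    public static readonly byte[] BothH = { 2, 1, 1 };
    public static readonly byte[] BothV = { 2, 1, 1 };
}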

In the simplest case (no subsampling), each MCU (Minimum Coded Unit) is an 8x8 pixel block consisting of three components: Y, Cb and Cr. The image is processed in 8x8 pixel blocks; the three color components are separated out, passed through the DCT, and written to the file in the order (Y, Cb, Cr). In every subsampling case a DCT block is always 8x8 coefficients, or 64 values, but what those values mean depends on the chroma subsampling.
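Since the question's code is C#, here is a minimal sketch of that no-subsampling loop. It is essentially what the question's WriteStartOfScan already does; EncodeBlock is a hypothetical stand-in for the FDCT + quantize + Huffman steps, not a real library method:

// 4:4:4 sketch: each MCU is one 8x8 block per component, written in the order Y, Cb, Cr.
private void Encode444(ImageBase image, EndianBinaryWriter writer)
{
    float[] yU = new float[64];
    float[] cbU = new float[64];
    float[] crU = new float[64];
    int[] dcValues = new int[3];

    for (int y = 0; y < image.Height; y += 8)
    {
        for (int x = 0; x < image.Width; x += 8)
        {
            // Fill the three 8x8 buffers for this block (from the question's code).
            this.RgbToYcbCr(image, yU, cbU, crU, x, y);

            // Hypothetical helper: FDCT, quantize and Huffman-encode one 8x8 block.
            this.EncodeBlock(yU, 0, writer, dcValues);  // Y
            this.EncodeBlock(cbU, 1, writer, dcValues); // Cb
            this.EncodeBlock(crU, 2, writer, dcValues); // Cr
        }
    }
}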

The next simplest case is subsampling in one dimension (horizontal or vertical). For this example we will use 1/2 horizontal subsampling. The MCU is now 16 pixels wide by 8 pixels tall, and the compressed output of each MCU is four 8x8 DCT blocks (Y0, Y1, Cb, Cr). Y0 holds the luma values of the left 8x8 pixel block and Y1 holds the luma values of the right 8x8 pixel block. Cb and Cr are each a single 8x8 block built from the averages of horizontal pixel pairs. I could not find any good images to insert here, but some pseudo-code may come in handy.

(Update: an image illustrating the subsampling layout was attached here.)

Here is a simple loop that performs the chroma subsampling for our 1/2 horizontal case:

unsigned char ucCb[8][8], ucCr[8][8];
int x, y;

for (y = 0; y < 8; y++)
{
    for (x = 0; x < 8; x++)
    {
        ucCb[y][x] = (srcCb[y][x*2] + srcCb[y][(x*2)+1] + 1) / 2; // average each horiz pair
        ucCr[y][x] = (srcCr[y][x*2] + srcCr[y][(x*2)+1] + 1) / 2;
    } // for x
} // for y

As you can see, there is not much to it. Each horizontal pair of Cb and Cr pixels from the source image is averaged to form a new Cb/Cr pixel. The blocks are then DCT-transformed, zigzag-ordered and encoded the same way as always.
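Since the question asks for C#, here is a rough sketch of how an entire 16x8 MCU could be assembled and written in the Y0, Y1, Cb, Cr order. GetLumaBlock, GetChromaBlock and EncodeBlock are hypothetical helpers standing in for the question's RgbToYcbCr and FDCT/quantize/Huffman steps, not real library methods:

// 4:2:2 sketch: a 16x8 MCU produces four 8x8 DCT blocks in the order Y0, Y1, Cb, Cr.
private void Encode422Mcu(ImageBase image, EndianBinaryWriter writer, int[] dcValues, int mcuX, int mcuY)
{
    float[] y0 = new float[64], y1 = new float[64];
    float[] cbWide = new float[128], crWide = new float[128]; // full-resolution 16x8 chroma
    float[] cb = new float[64], cr = new float[64];

    // Hypothetical helper: fills an 8x8 luma block starting at (x, y).
    this.GetLumaBlock(image, y0, mcuX, mcuY);     // left half
    this.GetLumaBlock(image, y1, mcuX + 8, mcuY); // right half

    // Hypothetical helper: fills the full-resolution 16x8 chroma for this MCU.
    this.GetChromaBlock(image, cbWide, crWide, mcuX, mcuY, 16, 8);

    // Average each horizontal pair of chroma samples down to one 8x8 block.
    for (int row = 0; row < 8; row++)
    {
        for (int col = 0; col < 8; col++)
        {
            int src = (row * 16) + (col * 2);
            cb[(row * 8) + col] = (cbWide[src] + cbWide[src + 1]) / 2f;
            cr[(row * 8) + col] = (crWide[src] + crWide[src + 1]) / 2f;
        }
    }

    // Write order inside the MCU: Y0, Y1, Cb, Cr (both Y blocks share component index 0).
    this.EncodeBlock(y0, 0, writer, dcValues);
    this.EncodeBlock(y1, 0, writer, dcValues);
    this.EncodeBlock(cb, 1, writer, dcValues);
    this.EncodeBlock(cr, 2, writer, dcValues);
}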

Finally, for the 2x2 subsampling case (4:2:0), the MCU is now 16x16 pixels and the DCT blocks are written as Y0, Y1, Y2, Y3, Cb, Cr, where Y0 holds the upper-left 8x8 luma block, Y1 the upper-right, Y2 the lower-left and Y3 the lower-right. The Cb and Cr values in this case each represent four source pixels (a 2x2 group) averaged together. In case you are wondering, the color values are averaged in the YCbCr color space; averaging the pixels in the RGB color space would not work correctly.
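And a matching C# sketch for the 2x2 (4:2:0) case, using the same hypothetical helpers as above: average each 2x2 group of chroma samples and write the six blocks in the order Y0, Y1, Y2, Y3, Cb, Cr:

// 4:2:0 sketch: a 16x16 MCU produces six 8x8 DCT blocks: Y0, Y1, Y2, Y3, Cb, Cr.
private void Encode420Mcu(ImageBase image, EndianBinaryWriter writer, int[] dcValues, int mcuX, int mcuY)
{
    float[] y0 = new float[64], y1 = new float[64], y2 = new float[64], y3 = new float[64];
    float[] cbWide = new float[256], crWide = new float[256]; // full-resolution 16x16 chroma
    float[] cb = new float[64], cr = new float[64];

    this.GetLumaBlock(image, y0, mcuX, mcuY);         // upper-left
    this.GetLumaBlock(image, y1, mcuX + 8, mcuY);     // upper-right
    this.GetLumaBlock(image, y2, mcuX, mcuY + 8);     // lower-left
    this.GetLumaBlock(image, y3, mcuX + 8, mcuY + 8); // lower-right

    this.GetChromaBlock(image, cbWide, crWide, mcuX, mcuY, 16, 16);

    // Average each 2x2 group of chroma samples (in YCbCr space) to one sample.
    for (int row = 0; row < 8; row++)
    {
        for (int col = 0; col < 8; col++)
        {
            int src = (row * 2 * 16) + (col * 2); // top-left of the 2x2 group
            cb[(row * 8) + col] = (cbWide[src] + cbWide[src + 1] + cbWide[src + 16] + cbWide[src + 17]) / 4f;
            cr[(row * 8) + col] = (crWide[src] + crWide[src + 1] + crWide[src + 16] + crWide[src + 17]) / 4f;
        }
    }

    // Write order inside the MCU: Y0, Y1, Y2, Y3, then Cb, Cr.
    this.EncodeBlock(y0, 0, writer, dcValues);
    this.EncodeBlock(y1, 0, writer, dcValues);
    this.EncodeBlock(y2, 0, writer, dcValues);
    this.EncodeBlock(y3, 0, writer, dcValues);
    this.EncodeBlock(cb, 1, writer, dcValues);
    this.EncodeBlock(cr, 2, writer, dcValues);
}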

FYI - Adobe supports JPEG images in the RGB color space (instead of YCbCr). These images cannot use color subsampling because R, G, and B are of equal importance, and subsampling in this color space will result in significantly worse visual artifacts.



