JPEG color subsampling can be implemented in a simple but functional way without a lot of code. The basic idea is that your eyes are less sensitive to changes in color than to changes in brightness, so a JPEG file can be made much smaller by discarding some of the color information. There are many ways to subsample the color information, but JPEG images tend to use one of 4 options: none, 1/2 horizontal, 1/2 vertical, and 1/2 horizontal + vertical. There are additional TIFF / EXIF options, such as the "center point" of the subsampled color, but for simplicity we will just average the source values.
In the simplest case (no subsampling), each MCU (minimum coded unit) is an 8x8 pixel block made up of 3 components: Y, Cb, Cr. The image is processed in blocks of 8x8 pixels, where the 3 color components are separated, passed through the DCT, and written to the file in the order (Y, Cb, Cr). In all subsampling cases, the DCT blocks are always 8x8 coefficients (64 values), but what those values represent depends on the color subsampling.
The next simplest case is subsampling in one dimension (horizontal or vertical). For this example, let's use 1/2 horizontal subsampling. The MCU is now 16 pixels wide by 8 pixels tall. The compressed output of each MCU will now be 4 8x8 DCT blocks (Y0, Y1, Cb, Cr). Y0 represents the brightness (luma) values of the left 8x8 pixel block, and Y1 represents the brightness values of the right 8x8 pixel block. The Cb and Cr values are each a single 8x8 block based on the average of horizontal pairs of pixels. I could not find any good images to insert here, but some pseudocode may come in handy.
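To make the block bookkeeping concrete, here is a small sketch that computes the MCU size and the number of 8x8 DCT blocks per MCU for the four options listed above. The names `McuLayout` and `mcu_layout` are made up for illustration and are not part of any JPEG library.

```c
/* Hypothetical helper type, for illustration only. */
typedef struct {
    int width;        /* MCU width in pixels */
    int height;       /* MCU height in pixels */
    int luma_blocks;  /* number of 8x8 Y blocks per MCU */
    int total_blocks; /* Y blocks plus one Cb block and one Cr block */
} McuLayout;

/* sub_h / sub_v: nonzero when the color is halved in that direction. */
static McuLayout mcu_layout(int sub_h, int sub_v)
{
    McuLayout m;
    m.width  = sub_h ? 16 : 8;
    m.height = sub_v ? 16 : 8;
    m.luma_blocks  = (m.width / 8) * (m.height / 8);
    m.total_blocks = m.luma_blocks + 2; /* always exactly one Cb and one Cr */
    return m;
}
```

For example, `mcu_layout(1, 1)` describes a 16x16 MCU with 6 blocks. Those 6 blocks hold 384 values for 256 pixels, or 1.5 values per pixel instead of the 3 per pixel you get without subsampling.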
Here is a simple loop that performs color sub-sampling in our 1/2 horizontal case:
unsigned char ucCb[8][8], ucCr[8][8];
int x, y;

for (y=0; y<8; y++)
{
   for (x=0; x<8; x++)
   {
      ucCb[y][x] = (srcCb[y][x*2] + srcCb[y][(x*2)+1] + 1)/2; // average each horiz pair
      ucCr[y][x] = (srcCr[y][x*2] + srcCr[y][(x*2)+1] + 1)/2;
   } // for x
} // for y
As you can see, there's not much to it. Each horizontal pair of Cb and Cr pixels from the source image is averaged to form one new Cb/Cr pixel. The resulting blocks are then DCT-transformed, zigzag-ordered, and encoded the same way as always.
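The luma side of the same 16x8 MCU needs no averaging, only splitting into the left and right blocks. A minimal sketch, assuming the luma for one MCU has already been gathered into a 16-wide by 8-tall array (the name `split_luma_h` and the array layout are my own assumptions):

```c
/* Hypothetical helper: split a 16x8 luma region into the
   left (Y0) and right (Y1) 8x8 blocks of a 1/2 horizontal MCU. */
static void split_luma_h(unsigned char srcY[8][16],
                         unsigned char ucY0[8][8],
                         unsigned char ucY1[8][8])
{
    int x, y;
    for (y = 0; y < 8; y++) {
        for (x = 0; x < 8; x++) {
            ucY0[y][x] = srcY[y][x];     /* left 8 columns  */
            ucY1[y][x] = srcY[y][x + 8]; /* right 8 columns */
        }
    }
}
```

The two luma blocks keep full resolution; only Cb and Cr lose half of their horizontal detail.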
Finally, for the 2x2 subsampling case, the MCU is now 16x16 pixels and the DCT blocks written will be Y0, Y1, Y2, Y3, Cb, Cr, where Y0 represents the upper-left 8x8 block of brightness pixels, Y1 the upper right, Y2 the lower left, and Y3 the lower right. Each Cb and Cr value in this case represents 4 source pixels (2x2) that have been averaged together. In case you were wondering, the color values are averaged in the YCbCr color space. If you average the pixels together in the RGB color space instead, it won't work correctly.
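The horizontal loop extends naturally to the 2x2 case. This is a sketch under the assumption that one MCU's worth of a single color plane (Cb or Cr) has been gathered into a 16x16 array; `subsample_2x2` is a made-up name, and you would call it once for Cb and once for Cr.

```c
/* Hypothetical helper: average each 2x2 group of a 16x16 color
   plane (already in YCbCr) into one value of an 8x8 block. */
static void subsample_2x2(unsigned char src[16][16], unsigned char dst[8][8])
{
    int x, y;
    for (y = 0; y < 8; y++) {
        for (x = 0; x < 8; x++) {
            int sum = src[y*2][x*2]     + src[y*2][x*2 + 1]
                    + src[y*2 + 1][x*2] + src[y*2 + 1][x*2 + 1];
            dst[y][x] = (unsigned char)((sum + 2) / 4); /* +2 rounds to nearest */
        }
    }
}
```

The sum is held in an `int`, so adding four 8-bit values cannot overflow before the division.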
FYI - Adobe supports JPEG images in the RGB color space (instead of YCbCr). These images can't use color subsampling because R, G, and B are of equal importance, and subsampling in this color space would result in much worse visual artifacts.