There are many ways to solve this task, all of which use some arithmetic to convert from the ASCII 0-9 and a-f (or A-F) character ranges to binary. I wanted to find a solution that uses only a lookup table and benchmark it against solutions that use arithmetic instead. Oddly enough, none of the above answers gives a purely arithmetic solution either, and some answers even assume that "converting to binary" means converting the input to an ASCII string made of the characters "0" and "1". Here, "binary" means that, for example, the two hex characters "2a" become the single byte 0x2a.
Let's do some setup first. Since the question was about converting files, our implementation should be fast on large inputs, so we want all test data in memory to keep disk I/O from skewing the benchmark. To that end, I create a header with a char array "testdata" of 104857600 bytes, approximately 105 MB:
$ { printf "char *testdata =\""; \
    cat /dev/urandom \
    | tr -d -c "0123456789abcdefABCDEF" \
    | dd count=100 iflag=fullblock bs=1M; \
    printf "\";\n"; } > testdata.h
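If you want to verify that the generated header contains the expected amount of data, a minimal sketch like the following (my own addition, not part of the benchmark) prints the length of the embedded string:

#include <stdio.h>
#include <string.h>
#include "testdata.h"

int main(void)
{
    /* testdata contains only hex characters and no NUL bytes, so strlen()
     * gives the number of characters; this should print 104857600 */
    printf("%zu\n", strlen(testdata));
    return 0;
}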
Next, we create the lookup tables. I see two possible ways of solving this problem with a lookup table: either the table maps a single ASCII hex character to a half byte (nibble), or it maps two hex characters to a full byte. In the first case, the lookup table needs 256 entries. In the latter case, it needs 256*256 = 65536 entries. We can reduce the size of the latter by realizing that the most significant bit of the 16-bit value formed by two ASCII characters is never set (ASCII only uses 7 bits per character), so a table of 128*256 = 32768 entries is enough. Since that variant also requires an additional calculation step (applying a bitmask), we compare both. As a result, we get the following test cases (a short sketch of how the two-character index is formed follows the list):
- arithmetic solution
- 256-entry lookup table
- 32768-entry lookup table
- 65536-entry lookup table
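To make the two-character tables less abstract, here is a minimal sketch (my own illustration, not part of the benchmark, and assuming a little-endian machine just like the benchmark code below) of how a pair of hex characters turns into a table index:

#include <stdio.h>
#include <stdint.h>
#include <string.h>

int main(void)
{
    const char hex[2] = { '2', 'a' };   /* this pair should decode to the byte 0x2a */
    uint16_t index;

    /* same effect as the *(uint16_t*) cast used in test3()/test4() below */
    memcpy(&index, hex, sizeof(index));

    /* on little-endian: low byte = first character, high byte = second character */
    printf("index          = 0x%04x\n", (unsigned)index);            /* 0x6132 */
    /* both characters are 7-bit ascii, so the top bit is never set and masking
     * with 0x7fff changes nothing -- that is why 32768 entries suffice */
    printf("index & 0x7fff = 0x%04x\n", (unsigned)(index & 0x7fff)); /* 0x6132 */
    return 0;
}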
The lookup tables are easily generated with a bit of Python:
#!/usr/bin/env python
# python 2 script (uses xrange and str-based struct.unpack)
import sys,struct

# table 1: maps a single ascii character to its hex value (non-hex chars map to 0)
sys.stdout.write("unsigned char base16_decoding_table1[256] = {\n")
for i in xrange(256):
    try:
        j = str(int(chr(i), 16))
    except:
        j = '0'
    sys.stdout.write(j+',')
sys.stdout.write("};\n")

sys.stdout.write("\n")

# table 2: maps two ascii characters, read as a little-endian uint16_t, to the
# byte they encode; only 128*256 entries are needed because the top bit is never set
l = 128*256*["0"]
for a in ['0','1','2','3','4','5','6','7','8','9','a','b','c','d','e','f','A','B','C','D','E','F']:
    for b in ['0','1','2','3','4','5','6','7','8','9','a','b','c','d','e','f','A','B','C','D','E','F']:
        l[struct.unpack("<H", a+b)[0]] = str(int(a+b, 16))
line = "unsigned char base16_decoding_table2[%d] = {"%(128*256)
for e in l:
    line += e+","
    if len(line) > 70:
        sys.stdout.write(line+"\n")
        line = ""
sys.stdout.write(line+"};\n")

sys.stdout.write("\n")

# table 3: same as table 2 but indexed by the full 16 bit value
l = 256*256*["0"]
for a in ['0','1','2','3','4','5','6','7','8','9','a','b','c','d','e','f','A','B','C','D','E','F']:
    for b in ['0','1','2','3','4','5','6','7','8','9','a','b','c','d','e','f','A','B','C','D','E','F']:
        l[struct.unpack("<H", a+b)[0]] = str(int(a+b, 16))
line = "unsigned char base16_decoding_table3[%d] = {"%(256*256)
for e in l:
    line += e+","
    if len(line) > 70:
        sys.stdout.write(line+"\n")
        line = ""
sys.stdout.write(line+"};\n")
And then:
python gen.py > base16_decoding_table.h
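Before running the full benchmark, a minimal sketch like this one (my own quick check, assuming the header was generated as above) shows how the 256-entry table is meant to be used to decode a single byte:

#include <stdio.h>
#include "base16_decoding_table.h"

int main(void)
{
    /* decode the two hex characters "2a" into one byte: the first character
     * becomes the high nibble, the second one the low nibble */
    const char *hex = "2a";
    unsigned char byte = (base16_decoding_table1[(int)hex[0]] << 4)
                       | base16_decoding_table1[(int)hex[1]];
    printf("0x%02x\n", byte);   /* prints 0x2a */
    return 0;
}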
Now we can write C code for testing.
#include <stdio.h>
#include <time.h>
#include <inttypes.h>
#include "testdata.h"
#include "base16_decoding_table.h"

#define TESTDATALEN 104857600

/* the resulting binary string is half the size of the input hex string
 * because every two hex characters map to one byte */
unsigned char result[TESTDATALEN/2];

/* arithmetic solution: compute each nibble from the ascii code */
void test1()
{
    size_t i;
    char cur;
    unsigned char val;
    for (i = 0; i < TESTDATALEN; i++) {
        cur = testdata[i];
        if (cur >= 97) {
            val = cur - 97 + 10;   /* 'a'..'f' */
        } else if (cur >= 65) {
            val = cur - 65 + 10;   /* 'A'..'F' */
        } else {
            val = cur - 48;        /* '0'..'9' */
        }
        /* even characters are the first half, odd characters the second half
         * of the current output byte */
        if (i%2 == 0) {
            result[i/2] = val << 4;
        } else {
            result[i/2] |= val;
        }
    }
}

/* 256-entry table: one lookup per character */
void test2()
{
    size_t i;
    char cur;
    unsigned char val;
    for (i = 0; i < TESTDATALEN; i++) {
        cur = testdata[i];
        val = base16_decoding_table1[(int)cur];
        /* even characters are the first half, odd characters the second half
         * of the current output byte */
        if (i%2 == 0) {
            result[i/2] = val << 4;
        } else {
            result[i/2] |= val;
        }
    }
}

/* 32768-entry table: one lookup per character pair, plus a bitmask */
void test3()
{
    size_t i;
    uint16_t *cur;
    unsigned char val;
    for (i = 0; i < TESTDATALEN; i+=2) {
        cur = (uint16_t*)(testdata+i);
        /* apply the bitmask to make sure the most significant bit is zero */
        val = base16_decoding_table2[*cur & 0x7fff];
        result[i/2] = val;
    }
}

/* 65536-entry table: one lookup per character pair, no bitmask */
void test4()
{
    size_t i;
    uint16_t *cur;
    unsigned char val;
    for (i = 0; i < TESTDATALEN; i+=2) {
        cur = (uint16_t*)(testdata+i);
        val = base16_decoding_table3[*cur];
        result[i/2] = val;
    }
}

#define NUMTESTS 1000

int main()
{
    struct timespec before, after;
    unsigned long long checksum;
    int i;
    double elapsed;

    clock_gettime(CLOCK_MONOTONIC, &before);
    for (i = 0; i < NUMTESTS; i++) {
        test1();
    }
    clock_gettime(CLOCK_MONOTONIC, &after);
    checksum = 0;
    for (i = 0; i < TESTDATALEN/2; i++) {
        checksum += result[i];
    }
    printf("checksum: %llu\n", checksum);
    elapsed = difftime(after.tv_sec, before.tv_sec)
            + (after.tv_nsec - before.tv_nsec)/1.0e9;
    printf("arithmetic solution took %f seconds\n", elapsed);

    clock_gettime(CLOCK_MONOTONIC, &before);
    for (i = 0; i < NUMTESTS; i++) {
        test2();
    }
    clock_gettime(CLOCK_MONOTONIC, &after);
    checksum = 0;
    for (i = 0; i < TESTDATALEN/2; i++) {
        checksum += result[i];
    }
    printf("checksum: %llu\n", checksum);
    elapsed = difftime(after.tv_sec, before.tv_sec)
            + (after.tv_nsec - before.tv_nsec)/1.0e9;
    printf("256 entries table took %f seconds\n", elapsed);

    clock_gettime(CLOCK_MONOTONIC, &before);
    for (i = 0; i < NUMTESTS; i++) {
        test3();
    }
    clock_gettime(CLOCK_MONOTONIC, &after);
    checksum = 0;
    for (i = 0; i < TESTDATALEN/2; i++) {
        checksum += result[i];
    }
    printf("checksum: %llu\n", checksum);
    elapsed = difftime(after.tv_sec, before.tv_sec)
            + (after.tv_nsec - before.tv_nsec)/1.0e9;
    printf("32768 entries table took %f seconds\n", elapsed);

    clock_gettime(CLOCK_MONOTONIC, &before);
    for (i = 0; i < NUMTESTS; i++) {
        test4();
    }
    clock_gettime(CLOCK_MONOTONIC, &after);
    checksum = 0;
    for (i = 0; i < TESTDATALEN/2; i++) {
        checksum += result[i];
    }
    printf("checksum: %llu\n", checksum);
    elapsed = difftime(after.tv_sec, before.tv_sec)
            + (after.tv_nsec - before.tv_nsec)/1.0e9;
    printf("65536 entries table took %f seconds\n", elapsed);

    return 0;
}
Let's compile the thing:
$ gcc -O3 -g -Wall -Wextra test.c
And run it:
$ ./a.out
Result:
- arithmetic solution: 437.17 s
- 256-entry lookup table: 117.80 s
- 32768-entry lookup table: 52.33 s
- 65536-entry lookup table: 44.66 s
We can conclude that lookup tables beat the arithmetic solution by a wide margin, and that the extra memory spent on larger lookup tables is rewarded with additional runtime savings.
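Since the original question was about converting a file, here is a minimal, self-contained sketch (my own addition, not part of the benchmark, assuming clean hex input with no stray characters) of how the table approach could be applied as a stdin-to-stdout filter; for simplicity the 256-entry table is built at runtime instead of being generated into a header:

#include <stdio.h>

int main(void)
{
    unsigned char table[256] = {0};
    int c, hi = -1;

    /* build the same mapping as base16_decoding_table1 at runtime */
    for (c = '0'; c <= '9'; c++) table[c] = c - '0';
    for (c = 'a'; c <= 'f'; c++) table[c] = c - 'a' + 10;
    for (c = 'A'; c <= 'F'; c++) table[c] = c - 'A' + 10;

    /* usage (hypothetical file names): ./a.out < input.hex > output.bin */
    while ((c = getchar()) != EOF) {
        if (hi < 0) {
            hi = table[c] << 4;       /* first character of a pair: high nibble */
        } else {
            putchar(hi | table[c]);   /* second character: low nibble */
            hi = -1;
        }
    }
    return 0;
}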