Trading memory for speed in DES enciphering

I finally found some time to post code I wrote two years ago.
The goal was to trade memory for speed,
and it worked out pretty well.
The code enciphers an arbitrary file with DES in ECB mode.
It is a pity I have not improved the key scheduling yet, although it takes constant time regardless of the input data.
Still, I am glad that the main enciphering function now does only this:

void des_encipher(UI32 & data_left, UI32 &data_right, des_element * keys){
	des_block data;
	ip(data_left, data_right, data);

	des_element tmp;
	UI64 result; // 'register' dropped: modern compilers ignore it and C++17 removed it

	for(int i=0; i<16; ++i){
		tmp = data.left;
		data.left = data.right;

		//XORing with key
		*(UI64*) &data.right.bytes[0] ^= *(UI64*) &keys++->bytes;
		//Applying S boxes
		result = CONST_S1[*(UI8*) data.right.bytes];
		result |= CONST_S2[*((UI8*) data.right.bytes+1)];
		result |= CONST_S3[*((UI8*) data.right.bytes+2)];
		result |= CONST_S4[*((UI8*) data.right.bytes+3)];
		result |= CONST_S5[*((UI8*) data.right.bytes+4)];
		result |= CONST_S6[*((UI8*) data.right.bytes+5)];
		result |= CONST_S7[*((UI8*) data.right.bytes+6)];
		result |= CONST_S8[*((UI8*) data.right.bytes+7)];

		result ^= *(UI64*) &tmp.bytes;
		*(UI64*) & data.right.bytes  = result;

	}

	result = CONST_FINAL_LEFT1[*(UI8*) data.right.bytes];
	result |= CONST_FINAL_LEFT2[*((UI8*) data.right.bytes+1)];
	result |= CONST_FINAL_LEFT3[*((UI8*) data.right.bytes+2)];
	result |= CONST_FINAL_LEFT4[*((UI8*) data.right.bytes+3)];
	result |= CONST_FINAL_LEFT5[*((UI8*) data.right.bytes+4)];
	result |= CONST_FINAL_LEFT6[*((UI8*) data.right.bytes+5)];
	result |= CONST_FINAL_LEFT7[*((UI8*) data.right.bytes+6)];
	result |= CONST_FINAL_LEFT8[*((UI8*) data.right.bytes+7)];

	result |= CONST_FINAL_RIGHT1[*(UI8*) data.left.bytes];
	result |= CONST_FINAL_RIGHT2[*((UI8*) data.left.bytes+1)];
	result |= CONST_FINAL_RIGHT3[*((UI8*) data.left.bytes+2)];
	result |= CONST_FINAL_RIGHT4[*((UI8*) data.left.bytes+3)];
	result |= CONST_FINAL_RIGHT5[*((UI8*) data.left.bytes+4)];
	result |= CONST_FINAL_RIGHT6[*((UI8*) data.left.bytes+5)];
	result |= CONST_FINAL_RIGHT7[*((UI8*) data.left.bytes+6)];
	result |= CONST_FINAL_RIGHT8[*((UI8*) data.left.bytes+7)];
	data_right = (UI32) result;
	data_left = (UI32) (result>>32);
}
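
The final block of lookups above replaces a bit-by-bit permutation with eight table reads that are ORed together. A minimal sketch of how such tables could be built for a generic 64-bit permutation is shown here. The names `Perm`, `ByteTables`, `permute_slow`, `build_tables` and `permute_fast` are hypothetical and are not taken from the project, and the real `CONST_FINAL_*` tables may be laid out differently.

```cpp
#include <array>
#include <cstdint>

// perm[i] is the index of the input bit that ends up in output bit i.
using Perm = std::array<int, 64>;
// One 256-entry table per input byte: 8 * 256 * 8 bytes = 16 KB of memory.
using ByteTables = std::array<std::array<std::uint64_t, 256>, 8>;

// Reference version: moves one bit at a time (64 shifts and masks per call).
std::uint64_t permute_slow(std::uint64_t x, const Perm& perm) {
    std::uint64_t out = 0;
    for (int i = 0; i < 64; ++i)
        out |= ((x >> perm[i]) & 1ULL) << i;
    return out;
}

// Precompute, for every byte position and every byte value, where the bits
// of that byte land after the permutation.
ByteTables build_tables(const Perm& perm) {
    ByteTables tables{};
    for (int b = 0; b < 8; ++b)
        for (int v = 0; v < 256; ++v)
            tables[b][v] =
                permute_slow(static_cast<std::uint64_t>(v) << (8 * b), perm);
    return tables;
}

// Fast version: eight lookups ORed together. This works because every output
// bit depends on exactly one input bit, so the bytes can be handled separately.
std::uint64_t permute_fast(std::uint64_t x, const ByteTables& tables) {
    std::uint64_t out = 0;
    for (int b = 0; b < 8; ++b)
        out |= tables[b][(x >> (8 * b)) & 0xFF];
    return out;
}
```

The tables cost 16 KB per permutation in this sketch, which is the memory being traded for the shorter inner loop.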


The VS7 solution with the DES source code contains two projects.
The first is a DES enciphering example based on Eric Young’s libdes version 4.01.
I included it to compare the performance of my implementation against solid, stable code.
libdes is currently used in OpenSSL and is available on most operating systems.
The second project does the same job but uses my code for enciphering.
It also prints a timing report showing how long the enciphering took.
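
A timing report of this kind can be sketched with `std::chrono`. This is only an illustration of the idea, not the code from the project; `seconds_taken` and `megabytes_per_second` are hypothetical helper names.

```cpp
#include <chrono>
#include <cstddef>

// Runs the given callable once and returns the elapsed wall-clock time in
// seconds. steady_clock is used because it never jumps backwards.
template <typename F>
double seconds_taken(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Converts a byte count and a duration into throughput in MB/s
// (1 MB is taken as 1024 * 1024 bytes here).
double megabytes_per_second(std::size_t bytes, double seconds) {
    return (bytes / (1024.0 * 1024.0)) / seconds;
}
```

Wrapping the whole enciphering loop in `seconds_taken` includes key scheduling in the measurement, which matters when comparing two implementations on small files.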

Usage:

des.exe input_file output_file
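
For readers unfamiliar with ECB mode, the driver around the block function can be sketched roughly as follows: the data is split into 8-byte blocks and each block is enciphered independently. This sketch works on an in-memory buffer rather than a file. The zero padding of the last block and the big-endian packing of bytes into the two 32-bit halves are assumptions made for illustration, and `ecb_encipher` and `BlockFn` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Any function that enciphers one 64-bit block given as two 32-bit halves,
// in the same spirit as des_encipher(data_left, data_right, keys).
using BlockFn = std::function<void(std::uint32_t& left, std::uint32_t& right)>;

std::vector<std::uint8_t> ecb_encipher(std::vector<std::uint8_t> data,
                                       const BlockFn& encipher_block) {
    // Assumption: pad the tail with zero bytes up to a multiple of 8.
    data.resize((data.size() + 7) / 8 * 8, 0);

    for (std::size_t off = 0; off < data.size(); off += 8) {
        // Assumption: bytes are packed into each half in big-endian order.
        std::uint32_t left = 0, right = 0;
        for (int i = 0; i < 4; ++i) {
            left = (left << 8) | data[off + i];
            right = (right << 8) | data[off + 4 + i];
        }

        // ECB: every block is processed on its own, with no chaining.
        encipher_block(left, right);

        // Unpack the two halves back into the buffer, last byte first.
        for (int i = 3; i >= 0; --i) {
            data[off + i] = static_cast<std::uint8_t>(left & 0xFF);
            data[off + 4 + i] = static_cast<std::uint8_t>(right & 0xFF);
            left >>= 8;
            right >>= 8;
        }
    }
    return data;
}
```

Because the blocks are independent, the key schedule is computed once and reused for every block, so its cost does not grow with the file size.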

Keys are hardcoded in both applications so that the two output files can be checked to be identical.
I am sure this implementation can be made at least 50% faster on 64-bit machines, since the lookup tables can be converted further.
On my Dell Inspiron with a 1.83 GHz dual-core CPU it takes 6 to 8 seconds to encipher an 84 MB file.
libdes is about 10% slower on the same task.

It was fun for me, since I did not expect a result competitive with a widely used public library.
This implementation is just a starting point, since I know it can be improved significantly.
Feel free to test or change it. Any comments are welcome.

By the way, on small files libdes will be faster because key scheduling is not optimized in my version.
Since key scheduling takes the same constant time for a file of any size, I left that part of the code to be improved later.
