AES Encryption 3 - Rust

Published on 2023-09-05 by Kartikay Bagla


The Python prototype for AES took 12 seconds to encrypt a 120KB file (~0.1MB/s). I am thoroughly unimpressed by that, and it's time to try writing it in Rust. Spoilers: it was able to encrypt a 1.09GB file in 20 seconds (>50MB/s). That is more than 500x faster. And that's just on a single thread.

This is part 3 of the series. Read part 2 here.

The Rust Journey Begins

I have gone through the entire Rust book once or twice, but I have very little hands-on experience coding in Rust beyond modifying examples and writing the grep example from the book. This was going to be my first Rust project where I wasn't following a tutorial but doing my own stuff.

First Steps

I ran cargo init and then hit a wall. The last time I read the Rust book was 6 months ago. Folder structure, organization etc. were just lost on me. Initially I decided to implement add_round_key directly in the main function to flex my smooth brain a bit.

The Big Brain Moment

As I began to write the code, a small wrinkle appeared in my brain, the twinkle of an idea.
I had been using 4x4 arrays of 1 byte each (each byte being an 8-bit unsigned integer) this whole time. What if I used a 4x1 array of 4 bytes each? This way each element would be a 32-bit unsigned integer and operations would probably be faster. Or what if I used just a single 128-bit unsigned integer? Gone would be the costly for loops where each element in the array is updated sequentially. All ops happen on a single block of memory and that's it.
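To make the three layouts concrete, here is a minimal sketch of how the same 16-byte block could be viewed as a 4x4 grid of u8, as four u32 words, or as one u128. The function names, the column-first grid layout and the big-endian packing are my assumptions for illustration, not necessarily what the project uses.

```rust
/// 4x4 grid of bytes. Assumed layout: grid[c][r] is byte r of column c,
/// so each inner array is four consecutive input bytes.
type GridU8 = [[u8; 4]; 4];

/// View the block as a 4x4 grid of single bytes.
fn bytes_to_grid(bytes: [u8; 16]) -> GridU8 {
    let mut grid = [[0u8; 4]; 4];
    for (i, byte) in bytes.iter().enumerate() {
        grid[i / 4][i % 4] = *byte;
    }
    grid
}

/// View the block as four 32-bit words (big-endian packing is an assumption).
fn bytes_to_words(bytes: [u8; 16]) -> [u32; 4] {
    let mut words = [0u32; 4];
    for (word, chunk) in words.iter_mut().zip(bytes.chunks_exact(4)) {
        *word = u32::from_be_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]);
    }
    words
}

/// View the block as a single 128-bit integer (big-endian packing is an assumption).
fn bytes_to_u128(bytes: [u8; 16]) -> u128 {
    u128::from_be_bytes(bytes)
}
```

All three hold exactly the same 16 bytes; only the granularity at which the CPU touches them changes.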

Wary from things going wrong in Python, I first wrote the code for 4x4 u8 with tests and then wrote the code for 4x1 u32 and for u128. The implementation for add_round_key was simple: just XOR everything. Now I needed to check whether my theory was correct, so I added a benchmarking crate and ran it. [Show numbers here.] It seems that u8:u32:u128 is about 1:2:4 in terms of speed, with u128 being the fastest. My theory was correct. However, I knew that the other operations are not as simple as this one, so their speeds might vary. So I just decided to write all the operations for u8, u32 and u128.
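For reference, add_round_key in the three representations could look roughly like this. It is a sketch under my own assumptions about names and signatures (the real code may differ), but it shows why the wider types need fewer operations: 16 XORs, then 4, then 1.

```rust
/// 4x4 u8 version: XOR every byte of the state with the matching key byte.
fn add_round_key_u8(state: &mut [[u8; 4]; 4], round_key: &[[u8; 4]; 4]) {
    for (state_col, key_col) in state.iter_mut().zip(round_key.iter()) {
        for (s, k) in state_col.iter_mut().zip(key_col.iter()) {
            *s ^= *k;
        }
    }
}

/// 4x1 u32 version: four word-sized XORs instead of sixteen byte-sized ones.
fn add_round_key_u32(state: &mut [u32; 4], round_key: &[u32; 4]) {
    for (s, k) in state.iter_mut().zip(round_key.iter()) {
        *s ^= *k;
    }
}

/// u128 version: the whole block is XORed in a single expression, no loop.
fn add_round_key_u128(state: u128, round_key: u128) -> u128 {
    state ^ round_key
}
```

Since XOR is its own inverse, applying the same round key twice gives back the original state, which makes for an easy sanity test in all three versions.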

Discuss Implementation here

Rust Takeaways

Stack good, Heap bad

Rust is nice as long as everything is on the stack. If you look at the code up to encrypt_block, it is clean and sexy because all the inputs, intermediaries and outputs have a well-defined structure. But as soon as user input came in, like the file size not being divisible by 16, code quality went to shit with references/pointers (I still don't know the difference between them), cloning of strings etc. Contrast this with Python, where you never have to think about heap vs stack and code quality is always shit.
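To illustrate where the heap creeps in, here is a minimal sketch of handling input whose length is not a multiple of 16. It assumes PKCS#7-style padding, which is a common choice but not necessarily what this project does; pkcs7_pad and to_blocks are hypothetical helper names.

```rust
const BLOCK_SIZE: usize = 16;

/// Pad data to a multiple of 16 bytes, PKCS#7 style (assumed scheme):
/// append n bytes of value n, where n is between 1 and 16.
/// Input that is already block-aligned gets a full extra block of padding.
fn pkcs7_pad(data: &[u8]) -> Vec<u8> {
    let pad_len = BLOCK_SIZE - (data.len() % BLOCK_SIZE);
    let mut padded = Vec::with_capacity(data.len() + pad_len);
    padded.extend_from_slice(data);
    padded.extend(std::iter::repeat(pad_len as u8).take(pad_len));
    padded
}

/// Split padded data into fixed-size blocks that can live on the stack again.
fn to_blocks(padded: &[u8]) -> Vec<[u8; BLOCK_SIZE]> {
    padded
        .chunks_exact(BLOCK_SIZE)
        .map(|chunk| {
            let mut block = [0u8; BLOCK_SIZE];
            block.copy_from_slice(chunk);
            block
        })
        .collect()
}
```

The unknown input length is what forces the Vec (heap) here. Once the data is cut into [u8; 16] blocks, everything downstream can go back to fixed-size values.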

I AM SPEEEEEEED

The release build managed to encrypt a 1.09GB file in 20 seconds. I was expecting an improvement over Python based on the benchmark numbers, but not this much of an improvement.

Next steps for the Project

Finish off the other features (show list from README).

Next steps for me

I wrote a program in Python that goes through a video, selects the dominant colour from each frame and draws a line for it, to get a colour palette (or feel or whatever) for the entire video. I distinctly remember it being slow in Python, to the point where I first had to resize the video with ffmpeg to about 360p and then feed it in. And even then it took a while.

Now that this project was a success, I guess it's time to rewrite that in Rust as well.