Improving XOR Encryption in Zig

When encrypting shellcode, I prefer XOR over something like AES. This is partly because XOR-encrypted data tends to have lower entropy than AES ciphertext, so a binary might look less malicious, but also because XOR decryption is easy to implement without pulling in any special crypto libraries.

XOR encryption is usually done with a single-byte “key”, written either as a hexadecimal value such as 0x41 or as a character such as A. It’s simple and efficient, but with only 256 possible keys it is susceptible to brute forcing. Using a “rotating key” (a multi-byte key that repeats across the data) instead of a single character is a huge improvement against brute forcing, and it is still simple to implement.

However, there is still one big problem. If you inspect the XOR encrypted data, you will see that you are actually leaking your key in the data!

This blog will go over how to implement a rotating key, and also how to prevent the decryption key from being leaked in the encrypted data. This will also serve as a first look at the Zig programming language. Zig has been gaining popularity within the past several months and has definitely piqued my interest. Playing with XOR encryption seems like a great way to take Zig out for a spin.

What is Zig?

According to ziglang.org:

Zig is a general-purpose programming language and toolchain for maintaining robust, optimal and reusable software.

Zig is a relatively simple language and works wonderfully with existing C code. Zig comes with Clang embedded in the executable, so you can use it as a C/C++ compiler with the same arguments you would use for Clang. You can also import and use symbols from C header files.

You can even convert existing C code to Zig with the translate-c command line option.

zig translate-c main.c > awesome.zig

Zig also supports compile-time code generation and lazy evaluation. Polymorphic malware anyone?!?!

Zig is also very fast. Just look at this mesmerizing GIF of 1 billion nested loop iterations comparing different programming languages. Now you shouldn’t be using nested loops in the first place, but if you do, Zig has you covered ;).

I could go on about Zig (just wait till you see how easy it is to call WinAPIs), but I’ll leave it here for now.

After you install Zig, create a main.zig file with the following content:

const std = @import("std");

pub fn main() !void {
    const fileBuffer: []const u8 = @embedFile("shellcode.bin");

    std.debug.print("[+] File length: {d}\n", .{fileBuffer.len});

    var buffer: [fileBuffer.len]u8 = undefined;

    try singleKeyXOR(fileBuffer, &buffer);
}

We can use @embedFile to load the shellcode into a byte array and print the file length with std.debug.print from the standard library. We are also creating a buffer that will store the XOR encrypted shellcode. The shellcode we are using is generated from an open source C2.

Now you may look at this and wonder why there is no catch block after try. In Zig, try is shorthand for a catch that returns the error to the caller, so try parseU64(str, 10) is equivalent to this:

const number = parseU64(str, 10) catch |err| return err;
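To see the two forms side by side, here is a minimal sketch. The parseDigit helper and the withTry/withCatch names are made up for illustration and are not part of the standard library.

```zig
const std = @import("std");

// Hypothetical helper for illustration: fails on any non-digit input.
fn parseDigit(c: u8) error{NotADigit}!u8 {
    if (c < '0' or c > '9') return error.NotADigit;
    return c - '0';
}

// Uses `try`: on error, hand that error straight back to the caller.
fn withTry(c: u8) !u8 {
    const d = try parseDigit(c);
    return d + 1;
}

// The long form that `try` abbreviates.
fn withCatch(c: u8) !u8 {
    const d = parseDigit(c) catch |err| return err;
    return d + 1;
}

pub fn main() !void {
    std.debug.print("{d} {d}\n", .{ try withTry('4'), try withCatch('4') });
}
```

Both functions behave the same way for valid and invalid input, which is why main can simply use try and let any error bubble up.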

Our singleKeyXOR function will look like this:

pub fn singleKeyXOR(fileBuffer: []const u8, buffer: []u8) !void {
    const key: u8 = 'A';

    for (fileBuffer, 0..fileBuffer.len) |char, i| {
        buffer[i] = char ^ key;
    }

    const file = try std.fs.cwd().createFile("normal_xor.bin", .{ .read = true });

    defer file.close();

    try file.writeAll(buffer);
}

Our key is the character A. We iterate over the data, XOR each byte with the key, and store the result at the same position in the output buffer. We then write the contents of the buffer to a file.

We can build and run this with:

zig run .\main.zig

If we open the resulting binary file in VSCode, we can see all of the A characters.
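Those repeated A characters are enough to recover a single-byte key without trying all 256 values. The sketch below is one way to show that, assuming a tiny made-up byte sequence standing in for real shellcode. The mostFrequentByte helper is hypothetical.

```zig
const std = @import("std");

// Return the byte value that occurs most often in `data`.
fn mostFrequentByte(data: []const u8) u8 {
    var counts = [_]usize{0} ** 256;
    for (data) |b| counts[b] += 1;

    var best: u8 = 0;
    for (counts, 0..) |count, value| {
        if (count > counts[best]) best = @intCast(value);
    }
    return best;
}

pub fn main() void {
    // Stand-in for shellcode: mostly null padding around a few real bytes.
    const plain = [_]u8{ 0x48, 0x00, 0x00, 0x00, 0x31, 0x00, 0x00, 0xc3 };
    var enc: [plain.len]u8 = undefined;
    for (plain, 0..) |b, i| enc[i] = b ^ 'A';

    // Every null in the plaintext became the key byte in the ciphertext.
    std.debug.print("likely key: {c}\n", .{mostFrequentByte(&enc)});
}
```

This only works when null is the most common byte in the input, which is an assumption about the data rather than a guarantee.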

Next we will implement a rotating key approach. Create another function rotatingKeyXOR.

pub fn rotatingKeyXOR(fileBuffer: []const u8, buffer: []u8) !void {
    
    const key: []const u8 = "testing";

    var j: usize = 0;
    for (fileBuffer, 0..fileBuffer.len) |char, i| {
        if (j == key.len) {
            j = 0;
        }
        buffer[i] = char ^ key[j];
        j += 1;
    }

    const file = try std.fs.cwd().createFile("rotatingkey_xor.bin", .{ .read = true });

    defer file.close();

    try file.writeAll(buffer);

}

Instead of using a single character key, A, we are using a string of characters. This means each byte of the shellcode will be encrypted with one character of the string. E.g., byte 1 is XOR’d with t, byte 2 is XOR’d with e, byte 3 is XOR’d with s and so on.

This is great because it makes the data wayyy harder to decrypt through brute force. However, we still have one problem that is immediately obvious if we inspect the resulting bin file. We are leaking the key!

This happens when there are null bytes in the shellcode. XORing a null byte with a key byte returns the key byte unchanged. For example, if you XOR 0x00 with A, the result is still A (0x41).

So why not just skip the null bytes?

That is the right idea, but it’s not quite that simple. Just like how XORing a null byte with A will return A; if the byte being XOR’d is the same as the key, the result will be a null byte.

0x41 ^ 0x00 = 0x41

0x41 ^ 0x41 = 0x00

So by skipping null bytes on the encryption side, you will break the decryption since you are introducing null bytes that were previously unaccounted for.
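A minimal sketch of that failure, assuming a single-byte key and a three-byte input chosen to trigger it. The naiveSkipNull function is hypothetical and exists only to show the flaw.

```zig
const std = @import("std");

// Naive scheme (shown only to illustrate the flaw): leave 0x00 bytes untouched
// and XOR everything else with a single-byte key. Used for both directions.
fn naiveSkipNull(input: []const u8, output: []u8, key: u8) void {
    for (input, 0..) |b, i| {
        output[i] = if (b == 0x00) 0x00 else b ^ key;
    }
}

pub fn main() void {
    const plain = [_]u8{ 0x41, 0x00, 0x90 };
    var enc: [plain.len]u8 = undefined;
    var dec: [plain.len]u8 = undefined;

    naiveSkipNull(&plain, &enc, 'A'); // 0x41 ^ 'A' becomes 0x00
    naiveSkipNull(&enc, &dec, 'A'); // that 0x00 is now skipped, not restored

    std.debug.print("round trip ok: {}\n", .{std.mem.eql(u8, &plain, &dec)});
}
```

The first byte comes back as 0x00 instead of 0x41, because the decryptor cannot tell a null that was skipped from a null that XOR produced.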

What we need to do in the encryption function is make note of the indices where null bytes are found and skip them. This leaves us with an array of indices where the true null bytes are located. The decryption function then needs this array so that it knows which indices to skip.

The encryption function will look like this:

pub fn rotatingKeySkipNullXOR(fileBuffer: []const u8, buffer: []u8) !void {
    const key: []const u8 = "testing";

    const ArrayList = std.ArrayList;
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    const allocator = gpa.allocator();

    var indicies = ArrayList(usize).init(allocator);

    defer indicies.deinit();

    var j: usize = 0;
    for (fileBuffer, 0..fileBuffer.len) |char, i| {
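        if (char == 0x00) {
            // Write the null through unchanged so this slot of the
            // (initially undefined) output buffer is not left unset.
            buffer[i] = 0x00;
            try indicies.append(i);
            continue;
        }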
        if (j == key.len) {
            j = 0;
        }
        buffer[i] = char ^ key[j];
        j += 1;
    }

    std.debug.print("Indicies: {d}\n", .{indicies.items});
    std.debug.print("Indicies length: {d} \n", .{indicies.items.len});

    const file = try std.fs.cwd().createFile("rotatingkey_skip_nullbytes_xor.bin", .{ .read = true });

    defer file.close();

    try file.writeAll(buffer);
}

We allocate memory on the heap and use an ArrayList to store the indices, since an ArrayList is an array that can change its size. The call:

 defer indicies.deinit();

frees the allocated memory once the function completes. As we loop through the shellcode we check for null bytes; when one is found, we append its index to the list and skip to the next iteration. Finally, we print the list of indices and write the shellcode to a file.

It should be noted that, for brevity in the screenshot, I swapped the C2 shellcode for calc64 shellcode generated with msfvenom. The C2 shellcode generates an array with a length over 20,000.

Now for the end result (with C2 shellcode):

The shellcode is XOR encrypted with a rotating key, and we have accounted for null bytes to ensure we are not leaking the key. Excellent.

Now for the decryption function:

pub fn rotatingKeySkipNullDecrypt() !void {
    const fileBuffer: []const u8 = @embedFile("rotatingkey_skip_nullbytes_xor.bin");

    const key = "testing";
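    // Indices printed by the encryptor, plus a trailing sentinel that can never
    // equal i, so current_index[0] stays in bounds after the last real index.
    const indices = [_]usize{ 7, 8, 9, 80, 81, 82, 206, 207, 208, 209, 210, 211, 212, 218, 219, 260, 295, std.math.maxInt(usize) };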

    var dec_buffer: [fileBuffer.len]u8 = undefined;

    var j: usize = 0;
    var current_index: [*]const usize = &indices;

    for (fileBuffer, 0..fileBuffer.len) |char, i| {
        if (current_index[0] == i) {
            // A recorded null: restore it explicitly and move to the next index.
            dec_buffer[i] = 0x00;
            current_index += 1;
            continue;
        }
        if (j == key.len) {
            j = 0;
        }
        dec_buffer[i] = char ^ key[j];
        j += 1;
    }

    const file = try std.fs.cwd().createFile("decrypted.bin", .{ .read = true });

    defer file.close();

    try file.writeAll(&dec_buffer);

}

This is very similar to the encryption function, with a couple of changes. First, we hardcode the array of indices printed by the encryptor. Again, I’m using the indices from calc64 for the sake of brevity. We also create an empty buffer to store the decrypted shellcode. Then, as we loop through the bytes, we check if i is equal to the current index to be skipped. If so, we advance to the next index in the array and skip to the next iteration.
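A quick way to check that the two halves agree is to round-trip a small buffer in memory. The sketch below is not the code from this post. It assumes a four-byte made-up input, stores the null positions in a fixed-size array rather than an ArrayList, and uses hypothetical encrypt and decrypt helpers with a plain counter instead of a pointer.

```zig
const std = @import("std");

const key = "testing";

// Encrypt `input` into `output`, leaving null bytes as-is and recording their
// positions in `nulls`. Returns how many positions were recorded.
fn encrypt(input: []const u8, output: []u8, nulls: []usize) usize {
    var j: usize = 0;
    var n: usize = 0;
    for (input, 0..) |b, i| {
        if (b == 0x00) {
            output[i] = 0x00;
            nulls[n] = i;
            n += 1;
            continue;
        }
        output[i] = b ^ key[j % key.len];
        j += 1;
    }
    return n;
}

// Reverse `encrypt`, skipping exactly the recorded positions.
fn decrypt(input: []const u8, output: []u8, nulls: []const usize) void {
    var j: usize = 0;
    var n: usize = 0;
    for (input, 0..) |b, i| {
        if (n < nulls.len and nulls[n] == i) {
            output[i] = 0x00;
            n += 1;
            continue;
        }
        output[i] = b ^ key[j % key.len];
        j += 1;
    }
}

pub fn main() void {
    // Includes a true null (index 1) and a byte equal to its key byte (index 2).
    const plain = [_]u8{ 0x90, 0x00, 'e', 0xcc };
    var enc: [plain.len]u8 = undefined;
    var dec: [plain.len]u8 = undefined;
    var nulls: [plain.len]usize = undefined;

    const n = encrypt(&plain, &enc, &nulls);
    decrypt(&enc, &dec, nulls[0..n]);

    std.debug.print("round trip ok: {}\n", .{std.mem.eql(u8, &plain, &dec)});
}
```

The byte at index 2 encrypts to 0x00 but is not in the recorded list, so the decryptor still XORs it and recovers the original e.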

I want to take a sec to explain this pointer arithmetic here because it’s really cool.

const indices = [_]usize{ 7, 8, 9, 80, 81, 82, 206, 207, 208, 209, 210, 211, 212, 218, 219, 260, 295 };

var current_index: [*]const usize = &indices;

current_index is a pointer to the first element of the indices array, so current_index[0] is equal to 7. In the for loop we have the following if statement:

if (current_index[0] == i) {
    current_index += 1;
    continue;
}

At first, current_index[0] is 7, the first element in the array. Once i reaches 7, we increment the pointer by 1 so that current_index[0] now refers to the next item in the array.

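By doing it this way, we don’t need a separate counter variable to track our position in the array. We know that current_index[0] will always be the next element we are looking for. The trade-off is that a many-item pointer carries no length, so the array has to end with a value that never matches i, or the pointer will be read past the end once the last index has been consumed.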
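To make the pointer walk concrete on its own, here is a minimal sketch. It assumes a shortened, made-up index list that ends with std.math.maxInt(usize) as a sentinel, and a hypothetical countSkipped helper that only counts matches instead of decrypting anything.

```zig
const std = @import("std");

// Assumption for this sketch: the last entry is a sentinel that can never
// equal a real index, so the pointer always has something valid to read.
const indices = [_]usize{ 1, 3, std.math.maxInt(usize) };

// Count how many positions in 0..len are listed in `indices`.
fn countSkipped(len: usize) usize {
    var current_index: [*]const usize = &indices;
    var skipped: usize = 0;
    for (0..len) |i| {
        if (current_index[0] == i) {
            current_index += 1; // now points at the next index to look for
            skipped += 1;
        }
    }
    return skipped;
}

pub fn main() void {
    std.debug.print("skipped: {d}\n", .{countSkipped(10)});
}
```

Once both real indices have been matched, current_index rests on the sentinel for the remaining iterations and never moves past the end of the array.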

I hope you learned something interesting about Zig and XOR. Zig looks like a pretty cool language and I look forward to creating more malware-focused Zig content in the future.

Full source code is available here: