Luai - An AI Malware Agent
On-the-fly code generation and execution

AI has been the hottest talk of the town in recent years, and it seems like every industry is looking to integrate it into their business in some way, shape, or form. In the security industry, lots of AI pentesting tools have started to emerge which offload some of the work previously done by a pentester to an LLM. This is nice in some regards; I would be happy to offload report writing to an LLM, which would leave me more time for hacking. I believe that AI will be a great enhancement for both Red and Blue teams in the coming years as LLMs continue to mature and adoption increases.
As a penetration tester, my primary focus is on the offensive, red team side of things. I wanted to create a malware agent that actively uses an LLM to accomplish its objective.
I created Luai as a proof of concept to explore this idea. The Luai agent is written in Rust and makes API calls to OpenAI, using their o1 model to generate Lua code to accomplish the task issued by the attacker. The generated Lua code gets immediately executed and the result is returned to the attacker server. The benefits of this are:
- The binary is lightweight and clean since it contains no nefarious Lua code.
- Lua code is generated on the fly by the LLM based on the command sent by the attacker.
- The generated Lua code is executed by Rust and is very difficult for AV / EDR to detect. (More on this below.)

This concept of Rust and Lua is a continuation of previous research I did on embedding Lua into Rust. You can read that full blog post here. But the TL;DR is that Lua is an embeddable scripting language that has a powerful Foreign Function Interface (FFI) available via LuaJIT. The FFI library allows you to call external C functions and use C data structures from pure Lua. So we can essentially call any WinAPI function from Lua.
What’s great about this is we can then embed the Lua code into Rust and execute it. Rust has a fantastic crate for embedding Lua called mlua that supports LuaJIT. This makes implementation very straightforward, and we don’t need to mess with including any external DLLs or manually calling Lua functions and managing return values.
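For a sense of how little ceremony this involves, here's a minimal sketch of embedding and running Lua from Rust with mlua. The crate version and feature flags below are illustrative, not taken from the Luai source:

```rust
// Cargo.toml (illustrative): mlua = { version = "0.9", features = ["luajit", "vendored"] }
use mlua::Lua;

fn main() -> mlua::Result<()> {
    // Spin up a LuaJIT VM inside the Rust process; no external DLLs
    // or manual Lua C API calls required.
    let lua = Lua::new();

    // Run a Lua chunk and read a typed value straight back into Rust.
    let sum: i64 = lua.load("return 40 + 2").eval()?;
    println!("Lua says: {sum}");
    Ok(())
}
```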
Embedded Lua is also fantastic for AV / EDR evasion. Whenever you want to use a C function in Lua, you define and call it like so:
```lua
local ffi = require("ffi")

ffi.cdef[[
typedef void* HWND;
typedef const char* LPCSTR;
int MessageBoxA(HWND hWnd, LPCSTR lpText, LPCSTR lpCaption, int uType);
]]

ffi.C.MessageBoxA(nil, "Hello from Lua", "Hello World", 0)
```
The LuaJIT virtual machine is responsible for executing the code. First, it needs to look up the address of MessageBoxA. This should sound familiar to malware devs out there when trying to hide function calls: we define the function signature, then use GetProcAddress / GetModuleHandle or some sort of manual implementation to look up the function address at runtime. Except with LuaJIT, the VM handles all of this for us; and as you would expect, MessageBoxA will not appear in the PE’s Import Address Table.
The Lua VM also utilizes a virtual stack that is used to pass data between Rust and Lua. When embedding Lua into other languages such as C, you must manually push and pop values to the virtual stack. However, with the mlua crate for Rust, most of this is automatically handled for us.
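To illustrate (a simplified sketch, not code from Luai), here's what passing values across that stack looks like with mlua. Notice there isn't a single manual push or pop in sight:

```rust
use mlua::{Function, Lua};

fn main() -> mlua::Result<()> {
    let lua = Lua::new();

    // Hand a Rust value to Lua: mlua pushes it onto the virtual stack
    // for us, with no manual lua_pushstring / lua_setglobal pair.
    lua.globals().set("target", "demo.local")?;

    // Define a Lua function, then call it from Rust. The argument and
    // return value cross the virtual stack automatically.
    lua.load(r#"function greet(name) return "hello " .. name .. " on " .. target end"#)
        .exec()?;
    let greet: Function = lua.globals().get("greet")?;
    let out: String = greet.call("operator")?;
    println!("{out}"); // "hello operator on demo.local"
    Ok(())
}
```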
All this abstraction within the Lua VM makes it challenging for AV and EDR to effectively monitor and detect what the code is actually doing.
The Luai project has two components: Luai, which is the AI malware agent, and luai_web, which is the web interface running on a remote server where the attacker can issue commands to the agent.
The agent will contact the server at random intervals for tasking. Once it receives a task, it will make a query to the LLM and get back a Chat Completion response containing the generated Lua necessary to complete the task it was given. The LLM’s system prompt instructs it to always return the result in a string variable so that it can be popped off Lua’s virtual stack and accessed in Rust.

Duration is the sleep duration in seconds, Instruction is the task received from the server, and the generated Lua script is printed to the console for verbosity.
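Since the screenshot doesn't reproduce well here, the following is a rough sketch of that check-in loop. fetch_task and generate_lua are hypothetical stand-ins for the HTTP call to luai_web and the OpenAI Chat Completion request; they are not the actual Luai implementation:

```rust
use std::{thread, time::Duration};

// Placeholder: would poll the luai_web server for a pending command.
fn fetch_task() -> Option<String> {
    Some("list running processes".to_string())
}

// Placeholder: would send the instruction to the LLM and pull the
// generated Lua script out of the Chat Completion response.
fn generate_lua(instruction: &str) -> String {
    format!("result = 'completed: {instruction}'")
}

fn main() {
    loop {
        // The real agent randomizes this interval; fixed here for brevity.
        let duration = Duration::from_secs(30);
        println!("Duration: {}s", duration.as_secs());
        thread::sleep(duration);

        if let Some(instruction) = fetch_task() {
            println!("Instruction: {instruction}");
            let script = generate_lua(&instruction);
            println!("{script}"); // generated Lua printed for verbosity
            // ...execute the script and return the result (see below)
        }
    }
}
```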
Once the LLM has generated the Lua script, it is directly executed within Rust like so:
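(The original post shows this step as a screenshot; below is a minimal reconstruction with mlua, assuming, per the system prompt described above, that the generated script leaves its output in a global string named result.)

```rust
use mlua::Lua;

fn execute_generated(script: &str) -> mlua::Result<String> {
    let lua = Lua::new();

    // Run the LLM-generated Lua. Any WinAPI activity happens inside
    // the LuaJIT VM via ffi, not through the Rust binary's imports.
    lua.load(script).exec()?;

    // The system prompt forces the script to store its output in a
    // string variable, which mlua pops off the virtual stack for us.
    lua.globals().get("result")
}

fn main() -> mlua::Result<()> {
    let output = execute_generated("result = 'task complete'")?;
    println!("{output}");
    Ok(())
}
```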


Instead of creating Luai as a true AI Agent, where it can generate the Lua script, execute it, reflect on the results through the thought / action / observation lifecycle, and make changes accordingly, I opted to use AI only to generate the Lua script. This was done for a couple of reasons.
First and foremost, it prevents data from the compromised machine from being sent to the LLM. If the AI Agent were to run the Lua script, it would have to review the results to see if it accomplished the task or not, and to decide if it needs to take further action. It’s a different story if you are using a local LLM hosted on your own infrastructure. But if you are using OpenAI or any other service provider, it’s best to not send that information back to the LLM.
Second, AI Agent frameworks for Rust and other compiled languages are lacking. I know you don’t need an AI Agent framework, but it definitely simplifies the process.
One of the issues I discovered during development and testing was that AI is only mediocre at generating Lua, particularly Lua that calls WinAPIs. I think this is due to the type definitions between C and Lua, which can be very finicky.
For example, in Lua you have the type void*, and in the WinAPI you have PVOID. Now you would think you could just use void* anywhere that PVOID is required, but no. You must explicitly create the type in the Lua code or else your code will probably not run:
```lua
ffi.cdef[[
typedef void* PVOID;
...
]]
```
AI is not great at figuring this out, so I provided a “starter set” of type definitions in the system prompt. It definitely helps, but it’s not bulletproof.
This results in the generated Lua being somewhat “hit or miss” depending on the task it’s given, i.e., it will throw an error when Rust tries to execute it. Since it’s not an AI Agent and the LLM cannot reflect on the error and refine the script, I created a workaround: a function that recursively calls itself X number of times until the script runs successfully or the limit is reached. It’s a big improvement, and the LLM can usually generate a working script during one of the attempts. But again, it’s not bulletproof.
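In sketch form (again illustrative, with regenerate_script as a hypothetical stand-in for the repeat LLM query), the workaround looks something like this:

```rust
use mlua::Lua;

// Hypothetical stand-in for re-sending the same instruction to the LLM.
fn regenerate_script(instruction: &str) -> String {
    format!("result = 'attempt for: {instruction}'")
}

// Recursively retry until the script executes cleanly or the
// attempt limit is reached.
fn run_with_retries(lua: &Lua, instruction: &str, attempts_left: u32) -> Option<String> {
    if attempts_left == 0 {
        return None; // limit reached; give up on this task
    }
    let script = regenerate_script(instruction);
    match lua.load(script.as_str()).exec() {
        Ok(()) => lua.globals().get("result").ok(),
        // On a Lua error, recurse with a freshly generated script.
        Err(_) => run_with_retries(lua, instruction, attempts_left - 1),
    }
}

fn main() {
    let lua = Lua::new();
    if let Some(output) = run_with_retries(&lua, "whoami", 3) {
        println!("{output}");
    }
}
```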

The other issue I encountered is OpenAI’s low threshold for “malicious requests,” where the LLM will refuse a command. With Grok, on the other hand, I encountered no refusals and the LLM was more than happy to comply; however, the quality of the generated Lua code was lacking compared to OpenAI’s o1 model, which I found to be the best despite the occasional refusals.
Rust and Lua is a powerful combination, and I have created payloads leveraging this dangerous duo that I use in engagements with great success. Introducing AI into the mix and using an LLM to generate the Lua code only adds to its potency. Despite some of the pitfalls, a good system prompt can go a long way, and I think as AI continues to mature, its capabilities will only get better.
Thanks for reading!
Projects:
Luai_web: https://github.com/djackreuter/luai_web