As Lua slowly but surely becomes the world’s foremost embedded scripting language for game engines and production software alike, protecting the innards of the language is becoming increasingly important as well. Leaving the language fully unprotected opens up a vast number of weaknesses, ranging from mere modifications to the software’s logic (such as changing the interface’s look and feel) to cracking the program’s license validation algorithms. In the video gaming world, abusing poor Lua security leads to cheating and changing the game’s rules, which is evidently a big no-no in today’s popular MMOs.
The fact of the matter is that Lua itself is insecure by design. Compare Lua to, say, C or C++. The latter are languages that usually compile to a native binary format that most engineers find difficult to reverse and that, under the right conditions, leaves no debugging information or readable information about the code whatsoever. Lua, on the other hand, compiles to its own bytecode format, which includes:
- The name of each and every variable used in the script.
- An entire package of debug information necessary for debugging and error messaging, which can reveal bits of the script’s own source code in certain cases.
- A list of the constants used in the script.
Combine the above with Lua’s incredibly simple instruction set and you get a language that is easy to read for machines and human beings alike. Fact is, Lua 5.1’s virtual machine is powered by an instruction set of only 38 instructions, leaving practically no room for complex optimization. Due to that simplicity, it’s very easy to automatically convert a set of Lua instructions back into something close to its original source code. Compared to the complexity and optimizability of x86, Lua’s instruction set lacks a great many features. This doesn’t make Lua any less powerful, though; it just means that Lua’s insides are very simple and thus easy to learn for beginners new to programming (or, well, beginners new to looking at how a language runtime works).
While its simplicity is a good thing for hobbyists, it’s certainly not for the commercial scene. If you wish to build commercial software in Lua, then unless you modify the core of Lua itself to include some additional security, you can kiss your digital rights management goodbye. Let’s list a number of reasons why the commercial scene despises Lua for programming commercial software:
- Lua is incredibly simple to reverse. If you want to protect your application from pirates, you must make pirates unable to interface with your Lua runtime in any way, which is impossible to do out of the box as Lua does not offer any API protection. In fact, poor program design combined with Lua could lead to your application’s source code being revealed to a competent reverse engineer.
- Lua is a runtime language. While it is certainly possible to build your entire software with Lua (especially with implementations of Lua such as LuaJIT), it cannot compile to a binary format by itself. You’ll need to build a bootstrapper for your Lua source code or bytecode if you wish to write software in the language. This alone can create a number of security holes provided your bootstrapper is not secure.
- Lua functions are objects, and thus hard to protect. Say that it is in your interest to redirect Function A to your own Function B. In natively compiled programs, subroutines are stored in memory as raw machine instructions, which makes it complex for malicious software to overwrite a function wholesale.
- Most of the time, reverse engineers will write code that will hook the function instead of overwriting it. That is, they will place a jump instruction at the very beginning of the function (detour) or somewhere within the function (mid-level hook) that will jump to the reverser’s function, acting as a “replacement” even though the entirety of the function wasn’t overwritten.
- Of course, that’s not the only way to hook functions. For example, if the program makes extensive use of C++ virtual classes with a compiler such as MSVC, then a reverse engineer can hook individual methods within what is called the virtual method table in order to redirect method calls. That’s one example among many others.
- In Lua’s case, functions are stored as movable objects. In other languages, as I said above, you usually need to hook a function to redirect its flow. However, since every Lua value is overwritable, you can simply run code such as A = B (or use Lua’s own C API, with functions such as lua_setglobal or lua_setfield) to overwrite values and, consequently, entire functions. If the program stores a function in a variable named “ABC”, then all the reverse engineer needs to do to replace it is execute a script that sets ABC to a function of their own, thus overriding the original and changing the software’s behavior.
- Lua includes its compiler in the runtime. Even if you make your software incredibly secure, some programmer may just invoke Lua’s own vulnerable API to run his own unsigned Lua code. After all, all it needs to run that code is a call to loadstring and pcall.
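To make the “functions are just overwritable values” point concrete, here is a minimal C sketch that models a globals table as a list of named function slots. It is only a toy stand-in for how Lua looks globals up by name; set_global, call_global, check_license and always_valid are hypothetical names invented for this illustration, not part of Lua or of any real product.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*script_fn)(int);

/* Toy stand-in for a globals table: a name mapped to a function value. */
typedef struct {
    const char *name;
    script_fn value;
} global_slot;

static global_slot globals[4];
static size_t global_count = 0;

/* Equivalent in spirit to the Lua statement `name = fn`. */
void set_global(const char *name, script_fn fn) {
    for (size_t i = 0; i < global_count; i++) {
        if (strcmp(globals[i].name, name) == 0) {
            globals[i].value = fn; /* plain overwrite, no code patching needed */
            return;
        }
    }
    assert(global_count < sizeof globals / sizeof globals[0]);
    globals[global_count].name = name;
    globals[global_count].value = fn;
    global_count++;
}

/* The function is looked up by name at call time, so the latest value wins. */
int call_global(const char *name, int arg) {
    for (size_t i = 0; i < global_count; i++) {
        if (strcmp(globals[i].name, name) == 0) {
            return globals[i].value(arg);
        }
    }
    return -1; /* name not found */
}

/* The program's original function and an attacker's replacement (both hypothetical). */
int check_license(int key) { return key == 1234; }
int always_valid(int key) { (void)key; return 1; }
```

Calling set_global("ABC", check_license) followed by set_global("ABC", always_valid) is all it takes to change what call_global("ABC", ...) does: no jump instruction is written and no memory page permission is touched, which is exactly why this kind of replacement is so cheap in Lua.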
Considering that Lua does not offer its own protection, it is up to individuals to implement such security. Thankfully, Lua’s relatively simple internals make it easy for a developer to secure the runtime. However, Lua’s own API will always be an attack vector unless you remove it. It offers a complete interface to the language, which means that even if you were to, say, remove the compiler from the runtime, an attacker could still use the internal API to achieve the same results as a Lua script. And even then, removing the compiler in itself does not mean nobody can execute Lua. After all, anyone can simply download the Lua source code and compile their own bytecode before feeding it to the client’s runtime, rendering the removal useless.
Truly securing Lua: implementing an obfuscated instruction set
If a company offering commercial software truly wishes to secure its Lua runtime, then the only foreseeable and secure way to keep those pesky hackers on their toes is to change the Lua instruction set and internal structures. Changing both of those elements fundamentally changes the internals of the language, to the point where the original runtime cannot interface correctly with it. And guess what? That’s what imagination company Roblox did to secure its platform.
Roblox’s anti-cheat solution is severely underrated. In usermode, it’s completely capable of detecting memory editing, page permission changes, virtual method table hooks, binary injection, illegal software (such as Cheat Engine), foreign VEH handlers, foreign SEH handlers, a variety of usermode+kernelmode debuggers and much more. You would classically expect those features from a kernelmode anti-cheat solution such as EAC or BattlEye, but Roblox managed to do it all from usermode. In addition to all this, they severely modified their Lua runtime. In fact, they replaced the entire bytecode format with their own and changed the instruction set to a set of randomized values, which is obfuscated at compile time and deobfuscated at runtime.
Just like Counter-Strike: Global Offensive, Fortnite, PUBG and a variety of other online games, Roblox has its own cheating scene. Now, unlike those aforementioned games, knowledge about Roblox’s internals and its anti-cheat is very scarce. This is partly because most of the cheats for the game were written by a couple of young, organized programmers who keep their sacred knowledge to themselves, mainly for commercial purposes. I can personally admit that reversing Roblox and producing cheats for it is very hard, which can lead to people not wanting to publicly release their research (especially if they want to profit from it). Still, it incredibly sucks for newcomers who wish to reverse Roblox and develop cheats for it. In fact, those veteran developers are not very welcoming either…
Putting anecdotal experience aside – the point is that Roblox did a lot to secure their platform. Today, I’m going to demonstrate what Roblox did to their Lua runtime, and how other commercial entities wishing to secure their own can reproduce such changes to ensure their own implementation of Lua is protected and partially safe from hackers.
The compilation scheme, bytecode format and instruction set of Roblox’s Lua
In untouched, fresh-out-of-the-tar-gz Lua, the compiler is included. Usually, in order to run Lua scripts, you need to compile them using the API and then execute them, which is done through functions such as luaL_loadstring and lua_pcall. Roblox, not wanting hackers to run their own unsigned Lua scripts on their platform, stripped the runtime of its compiler and moved it to the server.
You see, Roblox is a platform that allows individuals to build their own games in Lua using an application called Roblox Studio. In order to play those games, you need to go on their website, choose a game from the game list and press the big “Play” button. Doing so launches the Roblox client, which connects to a remote server (called the RCCService, “Roblox Compute Cloud Service”) and allows the player to play the game. It is during this connection that the scripts are compiled on the server and then sent to the client in a bytecode format, which is then deserialized and run whenever the client requests it. The bytecode format in question differs greatly from the original Lua format, not only in layout but also in content: certain things are excluded, which reduces the overall size of the bytecode while also making it harder to reverse.
The bytecode format, along with the instruction set, is heavily obfuscated and encrypted. Between a number of compression algorithms and weird bitwise operations, the bytecode format and the custom instruction set are incredibly difficult to understand (I mean, they didn’t add or remove any instructions, but they did change the behavior of some while moving others around and hiding them behind bitwise obfuscation). It practically takes a degree in computer science to fully understand the deobfuscation algorithms for the instruction set… and for those hackers I mentioned earlier, it did.
The only exception to the above is code that MUST be on the client at all times, and mustn’t change. In Roblox, CoreScripts are trusted Lua scripts that operate vital user interfaces, and those scripts, while being in the encrypted format, are not downloaded from the server. Instead, they are embedded directly into the client and executed at runtime.
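As a rough illustration of what instruction-set obfuscation can look like, here is a minimal C sketch. It does not reproduce Roblox’s actual scheme, which this article does not detail. It only assumes the stock Lua 5.1 encoding (a 32-bit instruction with the opcode in the low 6 bits), renumbers the 38 opcodes with an affine permutation, and XORs the operand bits with a key. The constants OPCODE_MULT, OPCODE_ADD and OPERAND_KEY, as well as every function name, are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Stock Lua 5.1: 32-bit instructions, opcode stored in the low 6 bits. */
#define OPCODE_BITS 6u
#define OPCODE_MASK ((1u << OPCODE_BITS) - 1u)
#define NUM_OPCODES 38u

/* Hypothetical per-build secrets, chosen only for this illustration. */
#define OPCODE_MULT 7u        /* must be coprime with NUM_OPCODES */
#define OPCODE_ADD  11u
#define OPERAND_KEY 0x2A5C91u /* must fit in the 26 operand bits */

/* Compiler side: map a stock opcode number to its shuffled number. */
static uint32_t shuffle_opcode(uint32_t op) {
    assert(op < NUM_OPCODES);
    return (op * OPCODE_MULT + OPCODE_ADD) % NUM_OPCODES;
}

/* Runtime side: invert the shuffle with a linear search over all opcodes. */
static uint32_t unshuffle_opcode(uint32_t shuffled) {
    assert(shuffled < NUM_OPCODES);
    for (uint32_t op = 0; op < NUM_OPCODES; op++) {
        if (shuffle_opcode(op) == shuffled) {
            return op;
        }
    }
    return NUM_OPCODES; /* unreachable: the affine map is a permutation */
}

/* Obfuscate one instruction before it is serialized into the bytecode. */
uint32_t encode_instruction(uint32_t insn) {
    uint32_t op = insn & OPCODE_MASK;
    uint32_t operands = insn >> OPCODE_BITS;
    return ((operands ^ OPERAND_KEY) << OPCODE_BITS) | shuffle_opcode(op);
}

/* Recover the stock instruction right before the VM dispatches it. */
uint32_t decode_instruction(uint32_t encoded) {
    uint32_t op = unshuffle_opcode(encoded & OPCODE_MASK);
    uint32_t operands = (encoded >> OPCODE_BITS) ^ OPERAND_KEY;
    return (operands << OPCODE_BITS) | op;
}
```

A stock disassembler fed the output of encode_instruction would see wrong opcodes and wrong operands, while a runtime that knows the constants gets the original instruction back through decode_instruction. A real scheme would presumably go further (per-opcode operand encodings, changing the secrets with every release), but the principle stays the same.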
Changes to the Lua internal structures
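When you compile a structure with MSVC (or any C/C++ compiler, for that matter), the layout of the structure in memory usually matches the order of the fields in the source code, alignment padding aside. Take a structure such as this one:
struct example {
    int an_integer;
    double a_double;
    const char* a_c_string;
};
In a 32-bit build, its fields end up in memory in that same order:
DD an_integer ?
DQ a_double ?
DD a_c_string ?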
When you compile the Lua runtime, it will compile code based on the structure definitions of its own code… obviously. Leaving those structure definitions as they are would open additional security weaknesses, considering reverse engineers can simply take a peek at the Lua source code and learn how the program’s Lua data is structured internally. Roblox dismisses this vulnerability by simply changing the structures’ fields around, requiring hackers and reverse engineers to figure out how data is structured the Roblox-way, which is a pain and time-consuming.
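The effect of moving fields around can be sketched in a few lines of C. The two structures below are illustrative only: they reuse the field names from the example above rather than any real Lua structure, and stock_layout, shuffled_layout and read_int_at are hypothetical names. The point is that a tool which reads memory using offsets taken from the public source stops working once the build uses a different field order.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Field order as published in the (hypothetical) stock source code. */
struct stock_layout {
    int an_integer;
    double a_double;
    const char *a_c_string;
};

/* Same fields, declared in a different order for the protected build. */
struct shuffled_layout {
    double a_double;
    const char *a_c_string;
    int an_integer;
};

/* Read an int at a raw byte offset, the way an external tool would
   when it only knows a structure by its offsets. */
int read_int_at(const void *base, size_t offset) {
    int out;
    assert(base != NULL);
    memcpy(&out, (const unsigned char *)base + offset, sizeof out);
    return out;
}
```

Given an instance such as struct shuffled_layout s = { 3.5, "hello", 42 }, reading with offsetof(struct shuffled_layout, an_integer) returns 42, whereas reading with offsetof(struct stock_layout, an_integer), which is 0, returns the first bytes of the double instead. The reverse engineer has to rediscover every offset by hand.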
In addition to all of this, Roblox also obfuscates pointers using simple addition/subtraction arithmetic. This is obviously not difficult to figure out, but can be nonetheless surprising to those reversing Roblox’s platform.
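Here is a minimal sketch of that kind of pointer obfuscation in C. The exact arithmetic Roblox uses is not given here, so the scheme below is an assumption: the stored value is the real address plus the address of the slot that holds it. The names hidden_ptr, hidden_ptr_set and hidden_ptr_get are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A slot that stores an obfuscated value instead of the raw address. */
typedef struct {
    uintptr_t masked;
} hidden_ptr;

/* Assumed scheme: stored value = real address + address of the slot itself. */
void hidden_ptr_set(hidden_ptr *slot, void *target) {
    assert(slot != NULL);
    slot->masked = (uintptr_t)target + (uintptr_t)slot;
}

/* Undo the addition to recover the real address. */
void *hidden_ptr_get(const hidden_ptr *slot) {
    assert(slot != NULL);
    return (void *)(slot->masked - (uintptr_t)slot);
}
```

Someone inspecting the slot in a memory viewer sees a value that points nowhere useful until they work out the arithmetic. Because the key in this sketch is the slot’s own address, copying a hidden_ptr by value to another location yields a different result when decoded, so such a slot must be re-encoded whenever it moves.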
Summing it all up
Roblox does much more to secure their platform than what I wrote on this page, but this should cover a good amount of their Lua-related security. If you want a to-do list to protect your Lua runtime for commercial applications, it would look something like this:
- Strip the compiler from the runtime linked to your application, forcing it to accept only precompiled bytecode of your own format.
- Reimplement the Lua instruction set to ensure incompatibility with the original implementation, adding in bitwise obfuscation and other algorithms making it difficult to reverse the instruction set.
- Change the structures’ members around within their struct definition, making their layout incompatible with original Lua code. Additionally, obfuscate pointers to confuse reverse engineers.
This, of course, requires an additional level of automation, both for compiling your code and for obtaining it (if you are following Roblox’s server-to-client bytecode transfer model). Nevertheless, I believe it’s worth the effort if you want to properly secure your Lua runtime! It worked for Roblox, so it can surely work for you in this situation.