14 July, 2016

Introducing AsmTK

AsmTK - A Toolkit Based on AsmJit

The AsmJit library provides both low-level and high-level JIT functionality that allows applications to generate code at run-time. The library was designed from scratch to be efficient and highly dynamic. Efficiency is achieved by having a single (dispatch) function that can encode all supported instructions without jumping to other helper functions. This function is pretty big, but I have always tried to keep it organized and consistent. Dynamism is achieved through a structure called Operand, the base class for any operand that the assembler can use; every Operand has the same size (16 bytes) regardless of its type and content.
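
To make the fixed-size idea concrete, here is a minimal sketch of a 16-byte tagged operand. The field names and layout (OpKind, MemPart, makeReg and friends) are my own assumptions for illustration and are not AsmJit's real Operand definition; the point is only that registers, memory references and immediates can share one size, so they can sit in the same array and be passed around uniformly.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout for illustration only; NOT AsmJit's actual Operand.
enum class OpKind : uint8_t { None, Reg, Mem, Imm };

// Extra data carried by a memory operand (assumed fields).
struct MemPart {
  uint32_t indexId;  // index register id
  int32_t disp;      // displacement
};

struct Operand {
  OpKind kind;       // which kind of operand is stored
  uint8_t size;      // operand size in bytes
  uint16_t reserved; // padding, kept explicit
  uint32_t id;       // register id, or base register id for memory
  union {
    int64_t imm;     // immediate value
    MemPart mem;     // index + displacement of a memory operand
  };
};

// Every operand occupies 16 bytes, whatever it holds.
static_assert(sizeof(Operand) == 16, "Operand must stay 16 bytes");

Operand makeReg(uint32_t id, uint8_t size) {
  assert(size > 0);
  Operand o{};
  o.kind = OpKind::Reg;
  o.size = size;
  o.id = id;
  return o;
}

Operand makeImm(int64_t value) {
  Operand o{};
  o.kind = OpKind::Imm;
  o.size = 8;
  o.imm = value;
  return o;
}

Operand makeMem(uint32_t baseId, uint32_t indexId, int32_t disp) {
  Operand o{};
  o.kind = OpKind::Mem;
  o.id = baseId;
  o.mem.indexId = indexId;
  o.mem.disp = disp;
  return o;
}
```

With such a layout, a parser can fill an `Operand ops[3]` array without knowing in advance which kinds of operands an instruction will take.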

The dynamic nature of AsmJit is what makes it much more powerful than other JIT assemblers out there. It's also the feature that makes it possible to have X86Compiler as a part of AsmJit without a significant increase in library size, and to create tools that use AsmJit as a base library to generate and process assembly at run-time. One missing feature that I have been frequently asked about is assembling code from a string. This is now provided by the AsmTK library!

AsmParser

AsmTK's AsmParser exploits what AsmJit offers: it parses the input string and constructs instruction operands on-the-fly, then passes the instruction and its operands to the instruction validator, and finally to the assembler itself. AsmTK supports all instructions provided by AsmJit because it uses the AsmJit API for instruction name-to-id conversion and strict validation.
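
The parse, look up, validate flow described above can be sketched in a few lines of standalone C++. Everything here is hypothetical (ParsedInst, parseLine, instIdByName, validate and the four-entry table are made up for this example) and none of it is the AsmTK or AsmJit API. The validation step only checks the operand count, whereas the real validator also checks operand types.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical names throughout; this is not the AsmTK or AsmJit API.
struct ParsedInst {
  std::string mnemonic;
  std::vector<std::string> operands;
};

// Strip leading and trailing whitespace.
std::string trim(const std::string& s) {
  size_t b = 0, e = s.size();
  while (b < e && std::isspace(static_cast<unsigned char>(s[b]))) b++;
  while (e > b && std::isspace(static_cast<unsigned char>(s[e - 1]))) e--;
  return s.substr(b, e - b);
}

// Step 1: split "mnemonic op0, op1, ..." into mnemonic and operand strings.
ParsedInst parseLine(const std::string& line) {
  ParsedInst inst;
  std::string s = trim(line);
  size_t sp = 0;
  while (sp < s.size() && !std::isspace(static_cast<unsigned char>(s[sp]))) sp++;
  inst.mnemonic = s.substr(0, sp);

  std::string rest = trim(s.substr(sp));
  size_t start = 0;
  while (!rest.empty() && start <= rest.size()) {
    size_t comma = rest.find(',', start);
    if (comma == std::string::npos) comma = rest.size();
    inst.operands.push_back(trim(rest.substr(start, comma - start)));
    start = comma + 1;
  }
  return inst;
}

// A tiny stand-in for an instruction database: names and operand counts.
const char* const kNames[] = { "mov", "cmp", "movdqa", "vandpd" };
const size_t kOperandCount[] = { 2, 2, 2, 3 };
const unsigned kInstCount = 4;

// Step 2: instruction name to id conversion.
std::optional<unsigned> instIdByName(const std::string& name) {
  for (unsigned i = 0; i < kInstCount; i++)
    if (name == kNames[i]) return i;
  return std::nullopt;
}

// Step 3: validation (operand count only in this sketch).
bool validate(const ParsedInst& inst) {
  std::optional<unsigned> id = instIdByName(inst.mnemonic);
  if (!id) return false;
  assert(*id < kInstCount);
  if (inst.operands.size() != kOperandCount[*id]) return false;
  for (const std::string& op : inst.operands)
    if (op.empty()) return false;
  return true;
}
```

Only an instruction that passes `validate` would be handed on to an encoder; an unknown mnemonic or a wrong operand count is rejected before any bytes are emitted.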

Here is the output of a sample application that I wrote in less than 15 minutes. It's basically an on-the-fly X86/X64 instruction encoder based on AsmTK and AsmJit: you enter an instruction, and it tries to encode it and prints its binary representation:

=========================================================
AsmTK-Test-Cmd - Architecture = x64 (use --x86 and --x64)
---------------------------------------------------------
Usage:
  1. Enter instruction and its operands to be encoded.
  2. Enter empty string to exit.
=========================================================
mov eax, ebx
8BC3
mov rax, rbx
488BC3
mov r15, rax
4C8BF8
cmp ah, al
3AE0
vandpd ymm0, ymm10, ymm13
C4C12D54C5
movdqa xmm0, [rax + rcx * 8 + 16]
660F6F44C810
movdqa rax, xmm0
ERROR: 0x0000000B (Illegal instruction)
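
The first three encodings in the session above can be checked by hand. The sketch below is a standalone illustration, not AsmJit code: encodeMovRR and toHex are hypothetical helpers that build a register-to-register mov using opcode 8B (mov r, r/m), an optional REX prefix, and a ModRM byte. Register ids follow the usual x86 numbering (rax = 0, rbx = 3, r15 = 15).

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Encode `mov dst, src` for 32-bit or 64-bit general purpose registers.
// Hypothetical helper for illustration; ids are 0..15 (rax=0, rbx=3, r15=15).
std::vector<uint8_t> encodeMovRR(unsigned dst, unsigned src, bool is64) {
  assert(dst < 16 && src < 16);
  std::vector<uint8_t> out;

  // REX = 0100WRXB: W selects 64-bit operand size, R extends the ModRM.reg
  // field (dst here), B extends the ModRM.rm field (src here).
  uint8_t rex = static_cast<uint8_t>(0x40 | (is64 ? 0x08 : 0x00) |
                                     ((dst >> 3) << 2) | (src >> 3));
  if (rex != 0x40) out.push_back(rex);  // omit the prefix when it adds nothing

  out.push_back(0x8B);                  // opcode: mov r, r/m

  // ModRM: mod=11 (register direct), reg=dst low 3 bits, rm=src low 3 bits.
  out.push_back(static_cast<uint8_t>(0xC0 | ((dst & 7) << 3) | (src & 7)));
  return out;
}

// Format bytes as an uppercase hex string, as in the tool's output.
std::string toHex(const std::vector<uint8_t>& bytes) {
  static const char kDigits[] = "0123456789ABCDEF";
  std::string s;
  for (uint8_t b : bytes) {
    s += kDigits[b >> 4];
    s += kDigits[b & 15];
  }
  return s;
}
```

For example, `toHex(encodeMovRR(0, 3, false))` gives `8BC3`, `toHex(encodeMovRR(0, 3, true))` gives `488BC3`, and `toHex(encodeMovRR(15, 0, true))` gives `4C8BF8`, matching the three mov lines above.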

The tool can be used to quickly verify that an instruction encodes correctly and to check whether the encoding is optimal (for example, whether AsmJit encodes it in the shortest way possible). I have already made two fixes in AsmJit to use a shorter encoding of the [mov gpq, u32 imm] and [and gpq, u32 imm] instructions.
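
As an illustration of why a shorter encoding exists for [mov gpq, u32 imm]: on x64, writing a 32-bit register zero-extends the result into the full 64-bit register, so an immediate that fits in an unsigned 32-bit value can be loaded with the 32-bit form of mov. The sketch below picks the shortest of the three standard forms. It is my own standalone example (encodeMovR64Imm and emitLE are hypothetical names) and is not a copy of what AsmJit does internally.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Append `count` bytes of `value` in little-endian order.
void emitLE(std::vector<uint8_t>& out, uint64_t value, int count) {
  for (int i = 0; i < count; i++)
    out.push_back(static_cast<uint8_t>(value >> (8 * i)));
}

// Pick the shortest encoding of `mov r64, imm`. Hypothetical helper for
// illustration; register ids are 0..15 (rax=0, r15=15).
std::vector<uint8_t> encodeMovR64Imm(unsigned reg, uint64_t imm) {
  assert(reg < 16);
  std::vector<uint8_t> out;
  uint8_t rexB = static_cast<uint8_t>(reg >> 3);  // 1 for r8..r15
  uint8_t low3 = static_cast<uint8_t>(reg & 7);

  if (imm <= 0xFFFFFFFFull) {
    // mov r32, imm32: the upper 32 bits are zeroed implicitly (5 or 6 bytes).
    if (rexB) out.push_back(0x41);
    out.push_back(static_cast<uint8_t>(0xB8 | low3));
    emitLE(out, imm, 4);
  } else if (imm >= 0xFFFFFFFF80000000ull) {
    // mov r64, imm32 sign-extended: REX.W C7 /0 (7 bytes).
    out.push_back(static_cast<uint8_t>(0x48 | rexB));
    out.push_back(0xC7);
    out.push_back(static_cast<uint8_t>(0xC0 | low3));
    emitLE(out, imm, 4);
  } else {
    // mov r64, imm64: REX.W B8+r (10 bytes).
    out.push_back(static_cast<uint8_t>(0x48 | rexB));
    out.push_back(static_cast<uint8_t>(0xB8 | low3));
    emitLE(out, imm, 8);
  }
  return out;
}
```

Under these assumptions `mov rax, 1` takes 5 bytes (B8 01 00 00 00) instead of 7 or 10. The same zero-extension argument applies to [and gpq, u32 imm]: the upper 32 bits of the immediate are zero, so the upper half of the result is zero either way and the 32-bit form gives the same result.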

Conclusion

The AsmTK library is a fresh piece of software that currently contains fewer than 1000 lines of code. It relies heavily on AsmJit and uses its new instruction validation API. It also serves as a demonstration of AsmJit capabilities that are not obvious from the AsmJit documentation.

2 comments:

  1. When will it be available on GitHub?
    and good work btw

  2. Thanks! I'm finalizing this feature, so it's a matter of a few days at most. The most interesting part in AsmJit is already done and it's working really well.
