Is it possible to decode x86-64 instructions in reverse?

Question

An x86 instruction stream is not self-synchronizing, so it can only be decoded unambiguously in the forward direction, and only from a known-valid start point. The last byte of an immediate can be 0x90, which on its own decodes as a nop. More generally, a 4-byte immediate or displacement can contain byte sequences that are themselves valid instructions, and ModRM/SIB bytes can look like opcodes.
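To make the ambiguity concrete, here is a minimal Python sketch. The `decode` function is a hypothetical toy decoder that only understands nop, ret, and mov r32, imm32; it is not a real disassembler. It decodes the same six bytes from two different start offsets, and both give a clean-looking listing.

```python
import struct

REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def decode(code, start):
    """Toy decoder (assumption: only nop, ret and mov r32, imm32 exist)."""
    out, off = [], start
    while off < len(code):
        op = code[off]
        if op == 0x90:
            out.append(f"{off}: nop")
            off += 1
        elif op == 0xC3:
            out.append(f"{off}: ret")
            off += 1
        elif 0xB8 <= op <= 0xBF and off + 5 <= len(code):
            # Opcode B8+r is followed by a little-endian 32-bit immediate.
            (imm,) = struct.unpack_from("<I", code, off + 1)
            out.append(f"{off}: mov {REGS[op - 0xB8]}, {imm:#x}")
            off += 5
        else:
            out.append(f"{off}: (bad)")
            off += 1
    return out

# mov eax, 0x90909090 ; ret
code = bytes.fromhex("b8 90 90 90 90 c3")

for start in (0, 1):
    print(f"decoding from offset {start}:")
    for line in decode(code, start):
        print("   ", line)
```

Starting at offset 0 yields the mov followed by ret. Starting at offset 1 yields four nops followed by ret. Nothing in the bytes themselves says which listing is the intended one.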

If you decode forward from a wrong start point in code that isn’t intentionally obfuscated, you often get back into sync with the “correct” instruction boundaries after a few instructions. So you might treat an instruction boundary you already trust as a known-good point, and check that decoding forward from each candidate start address behind it lands exactly on that known-good boundary.
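The candidate check described above could be sketched roughly like this in Python. Everything here is illustrative: `insn_length` is a hypothetical toy length decoder covering a handful of one-byte opcodes without ModRM, and a real tool would call a full x86-64 decoder in its place. The 15-byte window reflects the architectural maximum instruction length.

```python
def insn_length(code, off):
    """Toy length decoder (assumption: tiny opcode subset, no prefixes/ModRM)."""
    if off >= len(code):
        return None
    op = code[off]
    if op in (0x90, 0xC3, 0xCC) or 0x50 <= op <= 0x5F:
        n = 1   # nop, ret, int3, push/pop r64
    elif op in (0x6A, 0xEB):
        n = 2   # push imm8, jmp rel8
    elif op in (0x68, 0xE8, 0xE9) or 0xB8 <= op <= 0xBF:
        n = 5   # push imm32, call/jmp rel32, mov r32, imm32
    else:
        return None  # unknown to this toy table
    return n if off + n <= len(code) else None

def decode_until(code, start, stop):
    """Decode forward from start; return the boundaries only if we land on stop."""
    bounds, off = [], start
    while off < stop:
        n = insn_length(code, off)
        if n is None:
            return None
        bounds.append(off)
        off += n
    return bounds if off == stop else None

def backward_candidates(code, known_good, max_back=15):
    """Map each start offset that re-syncs exactly at known_good to its boundaries."""
    found = {}
    for start in range(max(0, known_good - max_back), known_good):
        bounds = decode_until(code, start, known_good)
        if bounds is not None:
            found[start] = bounds
    return found

# push 5 ; mov eax, 0x90909090 ; ret   (the ret at offset 7 is the known-good point)
code = bytes.fromhex("6a 05 b8 90 90 90 90 c3")
found = backward_candidates(code, 7)
for start, bounds in found.items():
    print(start, bounds)
```

In this example several candidates pass the check, including ones that start inside the immediate, so agreeing with the known-good point is a necessary condition rather than proof that a candidate is the real start.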

IDK if you could get more clever about this, e.g. by finding more known-good points as you go backwards, which any further candidates would also have to agree with.

Be sure to highlight backwards-decoded instructions for the user in red or gray or something, so they know it’s not guaranteed reliable.

Another alternative is to require function symbols (extern functions, or any function with debug info).

GDB doesn’t let you scroll upward (in layout reg mode) unless you’re inside a function whose start address it knows. In that case, I guess it decodes from the function start address, so it knows the instruction boundaries by the time it reaches the part that fits in the window.
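As an illustration of the function-start idea (not GDB's actual implementation, which isn't shown here), the following Python sketch decodes forward once from a known function start, records every instruction boundary, and then answers "what is the previous instruction?" by looking it up. The `LENGTHS` table is a hypothetical toy stand-in for a real instruction-length decoder.

```python
from bisect import bisect_left

# Toy length table (assumption): opcode -> total instruction length.
# push rbp, pop rbp, nop, ret, push imm8, mov eax imm32, call rel32
LENGTHS = {0x55: 1, 0x5D: 1, 0x90: 1, 0xC3: 1, 0x6A: 2, 0xB8: 5, 0xE8: 5}

def boundaries(code, func_start, func_end):
    """Decode forward from the known function start and list instruction starts."""
    bounds, off = [], func_start
    while off < func_end:
        n = LENGTHS.get(code[off])
        if n is None:
            break  # unknown to the toy table: stop rather than guess
        bounds.append(off)
        off += n
    return bounds

def previous_insn(bounds, addr):
    """Start of the last instruction beginning before addr, or None if there is none."""
    i = bisect_left(bounds, addr)
    return bounds[i - 1] if i > 0 else None

# push rbp ; mov eax, 0x90909090 ; pop rbp ; ret
code = bytes.fromhex("55 b8 90 90 90 90 5d c3")
bounds = boundaries(code, 0, len(code))
print(bounds)
print(previous_insn(bounds, 6))
```

Scrolling up one instruction from the pop at offset 6 goes to the mov at offset 1, skipping the 0x90 bytes inside its immediate, because the boundaries came from a forward decode anchored at the function start.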

If you want to go backwards outside of that, you have to use disas 0x12345, +16 with an earlier address you pick yourself, so decoding starts from there. Then you can scroll down, but if you get the instruction boundary wrong you get garbage.
