woodruffw 3 days ago

Capstone supports an impressive breadth of architectures. However, if all you need is x86/AMD64 decoding and disassembly, there are libraries of much higher quality (in terms of decoding accuracy) out there.

I wrote a differential fuzzer for x86 decoders a few years ago, and XED and Zydis generally performed far better (in terms of accuracy) than Capstone[1]. And on the Rust side, yaxpeax and iced-x86 perform very admirably.
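
For readers unfamiliar with the idea, here is a minimal sketch of differential decoding. It is not the fuzzer from the linked post. It assumes each decoder is wrapped in a callable that returns `(length, text)` or `None` on rejection. The names `diff_decoders`, `toy_a`, and `toy_b` are hypothetical, and the toy decoders stand in for real bindings (Capstone, XED, Zydis, and so on), whose APIs are not shown here.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

# A decoder maps raw bytes to (length, text), or None if it rejects the input.
Decoded = Optional[Tuple[int, str]]
Decoder = Callable[[bytes], Decoded]


def diff_decoders(
    decoders: Dict[str, Decoder], inputs: Iterable[bytes]
) -> List[Tuple[bytes, Dict[str, Optional[int]]]]:
    """Return the inputs on which decoders disagree about validity or length."""
    disagreements = []
    for raw in inputs:
        # Compare only validity and decoded length. Mnemonic text is formatted
        # differently by every library and would only produce noise.
        lengths = {}
        for name, decode in decoders.items():
            result = decode(raw)
            lengths[name] = None if result is None else result[0]
        if len(set(lengths.values())) > 1:
            disagreements.append((raw, lengths))
    return disagreements


def toy_a(raw: bytes) -> Decoded:
    # Hypothetical decoder: knows NOP (0x90) and RET (0xC3).
    if raw[:1] == b"\x90":
        return (1, "nop")
    if raw[:1] == b"\xc3":
        return (1, "ret")
    return None


def toy_b(raw: bytes) -> Decoded:
    # Hypothetical decoder with a deliberate bug: it rejects RET.
    if raw[:1] == b"\x90":
        return (1, "nop")
    return None
```

Calling `diff_decoders({"a": toy_a, "b": toy_b}, [b"\x90", b"\xc3", b"\xff"])` flags only `b"\xc3"`, because the other two inputs are either accepted with the same length or rejected by both. A real harness would feed in mutated or random byte strings and triage the flagged cases by hand, since a disagreement only says that at least one decoder is wrong, not which one.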

[1]: https://blog.trailofbits.com/2019/10/31/destroying-x86_64-in...

  • monads 3 days ago

    In my previous job, I worked on a project that required disassembling large amounts of x86/amd64 instructions (several billion instructions per run was very common). I also found that Zydis is much faster than Capstone.

  • meisel 3 days ago

    How is there any discrepancy in accuracy? Isn’t it just a matter of following the spec?

    • woodruffw 3 days ago

      The spec is very large, not particularly well written, and is not “total” (in the sense that AMD64 and IA32e and other x86-64 flavors are all subtly different). There are a lot of ways to get it wrong; even XED (the reference decoder from Intel) has bugs.

      If I remember correctly, the Intel SDM alone is over 3000 pages long.

    • saagarjha 3 days ago

      lol, no. For one, Capstone has a lot of bugs (it uses some old version of LLVM as its base). But the whole question of how to decode things is complicated, because there are a lot of pitfalls and inconsistencies that different disassemblers handle differently. And what the hardware does is a different question entirely: it may not match the spec, or even other processors with the same ISA.

      • xvilka 3 days ago

        It was just updated to nearly the latest LLVM, so that argument is void: https://github.com/capstone-engine/capstone/blob/next/docs/c...

        • saagarjha 3 days ago

          I'll believe it when I see it. If I can go a few years without wasting time during a CTF because of an incorrect decode I'll change my tune.

          • woodruffw 3 days ago

            This has been my experience as well. I’ve had to rip Capstone out of more research projects than I care to admit.

  • canucker2016 3 days ago

    Did you mean x86/x64 decoding?

    Looking at the libs, none of them seem to mention ARM64 inst. decoding.

    • woodruffw 3 days ago

      Yep, I meant AMD64, fixed.

jstrieb 3 days ago

Capstone is very useful!

Someone (not me) has also cross-compiled Capstone to WebAssembly so it can be used in client-side browser applications.

https://alexaltea.github.io/capstone.js/

I've used this in a couple of projects to support disassembly in static web apps with no back end.

__alexander 3 days ago

If you find Capstone interesting, check out the Unicorn Engine.

https://github.com/unicorn-engine/unicorn

Also, if anyone is interested in an example of using capstone for basic disassembly and analysis, here is a link to my capstool project.

https://github.com/alexander-hanel/capstool

  • emmanueloga_ 3 days ago

    Right, three related multi-platform and multi-architecture frameworks from the same people:

    * Capstone: disassembly.

    * Keystone: assembler.

    * Unicorn: CPU emulator.

  • the_biot 3 days ago

    Unicorn is fantastic. I used it to emulate an SoC's boot environment to get around a very weird HAL, and it worked perfectly. Awesome tool!

nicolodev 3 days ago

Another good replacement for capstone/keystone based on LLVM is nyxstone https://github.com/emproof-com/nyxstone

  • xvilka 3 days ago

    It's just a wrapper around LLVM. So any project would also be forced to ship the corresponding LLVM version if it's not present on the system, e.g. for Windows or embedded applications. A bit too much for a simple disassembler. So it's not a direct replacement for Capstone.

  • saagarjha 3 days ago

    That's basically what Capstone is? Except not vendoring its own LLVM.

    • xvilka 3 days ago

      Capstone doesn't vendor LLVM either. It just contains some pieces of the LLVM-ish infrastructure that were converted from C++ to pure C and are pretty lean, without any external dependency.

  • ashvardanian 3 days ago

    It looks pretty promising! How would you compare the strengths/weaknesses?

    • stuxnot 2 days ago

      Full disclosure, I'm one of the nyxstone developers - so I might be biased.

      In comparison to capstone, nyxstone lacks the features of instruction decomposition and providing read/written registers. In addition, nyxstone directly interfaces with LLVM and thus is expected to be a lot slower than capstone, which uses instruction tables generated by a modified LLVM.

      I want to note here that Nyxstone is intended more as a replacement for Keystone than Capstone. We added the disassembler mainly because we could. Compared to Keystone, nyxstone allows precise definition of target triple and ISA extensions, allows definition of external labels, supports structured output with instruction details (address, bytes, assembly), rejects partial and invalid inputs and rejects instructions not supported by the specific core (for example UMAAL is supported by Cortex-M4, but not by Cortex-M3), and is more up to date. Nyxstone does not require patches in the LLVM source tree, and thus is (I'd argue) more maintainable and easier to keep up to date.