Since the National Security Agency publicly released the Ghidra software reverse engineering (SRE) tool suite, we have been working to integrate Ghidra into our Pharos malware analysis tool. Ghidra provides many useful reverse engineering services including disassembly, function partitioning, decompilation, and various other types of program analyses. As this post details, we have been developing a new suite of tools, known as Kaiju, for malware analysis and reverse engineering to take advantage of Ghidra’s capabilities and interface. Ghidra provides a compelling environment for reverse engineering tools that are relatively easy to use during malware analysis. The tools included with Kaiju give malware analysts many advantages as they are faced with increasingly diverse and complex malware threats.
Ghidra supports running a number of user-developed plug-ins at the same time. We wanted to leverage this feature to integrate several tools we developed for more accurate static analysis of executable code. Having a common graphical user interface (GUI) makes it easier to keep track of the extra features being presented. Tighter integration of the tools also ensures lower-level analyzers get executed first, so gaps in the analysis are filled in and wrong facts are corrected prior to being used by higher-layer plug-ins. A common framework also makes all the plug-ins easier to install, run, and manage.
Kaiju includes Ghidra/Java implementations of many features of the CERT Pharos Binary Analysis Framework, particularly the function hashing and malware analysis tools. In this post, we discuss Kaiju and the tools included that use Ghidra’s analytics. We also describe the Pharos tools that we have ported to work with Ghidra as part of Kaiju. Finally, aside from being a framework for installing and running the CERT Division’s Ghidra plug-ins, Kaiju provides several integrated utilities and services that can support reverse engineering and malware analysis.
New Analysis Tools Included in Kaiju
All the tools in Kaiju use Ghidra for analytical information and services, such as disassembly and function partitioning. Moreover, a few tools discussed below take advantage of Ghidra’s advanced capabilities.
GhiHorn is a tool that uses the Z3 Theorem Prover to reason about reachable paths through a binary. As we noted in a previous blog entry, path finding can be an insightful approach to malware analysis activities. Understanding the conditions necessary to reach a specific point in a program can be valuable for bypassing anti-analysis techniques, recovering meaningful data, and identifying interesting behaviors.
In our Pharos implementation of path finding, we used symbolic values and control flow graphs to generate constraints that help evaluate whether a path is feasible. With Ghidra, we switched to using data generated during decompilation, which has a number of advantages. Our older path finding tools were limited in several respects, including imprecise loop analysis and sensitivity to nuances introduced by compilation and optimization. Ghidra’s decompilation and intermediate representation of program semantics (known as P-code), paired with a specialized encoding known as Constrained Horn Clauses that is geared toward reachability problems, offer a compelling new avenue for path finding.
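To make the idea of Horn-clause reachability concrete, here is a toy sketch in Python. It is not GhiHorn's encoding and does not use Z3 or P-code: it brute-forces a tiny, hypothetical program over a 4-bit input. Each rule reads "if the source location is reachable in some state and the guard holds, then the destination location is reachable in the updated state". All names (`reachable_states`, `RULES`, the location labels) are invented for illustration.

```python
from typing import Callable, Dict, List, Set, Tuple

State = Tuple[int, int]  # (x, y) values of the toy program's two variables
# (source location, destination location, guard, state update)
Rule = Tuple[str, str, Callable[[State], bool], Callable[[State], State]]


def reachable_states(entry: str, initial: Set[State],
                     rules: List[Rule]) -> Dict[str, Set[State]]:
    """Forward-chain the rules until no new (location, state) fact appears."""
    facts: Dict[str, Set[State]] = {entry: set(initial)}
    changed = True
    while changed:
        changed = False
        for src, dst, guard, update in rules:
            # Iterate over a copy because facts may grow during the loop.
            for state in list(facts.get(src, ())):
                if not guard(state):
                    continue
                new_state = update(state)
                if new_state not in facts.setdefault(dst, set()):
                    facts[dst].add(new_state)
                    changed = True
    return facts


# Hypothetical program:
#   if x > 10: y = x - 10; if y == 3: <target>
RULES: List[Rule] = [
    ("entry", "then_branch", lambda s: s[0] > 10, lambda s: (s[0], s[0] - 10)),
    ("entry", "else_branch", lambda s: s[0] <= 10, lambda s: s),
    ("then_branch", "target", lambda s: s[1] == 3, lambda s: s),
]

# x ranges over all 4-bit inputs; y starts at 0.
INITIAL: Set[State] = {(x, 0) for x in range(16)}
FACTS = reachable_states("entry", INITIAL, RULES)

# Any state recorded at "target" is a witness that the target is reachable.
WITNESSES = sorted(FACTS.get("target", set()))
print("target reachable with states:", WITNESSES)
```

A real solver reasons about these clauses symbolically rather than enumerating states, which is what makes the approach practical for full-width machine values and loops.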
Finally, a common challenge with program analysis tools concerns the difficulty in interpreting results. Fortunately, Ghidra’s GUI provides many features with which to display analysis information, such as interactive graphs. GhiHorn uses these features to show results to analysts in an actionable, useful way. GhiHorn includes two utilities to support reverse engineering: PathAnalyzer, which determines the conditions to reach a given point of a program, and ApiAnalyzer, which assesses whether a behavior is present in an executable file based on interactions with the underlying system.
We are planning a dedicated blog post in the future on how GhiHorn works, so stay tuned!
Function Set Extractor and Visualizer
This utility allows an analyst to compare functions across programs in the same Ghidra project using function hashes. Figure 1 shows the Function Set Intersection Visualizer in action. Each row of the table identifies the hashes that we found in each program in a Ghidra project.
Figure 1: Function Set Intersection Visualizer
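The comparison behind such a table can be sketched with plain set operations. The snippet below is only an illustration: the program names and hash strings are placeholders, and the helper names are hypothetical rather than part of Kaiju.

```python
from typing import Dict, List, Set


def function_set_table(program_hashes: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Map each function hash to the sorted list of programs that contain it."""
    table: Dict[str, List[str]] = {}
    for program, hashes in program_hashes.items():
        for function_hash in hashes:
            table.setdefault(function_hash, []).append(program)
    return {h: sorted(programs) for h, programs in sorted(table.items())}


def shared_functions(program_hashes: Dict[str, Set[str]]) -> Set[str]:
    """Return the hashes that appear in every program."""
    hash_sets = list(program_hashes.values())
    return set.intersection(*hash_sets) if hash_sets else set()


# Placeholder data: "h1".."h5" stand in for real function hashes.
PROJECT = {
    "sample_a.exe": {"h1", "h2", "h3"},
    "sample_b.exe": {"h2", "h3", "h4"},
    "sample_c.exe": {"h3", "h5"},
}

TABLE = function_set_table(PROJECT)
SHARED = shared_functions(PROJECT)

for function_hash, programs in TABLE.items():
    print(function_hash, ", ".join(programs))
```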
Function Hash Viewer
This tool is a graphical utility used to view function hashing data for a program in Ghidra. Figure 2 shows the Function Hash Viewer comparing different hashes for a specific function.
Figure 2: Function Hash Viewer
The Fnxrefs tool generates a cross-references table for Ghidra. The table displays addresses in a given program that are referenced by other data and/or code in the program. Figure 3 shows the Fnxrefs user interface. The table is sortable to enable analysts to easily find the most referenced functions and data in a program.
Figure 3: Fnxrefs Table
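As a rough sketch of what a sortable cross-references table computes, the following counts how often each address is referenced and orders the rows by that count. The `(source, target)` pairs are made-up addresses, and `xref_table` is a hypothetical helper, not the Fnxrefs implementation.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def xref_table(references: Iterable[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Return (target address, reference count) rows, most referenced first."""
    counts = Counter(target for _source, target in references)
    # Sort by descending count, then by address so the order is stable.
    return sorted(counts.items(), key=lambda row: (-row[1], row[0]))


# Made-up (source address, referenced address) pairs.
REFS = [
    (0x401000, 0x402000),
    (0x401010, 0x402000),
    (0x401020, 0x403000),
    (0x401030, 0x402000),
    (0x401040, 0x404000),
    (0x401050, 0x403000),
]

ROWS = xref_table(REFS)
for address, count in ROWS:
    print(f"{address:#010x}  {count}")
```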
CERT Pharos Tools in Ghidra
Aside from the new tools in Kaiju, we have ported a number of existing Pharos tools to work with Ghidra. These tools were originally developed as part of the Pharos Binary Analysis Framework based on the ROSE Compiler framework. Essentially, these tools behave the same as they did in Pharos, but now use Ghidra as the analysis engine.
CERT Disassembly Improvements
Kaiju includes a utility to update Ghidra’s disassembly, based on our experiences with reverse engineering malware. This utility automatically processes undefined addresses, gaps in the disassembly, and other nuances that we often see in Ghidra projects. These improvements are implemented as a Ghidra analyzer, which can be run on demand or automatically during initial analysis of the binary. The improvements include better analysis of gaps in the file, corrected alignment issues, and the ability to uncover new code or code that was not found during partitioning. These changes are geared toward handling malware executables that may include obfuscations or code arrangements designed to thwart analysis.
Fn2YARA (Function-to-YARA) is a tool that generates YARA signatures for matching functions in an executable program. Programs that share significant numbers of functions are likely to have behavior in common, and YARA signatures make it easy to search for similar functions.
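The general shape of such a signature can be sketched as follows. This example assumes that bytes holding addresses are replaced with YARA wildcards (`??`) so the rule still matches when the function is located elsewhere; how Fn2YARA actually chooses those bytes is not described here, so the offsets, function bytes, and helper name are all hypothetical.

```python
from typing import Set


def make_yara_rule(rule_name: str, code: bytes, wildcard_offsets: Set[int]) -> str:
    """Build a YARA rule whose hex string matches `code`, ignoring wildcarded bytes."""
    tokens = [
        "??" if offset in wildcard_offsets else f"{byte:02X}"
        for offset, byte in enumerate(code)
    ]
    hex_string = " ".join(tokens)
    return (
        f"rule {rule_name}\n"
        "{\n"
        "    strings:\n"
        f"        $func = {{ {hex_string} }}\n"
        "    condition:\n"
        "        $func\n"
        "}\n"
    )


# Hypothetical x86 function: push ebp; mov ebp, esp; call rel32; pop ebp; ret
FUNC_BYTES = bytes.fromhex("558BECE8100000005DC3")
# Assumed: offsets 4-7 hold the call displacement, which varies between builds.
RULE = make_yara_rule("func_00401000", FUNC_BYTES, {4, 5, 6, 7})
print(RULE)
```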
Fn2Hash (Function-to-Hash) is a tool for generating a variety of hashes and other descriptive properties for functions in an executable program. Like Fn2YARA, it can support binary similarity analysis or provide features for machine learning algorithms.
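To illustrate why more than one hash per function is useful, the sketch below computes two: one over the exact bytes, and one over the bytes with address-dependent positions zeroed out. The choice of MD5, the masking scheme, and the sample bytes are assumptions made for this example, not a description of the hashes Fn2Hash emits.

```python
import hashlib
from typing import Set


def exact_hash(code: bytes) -> str:
    """Hash the function bytes exactly as they appear in the file."""
    return hashlib.md5(code).hexdigest()


def masked_hash(code: bytes, address_offsets: Set[int]) -> str:
    """Hash the function bytes after zeroing bytes that encode addresses."""
    masked = bytes(
        0 if offset in address_offsets else byte
        for offset, byte in enumerate(code)
    )
    return hashlib.md5(masked).hexdigest()


# Two hypothetical copies of one function that differ only in a call displacement.
FUNC_A = bytes.fromhex("558BECE8100000005DC3")
FUNC_B = bytes.fromhex("558BECE8A40200005DC3")
CALL_OPERAND = {4, 5, 6, 7}  # assumed offsets of the call displacement bytes

print("exact hashes equal: ", exact_hash(FUNC_A) == exact_hash(FUNC_B))
print("masked hashes equal:", masked_hash(FUNC_A, CALL_OPERAND) == masked_hash(FUNC_B, CALL_OPERAND))
```

The exact hashes differ because the displacement bytes differ, while the masked hashes agree, which is the property that makes a masked hash useful for finding the same function across relocated or relinked binaries.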
OOAnalyzer JSON Importer
OOAnalyzer (Object-Oriented-Analyzer) is one of our most advanced and best-maintained plug-ins. The Pharos OOAnalyzer tool recovers C++-style classes from executables by generating and solving constraints with Prolog. The OOAnalyzer tool produces a JSON file with information on recovered C++ classes. The OOAnalyzer JSON importer for Ghidra can import this JSON file into the Ghidra interface. This data is used to enhance the type information that is shown by Ghidra. We are now packaging the OOAnalyzer JSON importer for Ghidra as part of Kaiju rather than as part of Pharos, where it was distributed in the past.
Ghidra Developer Utilities
Aside from new tools, Kaiju also includes a preliminary set of common code utilities that developers can use to create new reverse engineering tools in Ghidra. Notably, we have implemented a unified logging framework that simplifies logging code when Ghidra is run in either regular graphical interface mode or in command-line headless mode. By appropriately extending or implementing our Java utility classes and interfaces, analysts too can customize Kaiju for the reverse engineering task at hand. We welcome contributions and encourage feature requests and bug fixes in the form of GitHub pull requests to the public repository.
Kaiju Now Available on GitHub
Kaiju remains a work in progress, and we are continually updating it and its constituent tools. The source code and build instructions for Kaiju are available on GitHub. We welcome suggestions for improvements or new utilities that would be most useful for building new tools to support malware analysis and reverse engineering. In a future post we plan to describe the inner workings of some of our more extensive Ghidra-based tools.