Lately all I've been doing is data science, but I've been trying to keep up
with some of the cool work happening in the cybers as well. One project I
think is especially cool is the Joern Ghidra2CPG project.
The idea is that you run a binary through the Ghidra decompiler and turn the
result into a code property graph, which they store in a special-purpose
in-memory graph database (that should probably be ported to A REAL GRAPH
DATABASE). Then you can write queries in Scala (ugh) against that DB to
find bugs.
One example is here:
Has anyone else tried using this?
It'd be cool, instead of only doing source-to-sink queries, to run
clustering, missing-link analysis, and MLlib against the graph database.
Also, a real graph DB might be able to scale better... but regardless, this
is the kind of cool project I hoped to see when Ghidra first came out!
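
To make the source-to-sink vs. missing-link idea a bit more concrete, here's a
rough sketch in plain Python. It is not Joern's API or its query language. The
call graph, the function names in it, and the helpers reachable() and
missing_links() are all made up for illustration, and Jaccard overlap of callee
sets is just one simple stand-in for a real link-prediction method.

```python
from collections import deque
from itertools import combinations

# Hypothetical toy call graph (caller -> callees). In practice this would be
# exported from the code property graph; every name here is invented.
CALLS = {
    "main": ["read_input", "parse_header"],
    "read_input": ["recv"],
    "parse_header": ["copy_field"],
    "parse_body": ["copy_field", "check_len"],
    "copy_field": ["memcpy"],
    "recv": [],
    "memcpy": [],
    "check_len": [],
}


def reachable(graph, source, sink):
    """Source-to-sink style check: breadth-first search over call edges."""
    seen, queue = {source}, deque([source])
    while queue:
        node = queue.popleft()
        if node == sink:
            return True
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def missing_links(graph, threshold=0.5):
    """Missing-link style check: for callers whose callee sets overlap
    (Jaccard score >= threshold), report callees one has and the other lacks."""
    found = []
    callers = sorted(n for n, callees in graph.items() if callees)
    for a, b in combinations(callers, 2):
        sa, sb = set(graph[a]), set(graph[b])
        score = len(sa & sb) / len(sa | sb)
        if score >= threshold:
            found += [(a, callee, score) for callee in sorted(sb - sa)]
            found += [(b, callee, score) for callee in sorted(sa - sb)]
    return found


if __name__ == "__main__":
    print(reachable(CALLS, "main", "memcpy"))
    for caller, callee, score in missing_links(CALLS):
        print(f"{caller} might be missing a call to {callee} (overlap {score:.2f})")
```

On this toy graph the reachability check only tells you that main can reach
memcpy, while the overlap check flags that parse_header looks like parse_body
but never calls check_len, which is the kind of thing a fixed source/sink list
would miss.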
This paper also seems quite relevant:
The Convergence of Source Code and Binary Vulnerability Discovery – A Case