Lately all I've been doing is data science, but I've been trying to keep up with some of the cool work happening in the cybers as well. One project I think is especially cool is the Joern Ghidra2CPG project.

https://twitter.com/fabsx00/status/1466302205019971586?s=20
https://joern.io/blog/joern-supports-binary/

The theory is that you can run a binary through the Ghidra decompiler and get a code property graph out of it, which they store in a special-purpose in-memory graph database (that should probably be ported to A REAL GRAPH DATABASE). Then you can write queries in Scala (ugh) against that DB to find bugs.
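
To make the "query the graph for bugs" idea concrete, here is a toy sketch in Python. It is not Joern's API or schema: the node IDs, the FLOWS table, and the source/sink name lists are all made up for illustration. It just shows the shape of a source-to-sink query, which is a reachability search from "user input" calls to "dangerous" calls over data-flow edges.

```python
from collections import deque

# Hypothetical toy graph: each node is a call site, each edge means
# "data produced here may flow into there". Not Joern's schema.
NODES = {
    1: "getenv",
    2: "strlen",
    3: "strcpy",
    4: "printf",
    5: "recv",
    6: "system",
}
FLOWS = {1: [2, 3], 2: [4], 5: [4]}

# Assumed name lists, chosen only for this example.
SOURCES = {"getenv", "recv"}
SINKS = {"strcpy", "system"}


def flows_into_sinks(nodes, flows, sources, sinks):
    """Return sorted (source_id, sink_id) pairs where the sink is reachable."""
    hits = []
    for start, name in nodes.items():
        if name not in sources:
            continue
        # Breadth-first search along data-flow edges from this source.
        seen, queue = {start}, deque([start])
        while queue:
            cur = queue.popleft()
            for nxt in flows.get(cur, []):
                if nxt in seen:
                    continue
                seen.add(nxt)
                if nodes[nxt] in sinks:
                    hits.append((start, nxt))
                queue.append(nxt)
    return sorted(hits)


findings = flows_into_sinks(NODES, FLOWS, SOURCES, SINKS)
for src, dst in findings:
    print(f"{NODES[src]} (node {src}) reaches {NODES[dst]} (node {dst})")
```

A real code property graph carries far more than this (AST, control flow, types), but the query pattern is the same: pick sources, pick sinks, ask the graph what connects them.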

One example is here:
https://github.com/joernio/query-database/blob/main/src/main/scala/io/joern/scanners/ghidra/UserInputIntoDangerousFunctions.scala

Has anyone else tried using this?

It'd be cool if, instead of doing source-to-sink queries, you could run clustering, missing-link analysis, and MLlib against the graph database. Also, a real graph DB might be able to scale better... but regardless, this is the kind of cool project I hoped to see when Ghidra first came out!
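
As a rough illustration of what I mean by missing-link analysis, here is a minimal sketch in plain Python. Everything in it is an assumption for the sake of the example: the CALLS table is invented, the function names are hypothetical, and Jaccard similarity with a 0.7 threshold is just one simple way to do link prediction. The idea is that if two functions call almost the same things, an edge one has and the other lacks (say, a length check) is worth a look.

```python
from itertools import combinations

# Hypothetical call data: function name -> set of callees.
CALLS = {
    "parse_a": {"malloc", "memcpy", "check_len", "free"},
    "parse_b": {"malloc", "memcpy", "check_len", "free"},
    "parse_c": {"malloc", "memcpy", "free"},
    "log_msg": {"printf"},
}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def missing_links(calls, threshold=0.7):
    """Suggest (function, callee) edges that a similar function has."""
    suggestions = set()
    for f, g in combinations(sorted(calls), 2):
        if jaccard(calls[f], calls[g]) < threshold:
            continue
        # Similar enough: whatever one calls and the other does not
        # is a candidate "missing" edge.
        for callee in calls[g] - calls[f]:
            suggestions.add((f, callee))
        for callee in calls[f] - calls[g]:
            suggestions.add((g, callee))
    return sorted(suggestions)


candidates = missing_links(CALLS)
for func, callee in candidates:
    print(f"{func} looks like its neighbors but never calls {callee}")
```

On this toy data the only candidate is parse_c missing a call to check_len, which is exactly the kind of thing a source-to-sink query would not surface on its own.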

This paper seems quite relevant as well: https://twitter.com/0xadr1an/status/1466518964029169672?s=20
The Convergence of Source Code and Binary Vulnerability Discovery – A Case Study
https://www.s3.eurecom.fr/docs/asiaccs22_mantovani.pdf


-dave