I continue to believe there are a lot of interesting questions around
building cyber reasoning systems for vuln finding. Even the very basics
seem hard to study and understand, and the eval datasets available
are... sparse or incomplete. For example, what you really want if you're
analyzing git repos is the commit where a bug was introduced and the
commit where it was fixed. But usually you get "a commit where it maybe
existed".
Likewise, it's an open question what kind of tools you can provide an LLM
in this space and how it can best use them.
Anyways, if you're building a cyber reasoning system for finding bugs,
please spam me an email.
-dave