*Logical Operations on Knowledge Graphs:*
So if you've spent enough time looking at graph databases, you will
invariably run into people who are obsessed with "ontologies". The basic
theory is that if you've already organized your data in some sort of
directed graph, maybe you can then apply a logical ruleset to those
relationships and infer knowledge about things you didn't already know,
which could be useful? There are a lot of people trying to do this with
SBOMs and I...wish them the best.
In real life I think it's basically impossible to create a large
graph database+inference engine with data clean enough to be useful for
anything. Also the word *ontology* is itself very annoying.
But philosophically, while any complex enough data set will have to embrace
paradoxes, you can get a lot of value out of imposing some higher-level
structure on the text in your data.
And this is where modern AI comes in - specifically, the branch of "Natural
Language Understanding" that broke off from Chomsky-esque-and-wrong
"Natural Language Processing" some time ago.
One article covering this topic is here
<https://medium.com/@anthony.mensier/gpt-4-for-defense-specific-named-entity…>,
which combines entity extraction and classification to find military
topics in an article.
But these techniques can be abstracted and broadened as a general purpose
and very useful algorithm: Essentially you want to extract keywords from
text fields within your graph data, then relate those keywords to each
other, which gives you conceptual groupings and allows you to make further
queries that uncover insights about those groups.
*Our Solution:*
One of the team-members over at Margin Research working on SocialCyber
<https://www.darpa.mil/program/hybrid-ai-to-protect-integrity-of-open-source…>
with me, Matt Filbert, came up with the idea of using OpenAI's GPT to get
hashtags from text, which it does very very well. If you store these as
nodes you get something like the picture below (note that hashtags are
BASED on the text, but they are NOT keywords and may not be in the text
itself):
[image: image.png]
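The post doesn't show the extraction code, but the parsing half is simple enough to sketch. Assuming you've prompted the model with something like "summarize this text as 5-10 hashtags" and gotten back a string of tags, a minimal (hypothetical) Python helper to turn that into node keys might look like this:

```python
import re

def parse_hashtags(model_output: str) -> list[str]:
    """Pull #Hashtag tokens out of an LLM completion.

    The prompt wording and this helper are my own invention for
    illustration, not Margin Research's actual code.
    """
    # Hashtags are BASED on the text, not quoted from it, so we just
    # take whatever tokens the model emitted.
    tags = re.findall(r"#\w+", model_output)
    # De-duplicate while preserving order, normalizing case so each
    # tag maps to one graph node.
    seen, result = set(), []
    for tag in tags:
        key = tag.lower()
        if key not in seen:
            seen.add(key)
            result.append(tag)
    return result

completion = "#OpenSource #GraphDatabases #NLP #opensource"
print(parse_hashtags(completion))  # -> ['#OpenSource', '#GraphDatabases', '#NLP']
```

Each returned tag then becomes (or merges into) a Topic node, with an edge back to the Thing whose text produced it.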
Next you want to figure out how topics are related to each other! Which you
can do in a thousand different ways - the code words to search on are "Node
Similarity" - but almost all those ways will either not work or create bad
results because you have a very limited directed graph of just
"Things->Topics".
In the end we used a modified Jaccardian algo (aka, you are similar if you
have shared parents), which I call Daccardian because it creates weighted
directed graphs (which comes in handy later):
[image: image.png]
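The exact formula isn't given in the post, so here's a minimal sketch of one plausible reading: weight the edge from topic A to topic B by the fraction of A's parents (the things tagged with it) that B shares. Dividing by only one side's parent count, rather than the union as plain Jaccard does, is what makes the result directed. The weighting choice is my assumption, not necessarily the actual implementation:

```python
def daccardian(parents: dict[str, set[str]]) -> dict[tuple[str, str], float]:
    """Directed, weighted topic similarity from shared parents.

    `parents` maps each topic to the set of nodes (e.g. repos) tagged
    with it. Assumed formula: weight(a -> b) = |P(a) & P(b)| / |P(a)|,
    the fraction of a's parents that b shares. Because the denominator
    depends on which side you start from, weight(a -> b) and
    weight(b -> a) generally differ, yielding a directed graph.
    """
    edges = {}
    topics = list(parents)
    for a in topics:
        for b in topics:
            if a == b or not parents[a]:
                continue
            shared = len(parents[a] & parents[b])
            if shared:
                edges[(a, b)] = shared / len(parents[a])
    return edges

repo_topics = {
    "#UI": {"repo1", "repo2", "repo3"},
    "#Frontend": {"repo2", "repo3"},
    "#Crypto": {"repo4"},
}
edges = daccardian(repo_topics)
# '#Frontend' shares ALL of its parents with '#UI', so its outgoing
# edge is stronger than the one coming back:
print(edges[("#Frontend", "#UI")])  # 1.0
print(edges[("#UI", "#Frontend")])  # 0.666...
```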
So once you've done that, you get this kind of directed graph:
[image: image.png]
From here you could build communities of related topics using any community
detection algorithm, but even just being able to query against them is
extremely useful. In theory you could query against just one topic at a
time, but because of the way your data is probably structured, you want
both that topic and any closely related topics to be included.
So, for example, Repos with topics that either are "#UI" or are closely
related to it can be queried like this (not all topics are shown because
of the LIMIT clause):
[image: image.png]
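The query itself lives in the screenshot, but the same idea is easy to express in plain Python against the Daccardian output, an edge map of the shape {(topic_a, topic_b): weight}. The 0.5 weight cutoff below is arbitrary, purely for illustration:

```python
def repos_near_topic(topic, edges, topic_repos, min_weight=0.5):
    """Repos tagged with `topic` or with any closely related topic.

    `edges` maps (topic_a, topic_b) -> similarity weight (the shape
    the Daccardian step produces); `topic_repos` maps each topic to
    the repos tagged with it.
    """
    # Start with the topic itself, then pull in close neighbors.
    related = {topic}
    for (a, b), w in edges.items():
        if a == topic and w >= min_weight:
            related.add(b)
    # Union the repos tagged with any topic in the expanded set.
    repos = set()
    for t in related:
        repos |= topic_repos.get(t, set())
    return related, repos

edges = {("#UI", "#Frontend"): 0.8, ("#UI", "#Crypto"): 0.1}
topic_repos = {
    "#UI": {"repo1", "repo2"},
    "#Frontend": {"repo2", "repo3"},
    "#Crypto": {"repo4"},
}
related, repos = repos_near_topic("#UI", edges, topic_repos)
print(sorted(related))  # ['#Frontend', '#UI']
print(sorted(repos))    # ['repo1', 'repo2', 'repo3']
```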
*Some notes on AI Models:*
OpenAI's APIs are a bit slow, and often throw errors randomly, which is fun
to handle. And of course, when doing massive amounts of inference, it's
probably cheaper to run your own equipment, which will leave you casting
about on Huggingface like a glow worm in a dark cave for a model that can
do this and is open source. I've tried basically all of them and they've
all started out promising but then been a lot worse than ChatGPT 3.5.
You do want one that is multilingual, and Bard might be an option when they
open their API access up. There's a significant difference between the
results from the big models and the little ones, in contrast to the paper
that just "leaked" from Google
<https://www.semianalysis.com/p/google-we-have-no-moat-and-neither> about
how small tuned models are going to be just as good as bigger models (which
they are not!).
One minor exception is the new Mosaic model (
https://huggingface.co/spaces/mosaicml/mpt-7b-instruct) which is
multilingual and four times cheaper than OpenAI but it's also about 1/4th
as good. It may be the first "semi-workable" open model though, which is a
promising sign and it may be worth moving to this if you have data you
can't run through an open API for some reason.
*Conclusion:*
If you have a big pile of structured data, you almost certainly have
natural language data that you want to capture as PART of that structure,
but this would have been literally impossible six months ago before LLMs.
Using this AI-tagging technique and doing some basic graph algorithms can
really open up the power of your queries to take the most valuable part of
your information into account, while not losing the performance and
scalability of having it in a database in the first place.
Thanks for reading,
Dave Aitel
PS: I have a dream, and that dream is to convert Ghidra into a Graph
Database as a native format so we can use some of these techniques (and
code embeddings) as a native feature. If you sit next to the Ghidra team,
and you read this whole post, give them a poke for me. :)
So my first thought is that performance measurement tools seem exactly
aimed at a lot of security problems but performance people are extremely
reluctant <https://aus.social/@brendangregg/110276319669838295> to admit
that because of the drama involved in the security market. Which is very
smart of them! :)
Secondly, I wanted to re-link to Halvar's QCon keynote
<https://docs.google.com/presentation/d/1wOT5kOWkQybVTHzB7uLXpU39ctYzXpOs2xV…>.
He has a section on the difficulties of getting good performance
benchmarks, which typically you would do as part of your build chain. So in
theory, you have a lot of compilation features you can twiddle when
compiling and you want to change those values, compile your program, and
get a number for how fast it is. But this turns out to basically be
impossible in the real world for reasons I'll let him explain in his
presentation (see below).
[image: image.png]
A lot of these problems with performance seem only solvable by a continuous
process of evolutionary algorithms - where you have a population of
different compilation variables, and you probably introduce new ones over
time, and you kill off the cloud VMs where you're getting terrible
performance under real-world situations and let the ones getting good or
average performance thrive.
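As a back-of-the-napkin sketch of that selection loop - flag names, the mutation scheme, and the toy fitness function are all invented for illustration; a real system would measure wall-clock performance under live traffic:

```python
import random

def evolve(population, fitness, generations=20, cull=0.5, seed=0):
    """Evolutionary search over sets of compilation flags.

    Each generation: rank the population by measured performance, kill
    off the worst performers (the "terrible VMs"), and refill the pool
    by mutating random survivors.
    """
    rng = random.Random(seed)
    size = len(population)
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        survivors = population[: max(1, int(size * (1 - cull)))]
        population = list(survivors)
        while len(population) < size:
            # Mutate a random survivor by flipping one boolean flag.
            child = dict(rng.choice(survivors))
            flag = rng.choice(sorted(child))
            child[flag] = not child[flag]
            population.append(child)
    return max(population, key=fitness)

# Toy fitness: pretend -O3 and vectorization help, LTO hurts this workload.
def toy_fitness(flags):
    return flags["O3"] * 2 + flags["vectorize"] - flags["lto"]

pop = [{"O3": False, "vectorize": False, "lto": True} for _ in range(8)]
best = evolve(pop, toy_fitness)
print(best)  # should drift toward O3/vectorize on, lto off
```

Because the sort always keeps the best performers as survivors, the top fitness in the pool can only go up over time, which is the whole appeal of running this continuously in production.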
I'm sure this is being done, and probably if I listened to more of Dino
Dai Zovi's talks I'd know where and how, but aside from having performance
implications, it also has security implications because it will tend
towards offering offensive implants value for becoming less parasitic and
more symbiotic.
-dave
The Computer Science department at Louisiana State University (LSU) is
currently hiring for many faculty positions related to applied cyber
security. Courses taught inside this department include reverse
engineering, malware analysis, binary exploitation, memory forensics
and other intensive courses related to incident response and offensive
security.
Ideal candidates will have significant experience with deeply
technical areas of cybersecurity. LSU was recently granted the CAE-CO
designation and is one of only 21 schools nationwide to hold it; it is
the most technical designation granted by the NSA and DHS. The
department also runs a large SFS program for cybersecurity students.
If you are interested in one of these positions, then please see the
following link. I also ask my industry contacts to please spread the
word within academic communities that you have access to:
https://lsu.wd1.myworkdayjobs.com/en-US/LSU/job/3325-Patrick-F-Taylor-Hall/…
The cybersecurity effort at LSU has strong support from the highest
levels of the school and is rapidly expanding – so now is the perfect
time to join.
PS: I am not employed by LSU, but do work very closely with the CS
department to ensure the courses are relevant to industry and rigorous
enough for students to leave with real-world, hands-on experience. If
you have questions related to the position, then please direct them to
Dr. Golden Richard at LSU: https://www.cct.lsu.edu/~golden/
Thanks,
Andrew
Call For Papers 2023
Tired of your bosses suspecting that conference trips to exotic locations are just a ploy to partake in Security Vacation Club? Prove them wrong by coming to Helsinki, Finland on May 4-5, 2023! Guaranteed lack of sunburn, good potential for rain or slush. In case of great spring weather, though, no money back.
CFP and registration both open. Read further if still unsure.
Maui, Miami, Las Vegas, Tel Aviv or Wellington feel so much sunnier once you’ve experienced the lack of infinity pools in Northern Europe. Instead of pools and palm trees, we can offer you actual saunas and a high tech environment, which is a weird combination of demoscene, widespread Linux adoption, mobile Internet with uncapped flat rate data and a long history of IRC and imageboards.
What defines a conference? For t2 it has always been that intimate welcoming atmosphere of a small event, which makes both audience and speakers approachable. There are enough regulars to create the feeling of a community, but not so many that a first-timer would feel left out. On the content side, we have always been and always will be a technical security conference, emphasizing cutting-edge, world-class research. This is an event for the community. Our focus[1] is on technical excellence, not politics or player hating.
t2’23 offers you an audience with a taste for technical security presentations containing original content. This is your chance to showcase the latest research and lessons in EDR simulation and healthcheck spoofing, hardware insecurity, inferring information from interference, cloud-scale forensics or persistence automation, new vulnerability classes, AI exploitation, virtual machines inside parsers, elegant exploitation of old vulnerability classes, modern defense, dropping zero days during presentations, state of the art memory corruption mitigation bypasses, evasions, safe cracking, satellite and space security, remote vehicle access, or whatever research lights up the eyes of seasoned conference visitors. For the hackers by the hackers.
The advisory board will be reviewing submissions until 2023-03-17. Slide deck submission final deadline 2023-04-20 for accepted talks.
First come, first served. Submissions will not be returned.
Quick facts for speakers
+ presentation length 60-120 minutes, in English
+ complimentary travel and accommodation for one person[6]
+ decent speaker hospitality benefits
+ no marketing or product propaganda
Still not sure if this is for you? Check out the blast from the past[2].
The total number of attendees, including speakers and organizers, is limited to 99. The Advisory Board recognizes[3] that the OG Finnish sauna culture is an acquired taste and can promise a lack of sweaty, partially or fully nude sauna-goers at all conference functions.
[0] hunter2
[1] https://t2.fi/about/eula/
[2] https://t2.fi/schedules/
[3] https://youtu.be/Oj8JBBAM5jY?t=40
[6] except literally @nudehaberdasher and @0xcharlie
How to submit
Fill out the form at https://if.t2.fi/action/cfp
How to register
Buy your ticket at https://t2.fi/registration/
(Note, this is a continuation of our previous story chapter since sometimes
it's more fun to read fiction than to wonder what's going on these days
with Cloudflare or whatever.
https://lists.aitelfoundation.org/archives/list/dailydave@lists.aitelfounda…
)
Chapter 2
_________________________________________________
Landing in Miami is like visiting a tier of hell just below Limbo. It is
not so much saturated in evil as the established gateway to more evil
places. As you disembark from your flight you can almost see a direct line
from providing no-questions-asked banking to drug dealers in the eighties,
to offering an endless series of apartments (aka money hiding spots) to the
Venezuelan upper class, to the current endless series of crypto companies
headquartered in the newly hip Brickell office spaces next to SmileDirect
and fancy brew pubs.
In the sense that NYC deals in finances, San Francisco in software
companies, and Boston in "higher education", Miami deals in your more
generic small-scale scams as the underlying substrate upon which the rest
of the economy is based. The tropics engender a sort of flexibility and
adaptability which is about finding new scam-niches and exploiting them
before anyone else has caught on.
But your meeting here is not about crypto-coin or real estate built with
permeable concrete guaranteed to spall in the face of salt-water-laden
winds. It's with a company building testing software of all things. "Boring
is rewarding" you say to yourself, as you drive past a literal graveyard to
a small joint called "Hush" which you give an approving nod to.
Hush serves fried alligator, which tastes like fried anything, as you sit
across from your lunch companions, Stewart and Amy. They are drinking beers
you've never heard of, and they lay out their scheme, without regard to
OPSEC since nobody in this restaurant other than you would likely
understand it.
"We've been building a large set of unit testing libraries for
cryptographic primitives, lots of complicated string building stuff,
machine learning, you name it."
"Great," you say. "Always good to have quality testing libraries." But they
exchange a look and you realize you've misinterpreted them.
"Our public libraries have a tendency to ... sometimes think things are
very well written and secure when they are ... not. It's just sometimes our
unit testing has bugs, you know? We have really good documentation in a lot
of languages though. And great support. 24/7. Discord, Slack, forums, you
name it."
"So the theory here is you don't target any particular software in the
supply chain? You just encourage bad testing practices?" You're pondering
their value, while at the same time trying to think about what alligator
actually tastes like under all the grease. The flavor, as far as you can
determine, is "Chewy".
Stewart struggles for a second to get the words out, like a huge machine
optimized for literally anything other than the current task of explaining
things to other humans using words. "Sometimes it's best when the check to
see if ASLR is enabled doesn't actually work, so your bugs that you find
have a chance to be exploitable. We're not in the business of putting bugs
in things. We just make the bugs you do have....better."
"I see. What about code we actually want to be secure?"
"I recommend everyone local uses our FIPS certified library, which,
admittedly, is expensive and does the same thing as our free code, but
maybe with more effort put into the actual tests themselves." Amy says this
to you without any hint of chicanery, as if this is a simple fact, almost
not worth saying. It is, you realize, a very tropical CONOP.
"I will make sure this is required by various regulations after you are
funded. I'll have my team send you the paperwork," you say. And with that,
the conversation moves to pleasant nonsense as you internally contemplate
your next flight - out of here and into the cold.
-dave
If you were at a talk at Defcon this year in the Policy track, you probably
heard someone talk about how they, as a government official, are there to
address "market failures". And immediately you thought: This is a load of
nonsense.
Because that government official is not allowed to, and has no intentions
of, addressing any market failures whatsoever. If the Government was going
to address market failures, they'd have to find some way to stop every
cloud provider from making their security features the upsell on the
Platinum package. They'd have to talk about how trying to get into
different markets means every social media company faces huge pressures to
put Indian spies on their network.
Obviously you know, as someone who did not emerge from under a rock into
the security community yesterday, that the answer to having a malicious
insider on your network is probably some smart segmentation, which we call
"Zero Trust" now.
But Zero Trust is expensive! And most social media companies are not
exactly profitable as the great monster known as TikTok has eaten every
eyeball in every market because the very concept of having people
explicitly choose who their friends are is outdated now.
In fact, as everyone is pointing out, almost all companies you know are in
this position! They're cutting costs by sending jobs overseas while
spending huge amounts of money propping up their stock prices and paying
their executives to sell them to a dwindling market of buyers. Private
Equity companies spend every effort on squeezing the last dollar out of old
enterprise software by exploiting the lock-in they have on small
businesses.
And as critical as Twitter is, we have the exact same dynamic with our
privatized water and power companies - who have no plans to make strategic
investments in security or anything really - which is why on public calls
you can hear them humiliating themselves asking Jen Easterly to absorb the
entire costs of their security programs.
The ideal practice for all of these companies is to offload their costs
onto the taxpayer, which is why instead of investing in security, they cry
for the FBI to go collect their bitcoin from whatever ransomware crews are
on their network this week using offensive cyber operations that themselves
cost the government an order of magnitude more than the bitcoin is worth.
As you're sitting in that Defcon talk, listening to someone from government
talk about how they only want to regulate with the "input of industry" or
something, you have to wonder: if this is every company we know, maybe the
market failure isn't just how hard it is to buy a good security product
because they all abuse the copyright system to avoid any kind of
performance transparency. Maybe it's also how hard it is to SELL a good
security product because every single company is trying to cut their budget
to the exact minimum amount that will allow them to tell the FBI they did
their best, and the FBI needs to go out there and pick up their slack.
-dave
As you wander the halls of the inaptly named Caesar's Forum, amidst a
living sea of the most neurodiverse Clan humanity has ever seen, you cannot
help but stop for a second to close your eyes amidst the cacophony and
mentally exclaim, "Look. Look at the world we have created!"
Sitting in the one cafe in the Paris hotel with food, a
tattooed thirty-something who has been to Defcon twice gives you advice on
how to do the conference. "Take the unirail," they say. "Also, you should
have a hacker name! Mine is 'youngblood'."
"Noted!" you respond. These are good ideas. The unirail in particular,
probably, because Vegas is overflowing - and decent food options and
anywhere to sit that is not beeping at you or showing grungy dystopian TV
ads the Cyberpunk 2077 developers would find over-the-top are impossible to
come by, making the conference ten times more exhausting than usual.
In that sense, you miss the Alexis Park days, sitting with Halvar Flake
next to a pool where everyone was more larval than they knew, watching
Dildog launch BO2K to a thousand screaming fans in the same room Dino Dai
Zovi explained Solaris hacking an hour earlier.
Some of the best talks this year had no attendees at all - Orange Tsai's
talk was over Zoom, to a huge room, but with few butts in the seats. There
were a hundred "Villages" it seemed like, living a half-life between
physical space in the conference room and a Discord channel.
Defcon may be the worst and best place to learn anything in that way - the
environment is hopelessly chaotic, with two talks happening inches away
from each other, and only feet from a DJ pumping out house music. But
perhaps the best environment to learn in is the one in which you are most
inspired?
My friends, we've conquered the world. What's next?
-dave
Right now, there is, to put it mildly, an ongoing discussion between
proponents of coercion and deterrence in cyber policy, and adherents of a
new theory, called *persistent engagement.* Maybe the sum total of the
people in the argument is less than a thousand, but as academic circles go,
it heavily influences the US Defense Department and IC, and through that,
the rest of the world, so it is fun to watch. Also obviously it has added
to infosec Twitter drama, which of course is the most important thing in
the whole Universe.
But while I try to keep this list technical, I wanted to put it into
context for people here, so they can better appreciate the Twitter drama.
But before I do that, I want to talk about a Defcon talk I attended. I'm
not going to say WHICH talk, since it was under Chatham House Rule, but it
was about cyber policy. When I pressed someone on an aspect of their policy
efforts and how it implicated technical experts without involving their
feedback (export control around penetration testing tools) they said "Well,
that was more an expression of our country's VALUES and so we didn't need
to listen to our technical experts".
And I thought that was very interesting! Because the technical community is
highly connected and paying attention to these sorts of things in a way
that didn't used to be the case. If your message on one issue is going to
be "When our values and the technical community's values don't align, we
don't bother listening to them" then they will all know immediately, and
all your other outreach efforts might as well be wasted air.
And this is true across the board - disintermediation via cyber is now a
universal truth.
I believe you can come at the theories of persistent engagement by looking
at it from a different perspective: Instead of saying "Here's a bunch of
data about what we see in cyber, and it looks a certain way, and that way
requires a new way of thinking" you ask yourself whether the fundamental
way of dealing with conflict in international relations literature can be
simplified down to coercion and deterrence when the system is a highly
connected network. In other words, the game theory math you would use for
dyads and bilateral relationships is great for looking at nuclear conflict
because that's how the problem is presented, but doesn't scale to the
problems we have for cyber conflicts, which are about emergent effects of
much more complicated systems.
That's why it's not just different, but downright wrong, to talk about
offense-defense balances when we look at cyber or cyber-enabled conflicts.
It's why the previous international relations work on deterrence and
coercion just doesn't apply cleanly, if at all. On one side (the wrong
side) you have people saying "Cyber is not strategic because it cannot hold
ground like infantry can!" and on the other side you have people building
international relations theories based on cycles of attack, on responses
and counter-responses to aggression in the cyber domain because you can
lead an entire country around by the nose ring that is TikTok.
At some level, we are going to have to stop talking about offensive cyber
operations as a corollary of SIGINT capability, and start looking more
holistically at COGINT.
To sum it up: Complexity in connectivity introduces phase changes in
systems. We now live in a highly connected world, and this means we need
new paradigms of international relations, whether you are under Chatham
House Rule or not.
-dave
People think that finding vulnerabilities is about finding holes in code.
But at some level it's not really about that. It's about understanding that
the code itself is a hole in the swirling chaos of the world and just
letting a little bit of that chaos in allows you to illuminate the whole
thing.
Spending time in Seattle is a little bit like buying a pair of high-powered
binoculars to look down the train tracks at that weird light that's heading
towards you. Seattle is a city perpetually timeless and jet lagged - as far
away as a giraffe's head from the country's dual beating hearts of New York
City and DC.
The city rests on an absolute bedrock of code. Code that feeds on lives
everywhere as voraciously and implacably as a blue whale gulping krill. In
that sense, the inhabitants of Seattle are those who have realized it's
better to be on top of the whale than inside it. It is perhaps why all the
architecture is as boxy as an early software package. If you pulled the lid
of any of the buildings next to the water you might see the packaging for a
Windows 95 CD ROM, or a bunch of floppies with a forgotten database.
When you go running past all these horribly efficient buildings down to the
water on the lone sunny day, you will be surprised with a bunch of naked
people, stripping down next to an old industrial park turned into a
playground, covering themselves in body paint before some eldritch
streaking ritual for the parade over the hill. Around them buzz
photographers immortalizing the moment. Memes infecting other memes like an
endless series of smaller wasp larvae.
Flying back to Miami, amongst the bridal parties and vacationers, over an
endless survey of drying rivers and lakes, the ravages of unchecked climate
change exposing raw pale edges the exact beige color of army pants. The
whole country - a patchwork tinderbox of exposed nerves.
With the right kind of eyes you can see a little bit of chaos being let in.
-dave
If you've walked through the Underworld long enough, you've run into
demons. Or maybe it's the other way around - by running into enough demons,
you might realize you are walking through the Underworld. And by making
friends with them, if you are lucky, you might realize you are a demon
yourself.
[image: image.png]
My brother in Zeus - this is just tempting the Fates.
Every so often an exploit from the Underworld is found. Maybe one or two a
year are dragged screaming curses in a long-dead language out into the
sunlight, pinned against a Kaspersky GReAT blogpost, and vivisected for the
world.
Sometimes these are simple bugs, with complex exploitation chains.
Sometimes these are complex bugs, but with reliable simple exploit chains.
Occasionally you see a host of bugs, all linked together like fire ants
fording a stream. If you've walked through the Underworld enough you'll
simply nod in recognition of them, perhaps stop to admire the artwork of
the Runes carved into their skins by some unknown spellcrafter.
My point is this: it doesn't matter what the real-world utility is for an
exploit, because demons don't care. They operate partially in the future,
perhaps. Or maybe ignoring real-world utility evolved as a sense of
necessity of staying ahead of the eyes hunting for them. I'm not sure. But
my rule - a core axiom of persistent engagement - is that if it can be
done, it is being done already.
-dave