>>What matters is speed of the exploit and speed of patching. And I see the humans (patching) on the losing side of this race.
This is probably an independent issue (imvho).
Re LLMs and present AI / ML regime, my only public comment is that
we're in the Hindenburg [1] era... caveat emptor. Another insightful
paper that probably will be ignored this summer:
https://arxiv.org/abs/2308.03762 (author:
https://people.csail.mit.edu/kostas/ )
[1] - https://en.wikipedia.org/wiki/LZ_129_Hindenburg
After spending some time looking at "Secure by Design/Default" I have no
doubt many of you feel like something is missing - something that's hard to
put your finger on. So you go back to the treadmill of reading about bugs
in Palo Alto devices, or the latest Project Zero blogpost, or something the
Microsoft Threat Team is naming RidonculousBreeze, or whatever.
For those of you who chose to read the latest Project Zero post, one way to
look at Mateusz Jurczyk's vast destruction
<https://googleprojectzero.blogspot.com/2024/04/the-windows-registry-adventu…>
of the Windows Registry API, resulting in what can only be described as a
"boatload" of Local Privilege Escalations, is that securing legacy code is
hard, few people want to do the reverse engineering work necessary to
understand and fix complicated and critical
old code, and our investments in automated security engineering toolkits
and better software development practices, while valuable, have not paid
off in the kind of hardened Rust-only systems we dreamed about.
Another way to look at this kind of wholesale destruction, a true tour de
force, is that you cannot both put advertisements in your Start menu, and
develop a secure operating system, for reasons that are more philosophical
than technical.
It's ironic that it is often Google that demonstrates this about other
vendors, when of course, the lack of any ad blocking in Chrome or Android
presents the exact same dilemma. You can't both make your systems secure,
and sit beside the great river of Advertising Revenue with a ladle, dipping
it in every quarter to fill up a cauldron of greater and greater value for
the shareholders. It's hard to draw a straight line from an internal
PowerPoint slide saying "Ads in the Start Menu are a good idea, actually"
to the inevitable conclusion of 0days, ransomware, and US Government emails
being read by some old Russian who understands cryptography and Azure
keys better than you were hoping.
But in some respect this cause and effect is as fundamental and simple as
how that tattoo on your arm is actually there because one night you decided
to start off with shots of Limoncello.
When Project Zero started, and even when it got to the towering behemoth of
talent that it is now, I knew people in the offensive industry who were
quite scared of it - of the possibility that a large and funded team of top
researchers, with access to one of the only five real computers on the
planet, could drain the lake of software vulnerabilities we all fished in.
But I had no such fears. An organization so dependent on advertising
revenue to survive can no more fix systemic security issues than a Sperm
Whale can medal in Olympic Skiing. It is contrary to their very nature,
although they will probably smash a bunch of trees on the way down.
Like many of you, I spent my Saturday porting code to use LLAMA3:70b,
largely by annoying my 18yo with questions about ollama and Docker, since I
find modern Linux system administration as foreign as an octopus finds
calculus.
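For anyone on the same treadmill, the basic loop is smaller than the system administration around it. A minimal sketch, assuming ollama's default REST endpoint on localhost:11434 and an already-pulled llama3:70b model (both assumptions about your local setup):

```python
import json
import urllib.request

# ollama's default local endpoint (an assumption about your setup).
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(prompt: str, model: str = "llama3:70b") -> bytes:
    """Build the JSON body that ollama's /api/generate endpoint expects.
    stream=False asks for one complete JSON reply instead of a token stream."""
    return json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()

def ask(prompt: str, model: str = "llama3:70b") -> str:
    """POST the prompt to the local ollama server and return the model's text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=build_request(prompt, model),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

With a server running, `ask("why is the sky blue?")` returns the completion as a plain string.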
But search engines, like surface warships, are clearly on their last legs.
They went from something you used every day, multiple times a day, to
something your LLM uses for you, as just one tool among many. It is, for
reasons that must be obvious even to executives drunk on the heady fumes of
their stock options maturing, hard to make money selling advertisements
that are only read by LLMs.
But having spent the better part of a couple years doing LLM work now, I
feel like I understand why these behemoths are investing so much money in
them, despite the obvious cannibalization of their cash umbilical. It's
because they can!
There's just not that many businesses that generate ten billion dollars of
revenue year on year to get into. You've got some elements of
manufacturing, tech, education, health care, video games. It's not a big
list. Apple gave up on manufacturing cars because the niche they wanted
(impracticably weird and expensive) was already filled by Tesla.
But by investing in LLMs and AI in general you kinda get to put your thumbs
in every other billion-dollar business all at once. It's a straight line
shot from something you already know, to the next place. So of course, they
are throwing dollars at it like it was the only thing they knew how to do.
And what we get is the pretentious superiority of ChatGPT, or the
sanctimonious holiness of Claude, or the ever-sadness of Gemini, the
impertinence of Mistral or the trollishness that is LLAMA. A world of
chaos, yet something so familiar.
-dave
On Monday, I and 400 other people, including many on this mailing list,
attended Sophia's funeral in a huge church on the Upper East Side of NYC.
Although I grew up in a Jewish household, I am not religious, and the last
time I went to a church was also with Sophia, in Jerusalem, where we
wandered through various landmarks until we ended up at the Church of the
Holy Sepulcher, one of the holiest sites for Christianity.
We waited in a line of fellow pilgrims, the room lit only by echoes and dim
lamps, to touch the Stone of Anointing, where Jesus's body was said to have
been prepared for burial. The rock was wet, from some unknown source
below. We each knelt and touched it briefly, respectfully. The concept of
prayer is foreign to me, but I spoke to myself for a moment with my hand on
the slick rock, and tried to feel.
Afterwards, Sophia said to me, "You can't expect to _feel_ anything,"
something I've often pondered since that moment.
Sophia had many friends, and for many people, Sophia was one of their
closest friends. No one person can own her memory, but like many others,
the moments I had with her were important to me. Now that she's gone, these
moments are what we have left.
The community that came to NYC to support Sophia's family, and each other,
was just as stunned as I am. When not sharing memories about Sophia, we
talked about exploits, and large language models. I saw people I hadn't
seen for over a decade, or people I knew, but had no idea knew Sophia. Some
people traveled from across the entire globe to say goodbye. The community
was all on the same path, and the many photos of Sophia laid out on the
reception tables were like a shattered mirror portal into her life.
Woven into the fabric of who she was, was Sophia's work, and her desire to
do what was necessary to get the work done. She did not start a company to
get rich. She lived a life of quiet professionalism and when global events
moved, she was often part of that fell force that moved them. The shape of
the world is a part of her legacy, but even more so I see her legacy as an
example that you can live a life of sacrifice and integrity. You can be
there for people. You can make it look effortless.
This week the Host gathered around a silence folding endlessly upon
itself. I feel privileged to have known her, to have shared the time we
shared. I know those of you reading this who were also a part of her life
feel the same.
Rest in Peace, Sophia. We miss you.
-dave
Like everyone I know, I've been spending a lot of time neck deep in LLMs.
As released, they are fascinating and useless toys. I feel like actually
using an LLM to do anything real is still your basic nightmare. At the very
minimum, you need structured output, and OpenAI has led the way in offering
a JSON-based calling format that allows you to extend the model with
functions covering the things an LLM can't really do (e.g. math, or access
to the web or your bash shell). In real life you are going to use this
through LangChain or some similar library.
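The core of that calling format is simpler than the libraries make it look: you hand the model a JSON schema describing a tool, it replies with a JSON object naming the tool and its arguments, and your code executes the real work. A minimal sketch of that dispatch loop; the schema shape and names here are illustrative, not any particular vendor's exact API:

```python
import json

# A toy tool registry. The model never does the math itself; it only
# fills in the arguments, and our code runs the actual function.
TOOLS = {
    # Restricted eval as a stand-in "math" tool for the sketch.
    "eval_math": lambda expr: str(eval(expr, {"__builtins__": {}})),
}

# The schema you would hand to the model so it knows what it can call.
TOOL_SCHEMA = {
    "name": "eval_math",
    "description": "Evaluate an arithmetic expression",
    "parameters": {
        "type": "object",
        "properties": {"expr": {"type": "string"}},
        "required": ["expr"],
    },
}

def dispatch(model_reply: str) -> str:
    """Parse the model's JSON tool call and run the matching function."""
    call = json.loads(model_reply)
    fn = TOOLS[call["name"]]
    return fn(**call["arguments"])

# e.g. the model, given TOOL_SCHEMA, might reply:
reply = '{"name": "eval_math", "arguments": {"expr": "17 * 3"}}'
```

Here `dispatch(reply)` hands `"17 * 3"` to the tool and returns `"51"`; the result then goes back into the conversation as a tool message.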
You can do this sort of thing with Claude (a better model than GPT-4 in
many respects for code), but it's janky, as the model wasn't specifically
fine-tuned for this purpose yet. Your best bet, as you see everyone do, is
to force it to start its reply with a curly bracket "{", but even then it's
going to pontificate about its reply after it sends you the JSON object you
want, if you're lucky and it uses your format at all.
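One defensive habit that helps with the pontification problem: instead of trusting the reply to be pure JSON, scan it for the first balanced object and ignore everything around it. A small sketch (my own helper, not part of any vendor SDK):

```python
import json

def first_json_object(text: str):
    """Pull the first balanced {...} object out of a chatty model reply,
    ignoring any prose the model adds before or after it."""
    start = text.find("{")
    if start == -1:
        raise ValueError("no JSON object found")
    depth = 0
    in_str = False
    esc = False
    for i, ch in enumerate(text[start:], start):
        if esc:                      # skip the character after a backslash
            esc = False
            continue
        if ch == "\\" and in_str:
            esc = True
            continue
        if ch == '"':
            in_str = not in_str
            continue
        if in_str:                   # braces inside strings don't count
            continue
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:           # first balanced object is complete
                return json.loads(text[start:i + 1])
    raise ValueError("no complete JSON object found")
```

For example, `first_json_object('Sure! {"a": 1} Hope that helps!')` returns `{"a": 1}` no matter how much the model editorializes around it.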
Claude is more based on XML than JSON, which, if you think about how LLMs
work, makes a ton of sense. To an LLM, {' may be one token, or {{{ may be
a single token. In fact, let's test that:
>>> import tiktoken
>>> encoding = tiktoken.encoding_for_model("gpt-4")
>>> encoding.encode("{")
[90]
>>> encoding.encode("{{")
[3052]
>>> encoding.encode("{{{")
[91791] #one token
>>> encoding.encode("{{{{")
[3052, 3052] #two tokens
>>> encoding.encode("{{{{{")
[3052, 91791] #two tokens
>>> encoding.encode("{{{{{{")
[3052, 3052, 3052] #three tokens
>>> encoding.encode("{'")
[13922] #{' is one token
Yeah, so like, on one hand, that's great. Very optimized compression on
token lengths. But on the other hand, it is very confusing for the model to
train on and understand! You can see why XML would be much more natural!
<SOME WORD> is more likely to be three tokens, which makes creating clean
output much easier. Claude's focus on XML probably makes it "smarter" in
some ways that are hard to prove with math.
>>> encoding.encode("<high>")
[10174, 1108, 29]
>>> encoding.encode("<html>")
[14063, 29]
>>> encoding.encode("</html>")
[524, 1580, 29]
>>> encoding.encode("/html")
[14056]
>>> encoding.encode("</high>")
[524, 12156, 29]
>>> encoding.encode("high")
[12156]
Also, of course, I highly recommend Halvar's latest talk (which is highly
relevant):
https://www.youtube.com/watch?v=xA-ns0zi0k0&t=4s
-dave
The security community (aka, all of us on this list) still rages with the
impact of Jia Tan putting a sophisticated backdoor into the xz package, and
all of the associated HUMINT effort that went into it. And I realized from
talking to people about it, especially people in the cyber policy realm but
also technical experts, that there's a pretty big gap when it comes to
understanding why someone would put in a backdoor at all, instead of adding
many bugdoors.
Some Background:
1. A post
<https://cybersecpolitics.blogspot.com/2019/05/hope-is-not-nobus-strategy.ht…>
on what NOBUS means when it comes to backdoors.
2. Responsible offense from a bunch of Americans
<https://www.lawfaremedia.org/article/responsible-cyber-offense>
3. Responsible offense from the UK
<https://www.gov.uk/government/publications/responsible-cyber-power-in-pract…>
4. Responsible offense from the Germans
<https://www.stiftung-nv.de/sites/default/files/snv_active_cyber_defense_tow…>
5. A university banned from Linux
<https://www.theverge.com/2021/4/30/22410164/linux-kernel-university-of-minn…>
for contributing backdoors as part of a research project
So as with all areas of responsible offense, there is a tight connection
and contention between good OPSEC and responsible operations. In
particular, it is very easy to get yourself on a team for a big project,
and add code that introduces exploitable conditions, perhaps handles input
in a way that causes a memory corruption, or does authentication slightly
wrong in certain circumstances.
From an operational security standpoint, these bugdoors are easy to
introduce, and I don't know of a serious hacking group that hasn't played
with this - if for no other reason than to fix bugs that cause crashes
while you are trying to exploit some other, better bug. Reading the
original UMN paper, (which was under-appreciated for its time, despite
getting them banned from Linux!) you can see that it is not really always
about adding bugs, but often about adding enabling features for bugs that
already exist in the codebase, making them more reachable.
In some ways, attacking the open source community by hacking into
developers or repositories has been the traditional way of ancient Unix
90's hackers, who understand a web of trust the way a Polynesian navigator
understands the swells and currents between islands.
From an opsec perspective though, bugdoors have limits. Fuzzers can find
them, other hackers can find them, and once found, they can be used by
anyone with the skill to write the exploit. Likewise, using them is risky:
No memory corruption is 100% reliable, and when they fail, *they fail in
the worst way, in the worst place, at the worst time*. Likewise, the
traffic you may have to do to shape memory in the target host is likely to
be anomalous, and easily signatured.
And from a responsible offensive cyber operation perspective, you cannot
mathematically demonstrate that a bugdoor protects the hosts you target
from third parties. *Bugdoors are never a NOBUS capability.*
Ideally a NOBUS capability would allow you and only you to get in and avoid
replay attacks, but a close second is a simple asymmetric key of some kind
where the target ID is used as part of the scheme. The xz backdoor used an
elliptic-curve (Ed448) signature that included the target's SSH public key
<https://github.com/amlweems/xzbot>.
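The target-binding property is the whole point, and it fits in a few lines. A sketch of the idea (HMAC stands in for the signature purely to keep this stdlib-only; the real xz backdoor verified an asymmetric Ed448 signature, and asymmetry is what actually makes it NOBUS, since a symmetric key recovered from any one implant would let anyone forge triggers for all of them):

```python
import hashlib
import hmac

# Hypothetical operator key for the sketch. In the real scheme this would be
# an asymmetric keypair: the implant holds only the public half.
OPERATOR_KEY = b"operator-private-key"

def make_trigger(command: bytes, target_hostkey: bytes) -> bytes:
    """Operator side: bind the command to one specific target by signing
    over the command concatenated with that target's host key."""
    return hmac.new(OPERATOR_KEY, command + target_hostkey, hashlib.sha256).digest()

def accept(command: bytes, trigger: bytes, my_hostkey: bytes) -> bool:
    """Implant side: only accept commands signed for *this* host, so a
    trigger captured on the wire cannot be replayed against another target."""
    expected = hmac.new(OPERATOR_KEY, command + my_hostkey, hashlib.sha256).digest()
    return hmac.compare_digest(trigger, expected)
```

A trigger made for host A verifies on host A and fails on host B, which is exactly the replay resistance a bugdoor can never give you.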
Thanks,
Dave
Dear Daily Dave,
For a hacker conference, twenty years is a huge achievement — for a small conference, even more so. Over these years we’ve enjoyed speakers showcasing results from cutting-edge research, seen thought-provoking keynotes and bonded with other like-minded people from all over the world.
If we had to summarize the experience with one word, it would be gratitude. The speakers, repeat speakers, first timers or regular attendees, and friends of t2 — you have made the event and its atmosphere.
This was always a true community event – it’s organized for hackers, by hackers. The Advisory Board’s motivation and main driver was our love for the game. Creating a small event with a curated program, offering a backdrop for lobby bar and coffee break discussions was (and still is) our vision of a perfect infosec con. The chance to network with your industry peers was as integral a part of t2 as the high-quality content.
It’s rare you get the same level of interaction with current/former speakers and attendees alike at any other conference.
Tomi has fond memories of pretty much each and every year. Starting from the humble beginnings, when the legendary Phenoelit guys were kind enough to come and present at a conference that back then had no history nor reputation. How the Toolcrypt guys dominated the stage for years with their absolutely amazing research, and how Ivan Krstić (now Apple’s security samurai) shared his ideas on how modern security architectures should be built (iDevices anybody?), how the InversePath crew delivered some of the most enjoyable and hardcore research ever and well, you get the idea – the list just goes on and on – there are simply too many good memories to list here.
Mikko remembers learning from Ludde (during a t2 coffee break) how he works at Spotify. Then Mikko explained how impressed he was with Spotify’s early beta version, especially how you could skip parts of a song and it would still continue streaming instantly. Ludde nodded…and said ‘yeah…I coded that’.
Henri still reminisces how Halvar Flake took the time after his talk in 2010 to have a chat with him and Esa Etelävuori (RIP), despite Halvar feeling slightly under the weather in the midst of what later turned out to be the Zynamics acquisition by Google. In 2017 we enjoyed the late night/very early morning pizza in the hotel lobby bar with Dave Aitel, after proving him wrong.
Instead of dropping a surprise announcement sometime next year, or silently disappearing into the crowd, we wanted to let everyone know before this year’s t2 infosec that 2024 will be our last dance.
We have truly enjoyed the past two decades of world class cyber in Helsinki – all good things come to an end eventually. From the bottom of our hearts, a big thank you to all of you who made this happen. We are privileged to be able to call many of you out there our friends.
This goes especially to Dave, thank you for treating us so well over the years.
Tomi Tuominen
Mikko Hyppönen
Henri Lindberg
There seem to be a lot of people who think the problem with cyber security
is we aren't paying lawyers enough. This results in the current push for
software liabilities, or the need to click accept on cookies before we use
every website. It is natural for lawyers to want to feed the
next generation of associates, by regurgitating legal koans into their
mouths. These vomitous truisms pass for thought leadership when you go high
enough into the cyber policy clouds. "We don't know what we don't know!",
"You can't manage what you can't measure!" , "We need to be Secure to
Market not First to Market!", "Crawl, Walk, Run!"
These statements are the opposite of haikus, which when done right are one
crystallized emotional moment. This is why I think maybe we should hire more
poets to do cyber policy, instead of lawyers. What is "*i carry your heart
with me(i carry it in*" other than the first line to an exploit written in
a bash script we all forgot existed but our spirits remembered?
When LSD-PL said:
bn,a
bn,a
call
did that not carry with it an emotive punch? When we hack, do we not reach
within our internal well of hate to pull forth a tiny amount of darkness
and then send it into the world on tiny flaxen wings? I cannot do a survey
on this in any language that matters, but I look into the net and see all
the ancient hackers I grew up with still crouched in full armor, their
ocher swords smoldering.
This week in between cyber policy calls at 0500, I sat for hours, choking
on the byzantine syntax of LangChain attempting to wrestle an LLM into
submission. I kept thinking, what would Horizon do? What would Shubs do?
What would James Kettle do? What would Tiraniddo do? What would Chompie do?
What would Skylar do?
I told myself: They would *continue*, is what they would do.
-dave
Windows XP and Windows 2003 partial source code is out there on GitHub. With such a rich corpus of known vulnerabilities in those OSes and source code availability, surely there should be an amazing number of SAST/semgrep/CodeQL rules that take existing known exploits as input and find similar things, yet I don't seem to be able to find such projects.
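As a toy illustration of the gap: even the crudest version of such a rule is only a few lines, scanning C source for a pattern from a known bug class. Real semgrep/CodeQL rules work on ASTs and data flow rather than regexes; this sketch just flags every strcpy call as a candidate for review:

```python
import re

# Matches a call like strcpy(dest, ...) and captures the destination
# identifier. Purely syntactic: no type or bounds information.
PATTERN = re.compile(r"\bstrcpy\s*\(\s*(\w+)\s*,")

def flag_strcpy(c_source: str):
    """Return (line_number, destination) for each strcpy call found."""
    hits = []
    for lineno, line in enumerate(c_source.splitlines(), 1):
        for m in PATTERN.finditer(line):
            hits.append((lineno, m.group(1)))
    return hits
```

Point this at a leaked codebase and you get a worklist; the hard part, which is what the rules ecosystem should encode, is knowing which hits are reachable and exploitable.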
Surely, these two code bases should be the foundation of most good CS/cyber courses - like students finding new bugs, etc.
Is source code junk?
So I have a ton of thoughts on the CISA Secure by Design and Secure by
Default push that is ongoing, as I am sure many of you do. And the first
thought is: This is not a bad way to go about business as a government
agency in general. I think it's easy to ignore how fast the USG has changed
its business practices, showing an agility that few large organizations can
match, with Secure by Design as a case example:
1. Massive outreach to garner feedback (including at defcon, but also
via email, etc.)
2. Multiple rounds of editing of proposals
3. Actual people you could call and talk to about the proposal, with
their faces and positions listed right in the papers and blogs and lawfare
podcasts. If you were in DC today you could probably hit one of them up for
drinks or lunch or whatever.
4. Interaction across multiple stakeholder groups, including
internationally
5. The "right people" involved - and you can tell their backgrounds from
what they are annoyed about during their podcasts and other presentations.
(e.g. Bob Lord is very annoyed about XSS and obsessed with car safety,
which I'll dig into later). But also Jack Cable, Lauren Zabierek and Grant
Dasher are all worth listening to.
6. Clear executive support
So that's all good stuff. I thought I would post it as its own note because
it's rare to spend a moment to look at the government process and not
literally see sausage being made. :)
-dave
So I wrote a little draft essay on Secure By Default and opened it for
comment. I think one thing that we maybe forget in our community is that
some of the more fundamental bases of what we do never make it up to
policy-world. Langsec being the primary example. But also there's a huge
body of work in TAOSSA, Shellcoders, every offensive conference talk, etc.
that never gets put into context anywhere but in our clique.
Obviously feel free to just comment in-thread if you prefer, even if you
work at CISA:
https://mastodon.social/@dave_aitel/111779922142416342
Thanks,
Dave