I like how they keep saying first of its kind. It was actually the DOD, via Roger Schell & Steve Walker, that established the first standards for certifying the security of systems, based on what stopped brilliant pentesters at that point. Also among the people that helped invent INFOSEC, so some credibility there. ;) Landwehr's "The Best Available Technologies for Computer Security" (1983, link below) is an approachable text covering the background, example systems constructed, and requirements for secure systems as of that time.
Esp. see the Advice for Developers section. For bonus points, compare those recommendations to today's "secure" apps or systems to see how much today's are missing. Then realize that the requirements and techniques for knocking out all the known risks have only increased since then. :) Far as the criteria, it worked, with the private sector producing one high-assurance, secure system after another until NSA policy-makers killed the ecosystem:
Since then, assurance dropped greatly across the board, with both FOSS and non-high-assurance proprietary software forgetting about precise specs, verification, covert channels, and SCM entirely. Some of that was gradually rediscovered (ex. "side channels") by the same people saying the original criteria was useless red tape. They keep showing an interesting ability to hold two contradictory ideas in mind simultaneously that way. The high-assurance field continues to move on with EAL7-style work, with demonstrators in hardware (Rockwell Collins' AAMP7G), smartcards (MULTOS/Caernarvon), crypto (Galois' Cryptol), kernels (seL4), hypervisors (CertiKOS), parsers (LANGSEC), static verification (e.g. IRONSIDES DNS w/ SPARK), browsers (Illinois Browser OS by Tang), and so on. Definitely worth copying proven approaches in any new organization designed to certify security. Just eliminate the red tape that was there then, and increase focus on evaluation of the specific product and code. I expect Mudge to pull that off at least.
Far as how to redo the evaluation process, I have a write-up below on that. Ignore the first link in the Schneier post, as the Pastebin below is the cleaned-up version of it. It has the list of security problem areas and the methods I used in evaluation. Originally a counterpoint to someone thinking secure coding was all that mattered.
Firefox, by contrast, “had turned off [ASLR], one of the fundamental safety features in their compilation.”
Firefox has had mandatory ASLR since 2012.
After more investigation, this appears to be about Mac OS X specifically (which isn't pointed out anywhere in the article), where the Firefox executable doesn't have the PIE bit set. But note that the Firefox executable does almost nothing; all the web rendering machinery is in a huge dylib, which is loaded with ASLR.
And it is worth noting that the reason PIE is not enabled for Firefox on Mac OS X is that PIE executables are not supported on OS X 10.6, and Firefox still runs on OS X 10.6. That won't be true in Firefox 49, due for release in 6 weeks, and that version has PIE enabled, by the simple fact that MACOSX_DEPLOYMENT_TARGET was changed to 10.7 and the toolchain defaults to building PIE in that case: https://bugzilla.mozilla.org/show_bug.cgi?id=1269790
I know it is debunked lower in the thread (so long as you do not load suspicious extensions, as explained), but sandboxing browsers seems prudent, and it is sadly why I am afraid maybe even Qubes is not enough anymore! Haha.
Securing an application that merges so many functions of a computer is conceptually a lot more complicated than it appears.
He starts with a CVE and is routinely asked "but why no SELinux?" Conversely, he does not get into the fun part: the JIT, which must make ASLR all kinds of fun.
I was aware it is an apples and oranges comparison, but it all kind of sucks with something as complicated as a browser.
The article also doesn't say where the difference between Chrome and Safari on ASLR subscore comes from, or how Firefox does better on the Stack Guards subscore than Chrome (which I'm very surprised about, although maybe that includes the JS stacks as well?).
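I don't know exactly what their Stack Guards metric measures, but you can check a binary for stack-protector instrumentation yourself. A minimal sketch, assuming Linux with gcc and binutils (the file names are made up for the example):

```shell
cat > vuln.c <<'EOF'
#include <string.h>
/* Local char buffer: -fstack-protector-strong instruments this frame. */
void copy(const char *s) { char buf[64]; strcpy(buf, s); }
int main(void) { copy("hello"); return 0; }
EOF
gcc -fstack-protector-strong vuln.c -o with_ssp
gcc -fno-stack-protector     vuln.c -o without_ssp
# The canary check shows up as an undefined __stack_chk_fail symbol:
nm with_ssp    | grep stack_chk
nm without_ssp | grep stack_chk || echo "no canary"
```

A scanner like theirs presumably looks for exactly this kind of artifact in the shipped binaries, which would not say anything about JS stacks managed by the JIT.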
As far as I'm aware, base Firefox has had ASLR for much longer—just extensions (because they can run arbitrary code) could disable it by loading DLLs that don't support ASLR. (DLLs because Windows was the last platform that changed to mandatory ASLR.)
Right, so it would actually have been hard to mess that up.
The only platform where Firefox has limited ASLR is Linux, where PIE is disabled due to issues with some desktop managers not detecting that PIE executables are executable. So the default builds disable it for maximum compatibility. Distros like Debian and Fedora have fixed the bug on their side, though, and just ship PIE Firefox executables.
The graph mentions Safari, so you'd think Mac OS X. But AFAIK the ASLR implementation just works on Mac OS X?
So I have no idea what the graph is supposed to mean.
For completeness, distros didn't actually fix the issue; they just don't have to care about it: when you install a distro package, you get the application icon in your desktop environment. You don't when you download a Firefox tarball and unpack it.
I think that the most important outcome is the pressure that will be placed upon software engineering teams to improve their products and start paying attention to security. They've gotten away without doing it for _far too long_.
We may finally see a radical abandonment of C and C++, where it makes sense to do so.
I'd be cautious when interpreting the results, since their analysis (as they themselves admit) is not geared towards negatives (absence of vulnerabilities).
The obvious example here is Google Chrome. It has received a very high score according to their methodology but it gets exploited, publicly at pwn2own, _year after year_.
I think that armitron's point is that Chrome gets exploited, like everybody else, despite being named in the article as the browser with the highest score. High score != "no vulnerabilities"
That may be the case (and the scores from Mudge's methodology may be interpreted in such a way), but it is still a grey area, qualitative rather than quantitative, and I feel it is perilous to use them in such a fashion.
After all what does "significantly more effort" mean? Yes there is raising the bar but do we know if _in practice_ those countermeasures make any sort of meaningful difference especially considering that a lot of the countermeasures can be defeated (ASLR -> information leaks, sandboxing -> hitting the kernel). Taking into account that the vast majority of cutting-edge offensive security research happens behind closed doors, the public has almost no visibility in this area.
If a South Korean teenager can break Google Chrome at his leisure, what about more resourceful adversaries that do not even have to be nation states?
Of course corporations like to talk exactly this kind of talk, raising the bar, "significantly more effort", better-than-the-rest and so on, since it lets them harbor the illusion that they're doing something but another way of looking at the data is this:
Google Chrome was first released in 2008. 8 years later, and it is _still_ remotely exploitable by solo individuals or small teams that release their exploits for not-a-lot-of-money. I'm singling out Chrome here because apparently it's the browser with the highest score in the OP report, but of course every other browser has the same issues.
Collectively, since the inception of the web, we have not had a browser that wasn't remotely exploitable. Can we do better? Judging from other critical software, it appears that we can.
Why haven't we??
Partly because the necessary processes haven't been there and I hope that the Mudge project will change that.
Let's move beyond smoke & mirrors to actual security.
This is why I think Servo is particularly exciting - while it's still a few years off being a practical browser, it should be immeasurably more secure than the existing crop of crufty C++ codebases. I think it (along with Rust) is by far the most important thing that Mozilla is working on.
I do not share your optimism re: Servo, partly because everything that has ever come out of Mozilla has been a disaster and I can't shed my prejudice, partly because Rust allows "unsafe" code (some feel this is all it takes) and partly because they've repeatedly said they'll keep using Spidermonkey (which is a security clusterfuck).
I would like to be proven wrong however, if not by them, then by others.
The payout ranges image on http://krebsonsecurity.com/2016/05/got-90000-a-windows-0-day... is somewhat quantitative. Chrome with sandbox exploit is 60% more valuable than the same for IE or Safari -- since Safari and IE are on the same tier, this is probably more due to difficulty of exploitation than due to user base.
To be fair, the browsers get exploited because of vulnerabilities in flash year after year.
I don't want to suggest there's perfect security, there's not, but they aren't exactly attacking builds with ublock, click to play flash, and disabled javascript.
The profile of the systems is a concession that some builds would be too hard to consistently attack.
My only concern is that a rating system may lead people to have a false sense of security when using software. Some very exciting steps in the right direction, though.
I would say, given the general behavior of most users, people already have a false sense of security.
Though maybe not. Maybe security isn't as big of a deal as we think. It's not like the world is ending. So many people are so incredibly insecure that you'd think more would come out of it and people would be incentivized to become more secure. There is a sense in which we are as secure as is cost effective.
The problem is the externalities of insecurity. The company who is insecure doesn't suffer all the consequences of a security breach, so they don't spend as much money to mitigate it.
If software security was similar to say building/fire safety or food safety where serious cost/legal consequences were attached to failures, we'd likely see more spending...
This is my biggest worry too. It's easy to turn on ASLR and PIE and SSP and whatnot and still have tons of obvious bugs in your actual business logic.
I'm sort of worried that these automated metrics are too easy to game, and that all it will teach companies to do is use different compiler settings without actually caring about, say, hashing their passwords or authenticating their cookies.
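To make the cookie point concrete: no compiler flag catches an unauthenticated session cookie, because the fix lives entirely in application logic. A minimal sketch in Python (the names and secret are illustrative, not from the article):

```python
import hashlib
import hmac

SECRET = b"server-side-secret"  # hypothetical key; must never reach the client

def sign_cookie(value):
    """Append an HMAC-SHA256 tag so the client cannot forge the value."""
    tag = hmac.new(SECRET, value.encode(), hashlib.sha256).hexdigest()
    return f"{value}|{tag}"

def verify_cookie(cookie):
    """Return the value if the tag checks out, else None."""
    value, _, tag = cookie.rpartition("|")
    expected = hmac.new(SECRET, value.encode(), hashlib.sha256).hexdigest()
    # Constant-time compare to avoid a timing side channel.
    return value if hmac.compare_digest(tag, expected) else None
```

A scanner that only inspects binaries for ASLR/SSP would rate a site identically whether or not this check exists, which is exactly the gaming worry.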
In a previous job we had to run our apps through Veracode to supposedly do this exact thing. I didn't think much of its advice at the time (4 years ago), but how is what Mudge is doing any different?
Most of the metrics, etc. are not new [1]. The two new things are:
1. Mudge has enough credibility that, at least for now, most security people trust his assessments.
2. He's willing to publicly assign grades, and take all the backlash and legal heat that entails.
Really it's the second one that will probably make the most difference here. If his organization manages to survive the first couple years without being sued into the ground, I expect it will have a big impact on the software world.
[1] From what I can tell from reading the article. They could certainly be doing more behind the scenes that didn't get reported.
I like the idea of using fuzzing "to show a direct correlation between programs that score low in their algorithmic code analysis and ones shown by fuzzing to have actual flaws".
I hope they'll use AFL, and publish the parameters / settings, so others will be able to repeat the experiments.
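As a toy illustration of the correlation idea (not AFL itself, just the dumb-fuzzing core of it): throw random inputs at a target and record which ones make it crash. The `parse` function here is a hypothetical buggy parser invented for the example:

```python
import random

def parse(data):
    """Hypothetical buggy parser: blows up on a specific 5-char prefix."""
    if data.startswith("AB") and len(data) > 4 and data[4] == "!":
        raise ValueError("boom")
    return len(data)

def fuzz(target, iterations=20000, seed=1):
    """Feed random strings to `target`; collect every input that raises."""
    rng = random.Random(seed)
    crashes = []
    for _ in range(iterations):
        s = "".join(rng.choice("AB!xy") for _ in range(rng.randint(0, 8)))
        try:
            target(s)
        except Exception:
            crashes.append(s)
    return crashes
```

AFL adds coverage feedback and input mutation on top of this, which is why publishing the exact settings matters: crash counts from any fuzzer depend heavily on seeds, dictionaries, and run time.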
I think that would fall in the third type of report they are planning to sell: "The third report, which they plan to sell, will contain raw data from their assessments for anyone who wants to recreate them."
Something like this if done well would be a huge step forward. Maybe we can stop totally ruining the network with local isolation and fascist firewall defaults if we can get some sense of standards in place about the quality of code required to bind a network socket.
Please add routers, IoT, and other embedded software to the list. That's as much if not more of a rat's nest than software on laptops and phones, especially given the lack of OS protection in those environments.
http://www.landwehr.org/1983-bats-ieee-computer-as.pdf
http://lukemuehlhauser.com/wp-content/uploads/Bell-Looking-B...
https://www.schneier.com/blog/archives/2014/04/friday_squid_...
http://pastebin.com/9AwDLSTY