ahy1 · on Aug 21, 2017

Actually everyone uses hard links. They just don't use multiple hard links to the same file.

ahy1 · on March 27, 2015

Those "few minutes" have lasted for at least an hour now.

ahy1 · on Sept 27, 2014

> This feature is documented under the -f option of the export built-in command. The implementation detail of using an environment variable whose value starts with "() {" and which may contain further commands after the function definition is not documented, but could still be considered a feature.

This undocumented implementation detail is also a limitation on the use of regular environment variables, and should be documented. When reading documentation about a mechanism, I expect that special magical strings which change behaviour of the mechanism are clearly documented. If such documentation had existed, someone might have noticed it and guarded against it.
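To illustrate, a guard of that kind could be as small as the sketch below. It only checks for the "() {" prefix named in the quote, the helper names and the sample environment are made up for the example, and it assumes that prefix is the only trigger, so it is not a complete defence.

```python
# Prefix taken from the quoted description; assumed here to be the only trigger.
MAGIC_PREFIX = "() {"


def looks_like_exported_function(value):
    """Return True if a value starts with the function-export prefix."""
    return value.startswith(MAGIC_PREFIX)


def scrub_environment(env):
    """Return a copy of env without values that carry the magic prefix."""
    return {
        name: value
        for name, value in env.items()
        if not looks_like_exported_function(value)
    }


# Hypothetical environment as a CGI server might build it.
example_env = {
    "PATH": "/usr/bin:/bin",
    "HTTP_USER_AGENT": "() { :;}; echo vulnerable",
    "HTTP_ACCEPT": "text/html",
}
clean_env = scrub_environment(example_env)
print(sorted(clean_env))
```

The scrubbed dictionary could then be passed as the child's environment instead of the inherited one.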

> Assumedly programs like apache filter out environment variables properly. But unfortunately, in the validation of input data, they fails to validate correctly input data because they don't expect that data starting with "() {" will be interpreted by their bash child processes. If there's a bug, it's not in bash, but in apache and the other internet facing programs that call bash without properly validating and controlling the data they pass to bash.

It isn't easy to validate and control data against an unknown magical feature in one of many possible shells.

> But on the other hand, it is free software and not difficult to check the source to see as the nose in the middle of the face, what is done. When reusing a component with missing specifications and lacking documentation, checking the source of the implementation should be standard procedure, but it has clearly not been done by Apache or DHCP developers.

I think the shell is specified in POSIX/SUS. Checking the source of all possible open-source shells would be a huge job. I don't know how they should check source code of the closed-source shells. I don't blame them for using the environment variables according to available documentation.

Edit: typo

fabulist · on Sept 27, 2014

I agree. This is an interesting idea, so I upvoted the paste. But I don't think this author knows how deeply the bug runs, either; the most recent way to exploit it is to export an environment variable of, say, ls to a bash function. [1]

Usually the number of toxic environment variables is considered to be finite; PATH, LD_PRELOAD, etc., etc. If the name of any executable on the PATH is dangerous, then the number of toxic environment variables is infinite -- are we to scan the entire PATH for each environment variable to make sure it isn't dangerous? What if the CGI script updates PATH?

There is no way to solve this problem with sanity checks. I've yet to peek at the source, but I'm told this feature is vital to implementing things like backtick operators. I think it is too dangerous however, and I don't want shellshock to become a class of bug rather than an instance of toxic environment variables. We're going to have to rip this feature out and re-implement large portions of functionality.

The author is right that this is a product of bash being written in a more trusting time. This is not the first nor the last time the 1970s security models will come back to bite us.

edit: forgot reference:

[1] http://seclists.org/oss-sec/2014/q3/741

edited to add:

Also, Apache does have a mechanism to filter out toxic environment variables; headers are added as HTTP_HEADER_NAME, because it's generally the names of environment variables that allow them to be dangerous and not their content. Executing code as a result of parsing the value of an environment variable with no special meaning is a vulnerability.
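A rough sketch of that naming rule, as CGI describes it: the header name is upper-cased, "-" becomes "_", and "HTTP_" is prepended. The function names and sample headers are invented for the example, and a real server does more than this (it also withholds some headers entirely).

```python
def cgi_variable_name(header_name):
    """Map an HTTP header name to its CGI meta-variable name."""
    return "HTTP_" + header_name.upper().replace("-", "_")


def headers_to_environment(headers):
    """Build the HTTP_* part of a CGI environment from request headers."""
    return {cgi_variable_name(name): value for name, value in headers.items()}


# Hypothetical request headers, including one named after a dangerous variable.
request_headers = {
    "User-Agent": "baz",
    "LD-PRELOAD": "/tmp/evil.so",
}
cgi_env = headers_to_environment(request_headers)
print(cgi_env)
```

A client can pick the header name, but it can never produce a variable called LD_PRELOAD or PATH this way; only the value is under its control.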

vertex-four · on Sept 27, 2014

> But I don't think this author knows how deeply the bug runs, either; the most recent way to exploit it is to export an environment variable of, say, ls to a bash function.

If you can set arbitrary environment variables, you're pwned and have always been pwned. You can set all manner of interesting things, including LD_PRELOAD, to control the execution environment and potentially execute arbitrary code.

EDIT: Putting random data in an environment variable where you pick the name should always be secure, though, which is an assumption that most of *nix makes.

FooBarWidget · on Sept 27, 2014

But the problem with Shellshock isn't random environment variables. It's random environment variable VALUES in well-defined environment variable names. It's pretty well-known that there are certain dangerous environment variables (like PATH, LD_PRELOAD) that should not be blindly set. But CGI only sets CGI environment variables like PATH_INFO, as well as HTTP_*. That even these can be dangerous, because bash executes code on any environment variable, is completely unexpected.

pbhjpbhj · on Sept 27, 2014

Is this really a loose typing issue: we give Bash data that should be of type "display text" (a sub-type of string I suppose) and it treats that data as type "executable command" (also a sub-type of string).

Would it be possible to wrap|tag input to bash so that only when a program|script sets the env variable with string that's typed as "executable" does bash even think of exec-ing it. I guess that removes some of the hack-ability and would need major rewriting of bash.

I'm a layman trying to do CS ... what could possibly go wrong!

vertex-four · on Sept 27, 2014

The issue is that the environment isn't a bash-specific thing. Anything can and regularly does set environment variables, and there's no space in there to set a flag for "this is executable" - if it's in the value, anything can set that flag, and the problem here is triggered by programs setting environment variables from external data.

rst · on Sept 27, 2014

Plenty of other shells support backticks without the "export -f" magic. They must, as backticks are mandated POSIX behavior; few support "export -f" at all. (And at least one that did, the old Bell Labs post-v7 "Research Unix" shell, used only environment variables with embedded characters which couldn't easily be created by normal means, to avoid the risk of "magic processing" on things like TERM and HTTP_FOO.)

userbinator · on Sept 27, 2014

> used only environment variables with embedded characters which couldn't easily be created by normal means

    SOMEVAR="`cat some_binary_file`"

mzs · on Sept 27, 2014

No not like that, something more like this:

    ()SOMEVAR@%=...

You will get a parse error. There is little more than [a-zA-Z0-9_] you can use in identifiers (except bash adds a few more, grrrr). You can probably pull it off with /usr/bin/env though.

mzs · on Sept 27, 2014

"the most recent way to exploit it is to export an environment variable of, say, ls to a bash function."

Even before the redhat patch you would need something to set echo=() { ... but how will an attacker do that when they can only set something like HTTP_USER_AGENT=() { ... ? See how overriding a builtin is not and never was a vulnerability?

danielparks · on Sept 27, 2014

I agree with all your points.

I think the real bug is that all this stuff calls out to a shell at all. Sure, it's convenient, but it's basically eval().

cnvogel · on Sept 27, 2014

There are two things to differentiate, in my opinion.

In most cases, the shell is just used to find programs in the PATH when a C programmer uses system(). And for that case, which is probably 99% of the time when /bin/sh is being invoked, it would make perfect sense to implement this with something that exhibits less attack surface.

Taking the "dhcp-exploit" as an example (set a DHCP option on your server to "(){...}; exploit;"), I think it's less clear: Implementing the functionality of updating configuration files according to the DHCP options sent is a perfectly reasonable place to use a script written in sh/ksh/bash! It's easy to implement by any sysadmin, works very reliably with a little care, and performance-wise it's not critical at all.

And regardless of the language you implement it in: There's some place where user-input has to be sanitized, but up to now, it was considered common knowledge that arbitrary data in an environment variable is safe as long as the variable's name adheres to some convention (prefix them all with PROGNAME_...). And bash doesn't respect this convention by looking at variable CONTENT, even though I'm pretty sure that it was already established when the bash-project started... (see, for example, handling of "special" variables like LD_xxx in suid programs or the dynamic linker)

asveikau · on Sept 27, 2014

> when a C programmer uses system()

I said it in another thread but this is almost always a mistake. The execve family is much less ambiguous about what gets passed to the program. Using it avoids this type of bug by not putting the shell where it doesn't need to be.

panzi · on Sept 27, 2014

And it's not limited to C. E.g. I would be in favor of removing os.system from Python (replacing it with subprocess.call). The `-syntax (backtick-syntax) in Ruby is particularly evil. It's so convenient because it is so concise, but I guarantee you that it is the source of a lot of vulnerabilities. It should be removed ASAP. I think that's kind of a theme in Ruby: is it convenient? Then put it in. But I would have expected more from Python.

meowface · on Sept 27, 2014

subprocess.call is also vulnerable to this, though. It calls out to bash.

panzi · on Sept 30, 2014

Vulnerable to what? The environment variable problem? I was talking about program argument parsing. os.system("ls %s" % foo) != subprocess.call(["ls", foo])
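A small sketch of that difference, assuming a POSIX /bin/sh is available. The value of foo and the helper names are made up; the list form hands the value over as one argument, while the string form lets the shell parse it.

```python
import subprocess
import sys

# Hypothetical untrusted value; the name `foo` follows the comment above.
foo = "x; echo injected"

# Child program that just reports the arguments it received.
SHOW_ARGS = "import sys; print(sys.argv[1:])"


def run_with_list(value):
    """Pass the value as one argv entry; no shell parses it."""
    result = subprocess.run(
        [sys.executable, "-c", SHOW_ARGS, value],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def run_with_shell(value):
    """Interpolate the value into a command line parsed by /bin/sh."""
    result = subprocess.run(
        "echo %s" % value,
        shell=True, capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


list_output = run_with_list(foo)
shell_output = run_with_shell(foo)
print(list_output)   # the child saw a single argument
print(shell_output)  # the shell ran two commands
```

In the second case the ";" ends the echo command and the rest of the value runs as a command of its own.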

meowface · on Oct 1, 2014

Ah, I misunderstood then. I agree with you on that point. I assumed you were talking about "Shellshock".

Redoubts · on Sept 27, 2014

I believe you would need to explicitly pass shell=True for that though.

meowface · on Sept 28, 2014

Nope, it's not necessary. Test it with a vulnerable CGI app and call:

    subprocess.call(["date"])

Or if bash is not your default shell:

    subprocess.call(["bash", "-c", "date"])

tokenizerrr · on Sept 27, 2014

Did you read the next sentence?

> And for that case, which is probably 99% of the time when /bin/sh is being invoked, it would make perfect sense to implement this with something that exhibits less attack surface.

asveikau · on Sept 27, 2014

I did. I did not find it explicit enough. There was no specific recommendation, for example. Moreover seeing the phrase "when a C programmer uses system()" is pretty jarring. There aren't enough warnings you can add to that to convey how much this gets misused and what a bad idea it usually is.

To me, use of system() is very indicative that you need to find another C programmer. There are few other answers to complete the phrase "when a C programmer uses system()".

cnvogel · on Sept 29, 2014

Well... that's a pretty drastic reasoning, leaving aside all weighing of facts. Does it also apply to a Haskell programmer running System.Process? ;-)

The fact is: system() and all its relatives (popen comes immediately to mind, there are doubtlessly 100 others) have been used, will be used, by 'incompetent' programmers[+] and as long as no other method is as widely established (and: even taught in introductory textbooks), we better provide a workaround that closes most of the holes.

[+] or just programmers weighing the merits of having a parser supporting variable and home-directory expansion, courtesy of /bin/sh -c, right built in, which is completely adequate for many tasks. And yes, I know the limitations of it, and would not use it myself most of the time.

spc476 · on Sept 28, 2014

Yes, it isn't that hard to use exec*() to execute a single program, but it gets rather messy if you want to execute a series of piped commands.

Also another function to worry about is popen().
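For comparison, in a higher-level language the piped case is less messy. Below is a minimal Python sketch of `first | second` with no shell involved; the two stages are stand-ins written in Python so the example does not depend on particular system utilities, and run_pipeline is a name invented here.

```python
import subprocess
import sys

# Stand-in stages; real code would name its own programs here.
PRODUCER = [sys.executable, "-c", "print('alpha'); print('beta')"]
CONSUMER = [
    sys.executable, "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]


def run_pipeline(first, second):
    """Run `first | second` with no shell and return the second's stdout."""
    producer = subprocess.Popen(first, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(
        second, stdin=producer.stdout, stdout=subprocess.PIPE, text=True
    )
    # Close our copy so the producer gets SIGPIPE if the consumer exits early.
    producer.stdout.close()
    output, _ = consumer.communicate()
    producer.wait()
    return output


pipeline_output = run_pipeline(PRODUCER, CONSUMER)
print(pipeline_output, end="")
```

Each stage still receives its arguments as a list, so nothing in them is re-parsed by a shell.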

panzi · on Sept 30, 2014

Look what I made you: https://github.com/panzi/pipes

philh · on Sept 27, 2014

How is it different to set some environment variables and then call out to a shell script, versus to set some environment variables and then call out to a perl script, or a binary compiled from C?

TeMPOraL · on Sept 27, 2014

It's not. It's the "calling out" part that is wrong.

You should never call out to anything by passing untrusted user input directly. Any information that came from the outside must be explicitly passed as data through proper serialization mechanisms.

For instance, you don't piece your SQL queries together by concatenating strings. You use an abstraction layer, in which you code the query structure and you pass user input as data. There is this extra step of saying "this is data, not code" that strips the external input of executability.

(for the same reasons, if your templating engine is just concatenating strings and not building the page out of trees, you're doing it wrong, but it's a topic for another day)

It's a problem you get when you believe in "the Unix way" a bit too much. Yes, everything is text, but no, not everything has the same semantics.

philh · on Sept 27, 2014

> You should never call out to anything by passing untrusted user input directly.

So if I call a CGI script with parameters foo=bar, what data should apache pass to the handler, if not something along the lines of the string "foo=bar"? When I pass the header "User-Agent: baz" and the handler asks for the user-agent, what should it be told if not "baz"?

Environment variables are data, not code. When apache executes a cgi script, whether it's C or perl or shell, it makes the user input available as data in defined locations.

There's a bug in bash which causes some of that data to be executed, but there's no way to protect against that class of bug.

This isn't a case of "you should have protected against sql injection attacks". It's a case of: there is a bug in your sql server, such that the query "select * from Users where username='rm -rf /'" will execute "rm -rf /".

Confusion · on Sept 27, 2014

The point of the OP is that if a program has chosen bash to be handler of untrusted user data, then the program has made the wrong choice, because bash is clearly (hindsight!, I'm not claiming I wouldn't have made the same choice) not designed for that purpose. A handler for untrusted user data should be a program specifically designed for that purpose, which should receive the data directly.

Similarly, if a Ruby or Perl script decides to call out to bash with untrusted user data, it's their mistake to trust bash with it, not bash's mistake that it wasn't designed for that use case.

It's perfectly possible to protect against this attack: don't call a generic program with untrusted user data.

clarry · on Sept 27, 2014

So ruby and perl are specifically designed to be a handler of untrusted data?

How do I know what other programs are designed for such a task? What's a "generic program"? In this day and age, it is expected that pretty much all software ought to be designed with security in mind (not that it always is). Because any piece of "generic software" (or just software) is otherwise going to be exploited. Especially on platforms where double-clicking a file is the expected way to open it.

More importantly, the point we are making is that we're not expecting bash to "handle" anything. It gets some data. It's not supposed to do anything with it on its own. Period.

Ogre · on Sept 27, 2014

> So ruby and perl are specifically designed to be a handler of untrusted data?

Perl actually is when used in taint mode. http://perldoc.perl.org/perlsec.html

areyousure · on Sept 27, 2014

Yes and no. You can still unintentionally call out to bash if you, say, protect your PATH:

  $ x='() { :;}; echo vulnerable'  perl -t -le'$ENV{PATH}="/bin";print `:;date`'
  vulnerable
  Sat Sep 27 10:51:12 PDT 2014

scintill76 · on Sept 27, 2014

Yeah, I hope I never have occasion to walk through an undocumented minefield, I mean collection of "features", designed by this person.

I say this without animosity to bash devs. I think some blame can be shared. But putting it all on people you expect to understand under-documented behavior and "implementation details" in every possible version of every possible flavor of /bin/sh is madness.

kcbanner · on Sept 27, 2014

I believe this just applies to bash, not sh?

scintill76 · on Sept 27, 2014

On some systems, they are one and the same. /bin/sh is often symlinked to /bin/bash, which is making this so exploitable. /bin/sh is invoked by system(), popen(), etc., and referenced in script "shebangs" (#!/bin/sh at top), so I meant that nobody necessarily knows what "flavor" of /bin/sh they're going to get.
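A quick way to see which flavor a given machine hands out is to resolve the symlink. This is only a sketch: resolve_shell is a name invented here, and the result differs from system to system.

```python
import os


def resolve_shell(path="/bin/sh"):
    """Return the real file behind a shell path, following symlinks."""
    return os.path.realpath(path)


real_shell = resolve_shell()
print("/bin/sh ->", real_shell)
```

If the printed target is bash, then everything that goes through system() or popen() on that machine ends up in bash as well.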

nailer · on Sept 27, 2014

There are methods of IPC other than shell variables. The shell is a known insecure environment, which is why there are limits on setuid for shell scripts.

By letting everyone on the Internet set shell variables, Apache and whatever DHCPd (ISC?) did something they could have known would have bad consequences whether this feature/bug existed or not.

The only data Apache needs to control is Apache's.

icebraining · on Sept 27, 2014

From what I understand, Apache doesn't send them to bash. It sends them to whatever binary is configured to handle the request (using CGI), which was then calling bash unbeknown to Apache (but implicitly passing the same environment variables).

ygra · on Sept 27, 2014

Lots of functions that start another process actually start a shell and hand it a command line to execute, e.g. system or popen. The convenience in that case is that you don't need your own handling of $PATH or wildcards or argument parsing. It's pretty standard on UNIXoid systems.

asveikau · on Sept 27, 2014

> The convenience in that case is that you don't need your own handling of $PATH

You don't with execvp or execlp either.

> or wildcards or argument parsing.

IMO this is of dubious value from, say, a C program. Why "parse" the args? Just generate a list...

ahy1 · on Aug 13, 2014

I doubt DEC had anything to do with GEM. It was a product of Digital Research (same company that gave us CP/M, MP/M and DR-DOS)

glurgh · on Aug 13, 2014

They didn't, I just brainfarted. You know, http://vt100.net/dec/alpha_era_logo_small.png and all that.

ahy1 · on May 11, 2014

I wish for a standard #once directive. It should be very simple to implement, increase preprocessing speed and reduce the size of visually disturbing boilerplate in header files.

ahy1 · on May 9, 2014

A combined REPL and editor for C# looks useful. I will try it next time I have some spare time.

The name is confusing. A C shell already exists (http://en.wikipedia.org/wiki/C_shell). An alternative name could be CSharpShell. Btw, while googling that name, I found CsharpRepl, which seems to be a somewhat similar tool.

fish2000 · on May 9, 2014

These tools straddle some Venn circles, so the developer pitch can be confusing.

For example, I make use of both ipython and bpython, both of which I refer to as either shells or REPLs; though neither program is a proper shell (in the /bin/chsh sense), and though people also call them “interpreters” (technically, the python interpreter still interprets), “environments” (vague) or even “IDEs” (wat?) – the concept behind the tools is popular and well-understood.

Personally I think it’s particularly funny to call these Enhanced REPL Shell Interpreter Environments (or what have you) “IDEs” and lump them in with Eclipse or Visual Studio or those other behemoth coding tools; I like bpython and ipython for the myriad ways they are un-Eclipse-y, and if I did C# I would presumably get into CSharp for the same reasons. All of which, like many bicycle-shed innovations, are mere matters of taste.

ahy1 · on April 29, 2014

> Most C++ code uses switch frequently, usually without taking advantage of fallthrough.

I am not so sure about this. Thinking back on my uses of switch in C and C++, I am not able to remember using switch without taking advantage of fallthrough. Maybe I am just not a typical C++ programmer...

dbaupp · on April 29, 2014

Rust still offers an equivalent to things like

  switch (x) {
  case 0: case 1: foo(); break;
  case 2: case 3: bar(); break;
  default: baz();
  }

in the form of

  match x { 
      0 | 1 => foo(),
      2 | 3 => bar(),
      _ => baz()
  }

In any case, Rust's match is so much more powerful than switch, offering (nested) pattern matching like Haskell's `case`.

DerekL · on April 29, 2014

There are two different kinds of fallthrough in C++. The most common is using the same code for multiple values. Rust already supports this by allowing multiple values and ranges of values for each pattern.

The much rarer use of fallthrough is executing code for one case and then continuing on to execute the code for the next case. In fact, in my large Android application, I turned on warnings for this type of fallthrough, and out of hundreds of switch statements, only eight used it, and all but one were mistakes.

ahy1 · on April 4, 2014

So his political view made him unsuitable for a job. I am really surprised about this from a company and a foundation I associated with openness and concern about freedom.

ahy1 · on Feb 10, 2014

You could add the option to filter by programming language, to avoid seeing repositories using languages the potential contributor doesn't know or has no interest in.

droidlabour · on Feb 10, 2014

Yeah that's around my head. Filtering by language will play a pretty vital role.

srajbr · on Feb 10, 2014

We will soon push these changes. Thanks for your suggestion :)

ahy1 · on Feb 7, 2014

> Wow. That's full of falsehoods like these:

> "What happens is that variable i is converted to unsigned integer." No: 'long i' is converted to 'unsigned long'.

Actually, unsigned long is an unsigned integer. He didn't write unsigned int.

> "Usually size_t corresponds with long of given architecture." No: For example, on Win64 size_t is 64 bits whereas long is 32 bits.

"Usually" is the keyword here. He could have said "Usually size_t has at least the same amount of bits as long" and it would be better related to the referred rule.
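Anyone who wants to check their own platform can do so with a few lines. This is a sketch using Python's ctypes to report the C widths; integer_widths is a name invented here, and the numbers depend on the platform it runs on.

```python
import ctypes


def integer_widths():
    """Return the width in bits of long and size_t on this platform."""
    return {
        "long": ctypes.sizeof(ctypes.c_long) * 8,
        "size_t": ctypes.sizeof(ctypes.c_size_t) * 8,
    }


widths = integer_widths()
print(widths)
```

On Win64 this should show the mismatch mentioned in the quote, a 32-bit long next to a 64-bit size_t.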

zurn · on Feb 7, 2014

The "short" long is fantastically tasteless; it's not like source compatibility with Win16 has resulted in 64-bit builds of apps appearing.