
It's true that, semantically, git commits store the whole tree, but doing that naively would be inefficient. Instead, packfiles store some objects as deltas against other objects, so changing the original object's contents could result in either inconsistencies or noticeable knock-on changes.


While that's true, I'd be very surprised if git delta-compressed the commit objects themselves. Changing a commit to point at a different tree wouldn't impact the delta-compressed packing of any file blobs; it would just change which file the commit ultimately points to.
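To make the layering concrete, here is a minimal Python sketch of how loose object ids are derived (SHA-1 over a `<type> <size>\0` header plus the body). The file name, author line and commit messages are made up for illustration, and the tree encoding is simplified to flat files only. It shows that a blob's id depends only on the blob's bytes, so pointing a commit at a different tree leaves every existing blob untouched.

```python
import hashlib

def git_object_id(obj_type: bytes, body: bytes) -> str:
    """SHA-1 over the header '<type> <size>' + NUL, followed by the body."""
    header = obj_type + b" " + str(len(body)).encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()

def tree_body(entries):
    """Encode (mode, name, object_id) entries; simplified to flat files only."""
    out = b""
    for mode, name, oid in sorted(entries, key=lambda e: e[1]):
        out += mode.encode() + b" " + name.encode() + b"\0" + bytes.fromhex(oid)
    return out

def commit_body(tree_id: str, message: str) -> bytes:
    # Hypothetical fixed author and timestamp so the ids are reproducible.
    who = "A U Thor <author@example.com> 0 +0000"
    return f"tree {tree_id}\nauthor {who}\ncommitter {who}\n\n{message}\n".encode()

# Two versions of the same file: the original and a tampered one.
f2 = git_object_id(b"blob", b"original contents\n")
f2_evil = git_object_id(b"blob", b"malicious contents\n")

# Each tree names exactly one blob; each commit names exactly one tree.
t2 = git_object_id(b"tree", tree_body([("100644", "file.txt", f2)]))
t2_evil = git_object_id(b"tree", tree_body([("100644", "file.txt", f2_evil)]))
c2 = git_object_id(b"commit", commit_body(t2, "second"))
c2_evil = git_object_id(b"commit", commit_body(t2_evil, "second"))

# The original blob id is the same no matter which commit exists.
print("F2 :", f2)
print("C2 :", c2)
print("C2':", c2_evil)
```

Without a hash collision, `c2` and `c2_evil` differ, which is exactly what makes ordinary tampering visible. The scenario under discussion assumes an attacker who can make those two ids equal.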

For example, suppose you started with a commit graph that looked like this:

    C1 --- C2 --- C3
     \      \      \
      T1     T2     T3
       \      \      \
        F1 - - F2 - - F3
Where C1, C2 and C3 are commits; T1, T2 and T3 are the trees they reference; and F1, F2 and F3 are three versions of a file blob stored delta-compressed in your packfile. Then if you had a malicious version of C2 with the same hash you could replace C2 with a new commit C2' pointing at a new tree T2' with a new file object F2', and nothing would break. The resulting commit graph would look like this, and F1, F2 and F3 would all still be in your packfile delta-compressed and accessible, just with nothing referencing T2/F2:

    C1 --- C2'--- C3
     \      \      \
      T1     T2'    T3
       \      \      \
        F1 - - \  - - F3
                \
                 F2'
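
The swap in the two diagrams can be sketched as a toy object store in Python. This is only a model: the keys are plain labels standing in for hashes (a real SHA-1 collision can't be produced here), the blob contents are invented, and delta compression is not modeled at all. Replacing the value under the key `C2` plays the role of a colliding commit C2'.

```python
# Toy object store: keys stand in for object ids, values for parsed objects.
store = {
    "C1": {"type": "commit", "tree": "T1", "parents": []},
    "C2": {"type": "commit", "tree": "T2", "parents": ["C1"]},
    "C3": {"type": "commit", "tree": "T3", "parents": ["C2"]},
    "T1": {"type": "tree", "entries": {"file": "F1"}},
    "T2": {"type": "tree", "entries": {"file": "F2"}},
    "T3": {"type": "tree", "entries": {"file": "F3"}},
    "F1": {"type": "blob", "data": "v1"},
    "F2": {"type": "blob", "data": "v2"},
    "F3": {"type": "blob", "data": "v3"},
}

def checkout(store, commit_id):
    """Resolve commit -> tree -> blobs and return {file name: contents}."""
    tree = store[store[commit_id]["tree"]]
    return {name: store[oid]["data"] for name, oid in tree["entries"].items()}

def reachable(store, tip):
    """Collect every object id reachable from the given commit."""
    seen, stack = set(), [tip]
    while stack:
        oid = stack.pop()
        if oid in seen:
            continue
        seen.add(oid)
        obj = store[oid]
        if obj["type"] == "commit":
            stack.append(obj["tree"])
            stack.extend(obj["parents"])
        elif obj["type"] == "tree":
            stack.extend(obj["entries"].values())
    return seen

# Add the malicious tree and blob, then overwrite C2 under the SAME key,
# which models a commit C2' whose hash collides with C2.
store["T2'"] = {"type": "tree", "entries": {"file": "F2'"}}
store["F2'"] = {"type": "blob", "data": "malicious"}
store["C2"] = {"type": "commit", "tree": "T2'", "parents": ["C1"]}

unreferenced = set(store) - reachable(store, "C3")
print(checkout(store, "C2"))   # now resolves through T2' to F2'
print(sorted(unreferenced))    # the old T2 and F2 are present but orphaned
```

C3 still names its parent by the same key, so history walks keep working, and F1, F2 and F3 all remain in the store.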
Regardless, this is all moot to some extent. The attack most everyone talks about is that if you were in control of a central git repository (for example, if you were hosting a mirror of an open source repository), you could give two different versions of that repository to different people without them being able to tell, even if they were checking PGP signatures or referencing specific git hashes. For example, you could serve the non-malicious files to human developers, and when a user-agent that looks like a CI/CD pipeline (Jenkins, say, or an Ubuntu/Debian/RedHat packager's build machine) clones the repository to build a specific hash requested by the user, you could give it a malicious version of the source tree that builds a backdoor into the binaries it creates. In this sort of attack you never have to "change" a git object on someone's machine, which the git protocol isn't designed to do because it never happens in normal use.
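
As a rough illustration of why pinning a hash doesn't help once the hash function is broken, here is a Python sketch that uses a deliberately weakened id (the first 16 bits of SHA-1) as a stand-in for a broken hash. Everything in it is hypothetical: the commit text, the tree names, the `serve` function and the user-agent check. It also finds a second preimage by brute force rather than a true collision, which is only feasible because the id is so short.

```python
import hashlib
from itertools import count

def weak_id(body: bytes) -> str:
    """Deliberately broken id: only the first 4 hex digits (16 bits) of SHA-1."""
    return hashlib.sha1(body).hexdigest()[:4]

def find_matching_commit(target_id: str, tree_name: str) -> bytes:
    """Vary a trailing nonce until the weak id equals target_id."""
    for nonce in count():
        body = f"tree {tree_name}\n\nrelease\nnonce {nonce}\n".encode()
        if weak_id(body) == target_id:
            return body

good = b"tree good-tree\n\nrelease\n"
pinned = weak_id(good)                        # the id a user pins or signs
evil = find_matching_commit(pinned, "evil-tree")

def serve(user_agent: str) -> bytes:
    """Hypothetical malicious mirror: build machines get the evil commit."""
    return evil if "jenkins" in user_agent.lower() else good

def verify(body: bytes, pinned_id: str) -> bool:
    """All a client can check: does the received object hash to the pinned id?"""
    return weak_id(body) == pinned_id

for agent in ("Mozilla/5.0", "Jenkins-CI"):
    body = serve(agent)
    print(agent, verify(body, pinned), body.splitlines()[0].decode())
```

Both clients see the verification succeed, yet they receive commits pointing at different trees, and neither ever has an object replaced locally.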



