I also blog frequently on the Yesod Web Framework blog, as well as the FP Complete blog.

Hackage Security and Stack

February 14, 2017

Back in 2015, there were two proposals made for securing package distribution in Haskell. The Stackage team proposed and implemented a solution using HTTPS and Git, which was then used as the default in Stack. Meanwhile, the Hackage team moved ahead with hackage-security. Over the past few weeks, I've been working on moving Stack over to hackage-security (more on motivation below). The current status of the overall hackage-security roll-out is:

  • Hackage is now providing the relevant data for hackage-security (the 01-index.tar file and signature files)
  • cabal-install will move over to hackage-security in its next release
  • The FP Complete Hackage mirror is using hackage-security (and in particular Herbert's hackage-mirror-tool) to run its S3-backed mirror.
  • On the master branch, Stack defaults to using hackage-security for downloading package metadata. We may even remove support for Git-based indices entirely, but that's a discussion for another day.

One upside to this is more reliable package index download time. We have had complaints from some firewalled users of slow Git clone time, so this is a good thing. We're still planning on maintaining the Git-based package indices for people using them (to my knowledge they are still being used by Nix, and all-cabal-metadata is still used to power a lot of the information on stackage.org).

However, there's one significant downside I've encountered in the current implementation that I want to discuss.

Background

Quick summary of how hackage-security works: there is a 01-index.tar file, the contents of which I'll discuss momentarily. This is the file which is downloaded by Stack/cabal-install when you "update your index." It is signed by a cryptographic algorithm specified within the hackage-security project, and whenever a client does an update, it must verify the signature. In theory, when that signature is verified, we know that the contents of the 01-index.tar file are unmodified.

Within this file are two (relevant) kinds of files: the .cabal files for every upload to Hackage (including revisions), and .json files containing metadata about the package tarballs themselves. Importantly, this includes a SHA256 checksum and the size of the tarball. Using these already-validated-to-be-correct JSON files, we can download and verify a package tarball, even over an insecure connection.

The alternative Git-based approach that the Stackage team proposed has an almost-identical JSON file concept in the all-cabal-hashes repo. Originally, these were generated by downloading tarballs from https://hackage.haskell.org (note the HTTPS). However, a number of months back it became known that the connection between the CDN in front of Hackage and Hackage itself was not TLS-secured, and therefore reliance on HTTPS was not possible. We now rely on the JSON files provided by hackage-security to generate the JSON files used in the Git repo.

The problem

With that background, the bug is easy to describe: sometimes the .json files are missing from the 01-index.tar file. This was originally opened in April 2016 (for Americans: on tax day no less), and then I rediscovered the issue three weeks ago when working on Stack.

Over the weekend, another .json file went missing, resulting in the FP Complete mirror not receiving updates until I manually updated the list of missing index files. Due to the inability to securely generate the .json file in the all-cabal-hashes Git repo without the file existing upstream, that file is now missing in all-cabal-hashes, causing downstream issues to the Nix team.

How it manifests

There are a number of outcomes to be aware of from this issue:

  • The FP Complete mirror, and any other mirror using Herbert's tool, will sometimes stop updating if a new JSON file is missing. This is an annoyance for end users, and a frustration for the mirror maintainers. Fortunately, updating the mirror tool code with the added index isn't too heavy a burden. Unfortunately, due to the lack of HTTPS between Hackage and its CDN, there's no truly secure way to do this update.
  • End users cannot currently use packages securely if they are affected by this bug. You can see the full list at the time of writing this post.
  • Stack has had code in place to reject indices that do not provide complete signature cover for a long while (I think since its initial release). Unfortunately, this code cannot be turned on for hackage-security (which is how I discovered this bug in the first place). We can implement a new functionality with weaker requirements (refuse to download a package that is missing signature information), but ideally we could use the more strict semantics.
  • The Nix team cannot rely on hashes being present in all-cabal-hashes. I can't speak to the Nix team internal processes, and cannot therefore assess how big an impact that is.

Conclusion

Overall, I'm still very happy that we've moved Stack over to hackage-security:

  • It fixed an immediate problem for users behind a firewall, which we otherwise would have needed to work around with new code (downloading a Git repo snapshot). Avoiding writing new code is always a win :).
  • Layering the HTTPS/Git-based security system on top of hackage-security doesn't make things more secure, it just adds two layers for security holes to exist in instead of one. From a security standpoint, if Hackage is providing a security mechanism, it makes sense to leverage it directly. Said another way: if it turns out that hackage-security is completely insecure, our Git-based layer would have been vulnerable anyway since it relied on hackage-security.
  • By moving both Stack and cabal-install over to hackage-security for client access, we'll be able to test that code more thoroughly, hopefully resulting in a more reliable security mechanism for both projects to share (small example of such stress-testing).
  • Stack has always maintained compatibility with some form of non-Git index, so we've always had two code paths for index updates. As hinted at above, this change opens the door to removing the Git-based code path. And removing code is almost as good as avoiding writing new code.
  • I would still feel more comfortable with the security of Hackage if HTTPS was used throughout, if only as a level of sanity in case all else fails. I hope that in the future the connection between Hackage and its CDN switches from insecure to secure. I also hope that cabal-install is still planning on moving over to using HTTPS for its downloads.