It is November 2018. I was searching for articles on linux security and came across reproducible builds here :- https://www.privateinternetaccess.com/blog/2018/07/reproducible-builds-solving-an-old-open-source-problem-to-improve-security/ “Inconsistent Compiling Environments Lead to Inconsistent Behavior. different compilers will produce different machine code, even from identical source code. These inconsistencies are one of the many sources of unreliable software, because these different compiled applications all have different behavior and different introduced problems. An app that behaves perfectly when compiled one way may not work at all with another compiler. Even worse, because of all of the different combinations of source code, hardware, compiler versions, settings, and other environmental factors, it becomes extremely hard to identify if your software is fundamentally secure.
Is this source code compromised? Another problem introduced by inconsistent software is that it is impossible to tell if the software that is coming out of your compiler app is exactly what the author intended. A malicious compiler app could create machine code that has intentional vulnerabilities and enable surveillance of systems or outright takeover of a secure system. The malicious compiler problem has been one that has been discussed since the 1980’s. Ken Thompson famously made a proof-of-concept compiler hack that was not only itself compromised, but it would compromise every update to itself and be virtually impossible to detect. http://wiki.c2.com/?TheKenThompsonHack The most interesting piece of this proof-of-concept discusses how this wouldn’t even have to be a specific type of compiler. Other pieces of machine code, like bootloaders and the most fundamental software and firmware on a computer, could be compromised to introduce backdoors system-wide and be extremely hard to detect. This is why there are entire movements (CoreBoot and LibreBoot https://www.coreboot.org/) behind moving all of these fundamental pieces to open-source solutions. It removes the most likely place for something like this to hide.”
When I moved on and searched for reproducible builds, most of the hits were from 2017, like this:-https://www.reddit.com/r/linux/comments/6p14q0/debians_archive_is_up_to_94_for_reproducible_build/ and this:-https://news.ycombinator.com/item?id=13690703 leading me to think the initiative had stalled. But then I found the reproducible builds weekly blog:- https://reproducible-builds.org/blog/posts/187/ which is now on week 187 and going strong.
Some of the arguments against reproducible builds go along the lines of “what’s the point if the hardware is compromised?” the next link appears to be addressing some of the hardware issues:- https://puri.sm/learn/intel-me/
On one of the weekly (reproducible-builds.org) posts I found a reference to a blog post: https://puri.sm/posts/protecting-the-digital-supply-chain/ After reading this post, I discovered Purism the company, who are making motherboards with the intel me disabled, using programmable fuses: https://puri.sm/learn/intel-me/
https://www.reddit.com/r/linux/comments/540nmm/making_reproducible_builds/ :- “Nix/Guix have less packages to deal with, and although their infrastructure is already suited to r-b, most of the work for r-b is getting upstream projects to accept patches. Fixing the build environment is not enough although it gets Nix/Guix quite a lot of the way there – (1) we at Debian think this is “cheating”, I can go into this in some more detail but I thought I’d keep this post short and (2) it doesn’t work in all cases – e.g. for a build process that takes 2000 +/- 100 seconds and sphinx/doxygen is run at the end of it, embedding timestamps you can’t control. There is nothing Nix/Guix/anyone else can do about this, except to patch upstream. And most of the work we’re doing at Debian is patching upstream, which will eventually benefit everyone including Nix/Guix. Also the biggest goal for r-b is to actually have people upload attestations that “I built source X to get binary hash Y”. It is only a very marginal security improvement to “allow people theoretically reproduce the binary” because this only gives security to people who actually do rebuilds – but if we assume everyone does this, then we might as well all switch to Gentoo or some source-based distro. No, we need to distribute attestations so that even people who can’t rebuild everything can benefit. I don’t see Nix/Guix working on this; we are building background infrastructure (plus theoretical research) to eventually be able to do this.”
https://www.reddit.com/r/linux/comments/6p14q0/debians_archive_is_up_to_94_for_reproducible_build/ Debian’s Archive Is Up To 94% For Reproducible Build
So, what are the main types of non-reproducible packages in that 6%? I don’t see specifics from a skim of https://reproducible-builds.org or https://wiki.debian.org/ReproducibleBuilds , is it specific groups of packages that aren’t reliably reproducible or is it a global thing where packages just chronically have reproducibility problems every so often?
This seems to be the current list of unreproducible packages in the Debian unstable: https://tests.reproducible-builds.org/debian/unstable/index_dd-list.html
here’s a detailed article on an effort to make a large codebase reproducible http://blog.netbsd.org/tnf/entry/netbsd_fully_reproducible_builds
https://news.ycombinator.com/item?id=14834386%20:- Status update from the Reproducible Builds project (debian.org) 322 points by lamby on July 23, 2017 | hide | past | web | favorite | 86 comments :- “Guix and Nix are input-reproducible. Given the same input description (input being the source files and any dependencies) an output comes out. Builds are then looked up in a build cache based on the hash of al lathe combined inputs. However. The _output_ of Nix artifacts are not reproducible. Running the same input twice will yield a different result. https://www.reddit.com/r/NixOS/comments/2n926h/get_best_of_both_worlds_guix_vs_nixos/ Nix does some tricks to improve output reproducibility like building things in sandboxes with fixed time, and using tarballs without modification dates but output bit-by-bit reproducible is not their goal. They also don’t have the manpower for this. Currently, a build is built by a trusted builderver for which you have the public key. And you look up the built by input hash but have no way to check if the thing the builderver is serving is legit. It’s fully based on trust. However, with debian putting so much effort in reproducible output, Nix can benefit too. In the future, we would like to get rid of the ‘trust-based’ build servers and instead move to a consensus model. Say if 3 servers give the same output hash given an input hash, then we trust that download and avoid a compile from source. If you still don’t trust it, you can build from source yourself and check if the source is trustworthy. Currently, a build is built by a trusted builderver for which you have the public key. And you look up the built by input hash but have no way to check if the thing the builderver is serving is legit. It’s fully based on trust. However, with debian putting so much effort in reproducible output, Nix can benefit too. In the future, we would like to get rid of the ‘trust-based’ build servers and instead move to a consensus model. Say if 3 servers give the same output hash given an input hash, then we trust that download and avoid a compile from source. If you still don’t trust it, you can build from source yourself and check if the source is trustworthy.”
For and against argument for reproducible builds:- https://news.ycombinator.com/item?id=13690703 NetBSD fully reproducible builds (netbsd.org)
“aseipp on Feb 21, 2017 [-]
NetBSD can build the whole OS
from source tree to distribution media with a single command.
This one, at least, can be done in NixOS/Guix once you check out the source – and the Nix package manager can technically be installed on any Linux distro, too (and some other ports to Cygwin/FreeBSD/Mac etc) and run a single command to get the ISO, or any kind of build product you want. The carefully tested and maintained portability/cross-compilation is another thing though: NetBSD has fantastic support here that is not easily replicated without just doing a ton of work. So its universal basically-always-works cross compilation, everywhere – is rather unique here. You can’t build NixOS ISOs natively e.g. Nix-on-Darwin, which is rather unfortunate. ”
https://www.schneier.com/blog/archives/2006/01/countering_trus.html In 2006, Bruce Schneier blogged a pretty good breakdown of a paper by David A. Wheeler https://dwheeler.com/trusting-trust/ on defending against Thompson’s specific example. The paper itself is still paywalled as of this time, to the best of my knowledge. The Wheeler paper is very interesting, but it is focused on the bowels of compiler design, and this question seems more focused on end-user precautions than compiler design or even systems programming. There are generally two ways we understand the risk involved with compiling a specific piece of code:
1. Authenticating the code as a true, untampered-with piece of code written by someone whom we have chosen to trust. 2. Closely examining the content of the code itself, and thoroughly understanding what it does.
The second case–a thorough code audit–is a huge, long, resource-intensive task. It almost never really happens for codebases of nontrivial size, because it is simply too costly. Much more often, we are looking at the first case: trusting the coder, and validating that the code hasn’t been tampered with between the coder and the consumer. Of course, in the end, as others have pointed out, these integrity checks only do one any good if the developers’ infrastructure hasn’t been compromised, if the developer was coding well, and so on.
http://www.linux-mag.com/id/976/ compiling and linking