Also, the build is not deterministic yet: even on his own machine he got some differences in truecrypt.sys between successive builds. By definition, then, the build is not (yet) fully deterministic:
"Using the same source and same project directory results in the same pattern of difference in the block starting at 0002CBAC, as the pattern shown between my build from the correct project directory and the original file. This means that this difference is a normal result of the compilation process, and can be considered harmless from our point of view"
The disassembly for both versions of the file was a 100% match, though, which is a pretty good indicator that the binary difference must be something unimportant and not related to the actual code.
It's still possible that the difference is in the actual code: x86 machine code sometimes has several different encodings for the same assembly instruction, so two binaries can differ and still disassemble identically.
These would show up as single-byte differences in the binary.
See http://www.strchr.com/machine_code_redundancy for some examples.
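To make the redundancy concrete, here is a minimal sketch that decodes the two register-to-register encodings of a 32-bit ADD. It assumes 32-bit operand size and only handles opcodes 01 and 03 with a register ModRM byte; the function name decode_add is just for illustration. Note that in this particular pair both bytes of the instruction differ.

```python
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def decode_add(code: bytes) -> str:
    """Decode a two-byte register-to-register 32-bit ADD (opcode 01 or 03)."""
    opcode, modrm = code[0], code[1]
    if modrm >> 6 != 0b11:
        raise ValueError("only the register-to-register form is handled")
    reg = REGS[(modrm >> 3) & 7]   # middle three bits of ModRM
    rm = REGS[modrm & 7]           # low three bits of ModRM
    if opcode == 0x01:             # ADD r/m32, r32: destination is r/m
        return f"add {rm}, {reg}"
    if opcode == 0x03:             # ADD r32, r/m32: destination is reg
        return f"add {reg}, {rm}"
    raise ValueError("not one of the two ADD opcodes handled here")

# Two different byte sequences, one and the same instruction.
print(decode_add(bytes([0x01, 0xD8])))
print(decode_add(bytes([0x03, 0xC3])))
```

Both calls give "add eax, ebx", so a disassembly listing alone would not tell the two binaries apart.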
Compilers might use this to steganographically hide additional data in the binary. Printer manufacturers already do something similar (https://en.wikipedia.org/wiki/Printer_steganography). I wouldn't be surprised if MS compilers embedded hidden data in binaries -- it could be useful for tracking down malware authors, or for identifying software created with pirated MS tooling.
Yes, that is a good step towards a fully deterministic build.
There are ways to manipulate the PE file though to achieve slightly different behaviour.
If you want to be thorough and make sure that your binary was not infected by malware, for example, then comparing the binaries makes sense.
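A rough sketch of what such a comparison could look like: walk two builds byte by byte and report the runs where they differ, so you can see whether the differences cluster in one block or are scattered through the code. The function name diff_ranges is made up here, and the sketch assumes both files fit in memory.

```python
def diff_ranges(a: bytes, b: bytes):
    """Return (start, end) offsets of the runs where a and b differ.

    end is exclusive. Only the common prefix length is compared; a size
    mismatch should be checked separately.
    """
    ranges = []
    start = None
    limit = min(len(a), len(b))
    for i in range(limit):
        if a[i] != b[i]:
            if start is None:
                start = i          # a new differing run begins here
        elif start is not None:
            ranges.append((start, i))
            start = None
    if start is not None:          # run extends to the end of the data
        ranges.append((start, limit))
    return ranges
```

To use it on real files, read each build with open(path, "rb").read() and print the ranges in hex.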
I share the author's belief it's probably benign, and on the assumption that the source code contains no crazy obfuscated magic, it'd be pretty hard to actually hide malicious code or behaviour there.
Still, if they want to be thorough (and they should), I'd have liked a bit more explanation than waving it away with a "well these bytes seem to change all the time, so that's okay then". That's not really a justification, it's just an explanation. And it's not okay.
One possible explanation is that the compiler and linker don't produce 100.00% identical output because somewhere they have uninitialized memory that gets written to disk. That's why the disassembly can still match 100.00%: the bytes that differ don't contribute to any observable functionality. If you know C, you can imagine how it happens:
Somewhere you have char buff[ 32 ] which you don't set to all zeros, then you do strcpy( buff, "something" ), then you write all 32 bytes of buff to the file. What's behind "something"? Whatever happened to be in that memory before, and it can be different on every run.
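Python can't read uninitialized memory, so the following is only a simulation of that C scenario: the leftover contents of the buffer are passed in explicitly (an assumption standing in for whatever was in memory before), and the copy mimics what strcpy does. The function name write_record is hypothetical.

```python
def write_record(leftover: bytes, text: bytes = b"something") -> bytes:
    """Simulate strcpy into a 32-byte buffer that was never zeroed."""
    if len(leftover) != 32:
        raise ValueError("the simulated buffer must be 32 bytes")
    buff = bytearray(leftover)                 # stands in for char buff[32]
    buff[:len(text) + 1] = text + b"\x00"      # strcpy copies text plus the NUL
    return bytes(buff)                         # all 32 bytes go to the file

# Two "runs" with different stale memory behind the string.
run1 = write_record(b"\xAA" * 32)
run2 = write_record(b"\x55" * 32)
```

The first 10 bytes (the string and its terminator) are identical in both runs; the remaining 22 are whatever was there before, which is the kind of difference that never shows up in a disassembly.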
See the "RSDS"? It's some initial value, just like c:\truec... that follows some bytes later. And what's behind? Some small sequence of random bytes. I've just checked and confirmed that it appears in the ".rdata" (read only data) section of the executable.
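For reference, and assuming the block is a CodeView PDB 7.0 debug record (which is what an "RSDS" signature normally marks), the layout is: 4-byte signature, 16-byte GUID, 4-byte age, then the NUL-terminated PDB path. Here is a rough sketch that splits such a record apart. The function name parse_rsds, the sample record, and the path in it are made up for illustration and are not taken from truecrypt.sys.

```python
import struct
import uuid

def parse_rsds(data: bytes, offset: int = 0):
    """Split a CodeView PDB 7.0 record into (guid, age, pdb_path)."""
    if data[offset:offset + 4] != b"RSDS":
        raise ValueError("no RSDS signature at this offset")
    guid = uuid.UUID(bytes_le=data[offset + 4:offset + 20])   # 16-byte GUID
    (age,) = struct.unpack_from("<I", data, offset + 20)      # 4-byte age
    end = data.index(b"\x00", offset + 24)                    # path terminator
    path = data[offset + 24:end].decode("ascii", errors="replace")
    return guid, age, path

# Hypothetical record, built by hand for the example.
sample = b"RSDS" + bytes(range(16)) + struct.pack("<I", 1) + b"c:\\build\\demo.pdb\x00"
guid, age, path = parse_rsds(sample)
```

Running this against the differing block would show whether the changing bytes sit in the GUID and age fields or somewhere else, which is worth checking against the explanations discussed here.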
Now, the fact that this doesn't happen in many more places also suggests that somebody at Microsoft does fully clean up the code from time to time, that is, somebody worries about such effects too, but it seems that every now and then some "late" fix slips through and misbehaves.
Another explanation would be that there is something in the build code or script that produces different values in the initialization area. That could then be found by inspecting the source.
The third explanation would be some kind of "unique id" generated by the compiler or linker. Then this effect should be observed in all binaries even when the source is fully different (e.g. compile some program which generates Vogon poetry, observe the same effect). This hypothesis doesn't match the observations, according to the pictures presented.
So I believe the first explanation is the most likely one.
It seems like if you can have the tools alter the embedded timestamp (even as a post-processing step), you can match the binaries pretty closely.
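As a post-processing step, that could look roughly like the sketch below: find the PE header through the offset stored at 0x3C and zero the 4-byte TimeDateStamp in the COFF header before comparing. The names zero_pe_timestamp and make_fake_pe are hypothetical, and the fake image exists only so the example runs on its own. Real PE files can carry further timestamps (for example in the debug directory) and a checksum, which this sketch does not touch.

```python
import struct

def zero_pe_timestamp(image: bytes) -> bytes:
    """Return a copy of a PE image with the COFF TimeDateStamp set to 0."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ header")
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)   # e_lfanew
    if image[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    patched = bytearray(image)
    # Signature (4) + Machine (2) + NumberOfSections (2) precede the stamp.
    struct.pack_into("<I", patched, pe_offset + 8, 0)
    return bytes(patched)

def make_fake_pe(timestamp: int) -> bytes:
    """Build a tiny header-only stand-in for a PE file (illustration only)."""
    image = bytearray(0x100)
    image[:2] = b"MZ"
    struct.pack_into("<I", image, 0x3C, 0x80)
    image[0x80:0x84] = b"PE\x00\x00"
    struct.pack_into("<I", image, 0x88, timestamp)
    return bytes(image)

build_a = make_fake_pe(0x12345678)
build_b = make_fake_pe(0x12345999)
```

After normalizing both builds this way, any remaining differences are ones the timestamp can't explain.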
The last bit is the signature, which you can't duplicate, but which you can also just take out, as it's not code. If you're paranoid about that, zero out or remove the signature after verifying it.