
The disassembly for both versions of the file was a 100% match, though, which is a pretty good indicator that the binary difference must be something unimportant and not related to the actual code.


It's possible that the difference is in the actual code: x86 machine code sometimes has multiple different encodings for the same assembler instruction. These would show up as differences of a byte or two in the binary while the disassembly stays identical. See http://www.strchr.com/machine_code_redundancy for some examples. Compilers might use this to steganographically hide additional data in the binary. Printer manufacturers already do something similar (https://en.wikipedia.org/wiki/Printer_steganography). I wouldn't be surprised if MS compilers embedded hidden data in binaries -- it could be useful to track down malware authors, or to identify software created with pirated MS tooling.
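To make the encoding redundancy concrete, here is a minimal sketch (not anything a real compiler is known to do) of how one hidden bit could be carried by a register-to-register `add`. It relies on the fact that x86 has two opcodes for 32-bit ADD, 0x01 (`ADD r/m32, r32`) and 0x03 (`ADD r32, r/m32`), which describe the same operation when both operands are registers. The function names `encode_add` and `decode_hidden_bit` are made up for illustration.

```python
# Register numbers as used in the ModRM byte (32-bit general-purpose registers).
REGS = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
        "esp": 4, "ebp": 5, "esi": 6, "edi": 7}

def encode_add(dst, src, hidden_bit):
    """Encode `add dst, src` in one of two equivalent forms (hypothetical helper)."""
    d, s = REGS[dst], REGS[src]
    if hidden_bit == 0:
        # Opcode 0x01: ADD r/m32, r32 -> ModRM reg field = src, rm field = dst
        return bytes([0x01, 0xC0 | (s << 3) | d])
    # Opcode 0x03: ADD r32, r/m32 -> ModRM reg field = dst, rm field = src
    return bytes([0x03, 0xC0 | (d << 3) | s])

def decode_hidden_bit(code):
    """Recover the hidden bit from the opcode choice (hypothetical helper)."""
    return 0 if code[0] == 0x01 else 1

# Both byte sequences disassemble to `add eax, ebx`.
print(encode_add("eax", "ebx", 0).hex())  # 01d8
print(encode_add("eax", "ebx", 1).hex())  # 03c3
```

A disassembler prints the same text for both outputs, so only a byte-level comparison would notice the difference.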


Perhaps this could even be used by an organization to differentiate official binaries of closed-source software from leaked binaries.


If you read the whole article, you'd see that every single bit difference is accounted for; none of the bit differences occur in the actual code.


Yes, that is a good step towards a fully deterministic build. There are ways to manipulate the PE file, though, to achieve slightly different behaviour. If you want to be thorough and make sure, for example, that your binary was not infected by malware, then comparing the binaries themselves makes sense.
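As a rough sketch of what comparing the binaries can look like, the snippet below lists the offsets at which two byte strings differ. The helper name `diff_offsets` is hypothetical, and a real check would still need to map each offset back to a PE header field or section to decide whether the difference matters.

```python
def diff_offsets(a, b):
    """Return every offset where byte strings a and b differ (hypothetical helper)."""
    offsets = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    # Bytes past the end of the shorter input count as differences too.
    offsets.extend(range(min(len(a), len(b)), max(len(a), len(b))))
    return offsets

# In practice the inputs would come from open(path, "rb").read().
old = bytes([0x01, 0xD8, 0x90])
new = bytes([0x03, 0xC3, 0x90])
print(diff_offsets(old, new))  # [0, 1]
```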



