In fairness, assuming 4/8 ports in, 4 ports out, operating at some ungodly GHz using a custom GaAs or SiGe chip, and working in gearboxed scrambled space with very clever input mac prefix mapping and output scrambled/gearboxed precomputation, one _could_ do around 8ns optimistically from the start of the first 64b/66b block (the 0x55... preamble) for 10GB. There's some stuff about preamble shrink that makes it wonky, but that is doable. And interestingly enough, for 25G-R, one can comfortably do this in about 4ns. I am not in fact aware of such a beastie existing for 25G, but I have seen 1 or 2 for 10G though.
Surprisingly, if ChatGPT is prompted _juust_ right, it will even give you a good way to do this.
They also do some packet accounting and cute features off the critical path. None of that at 4ns.