fn reduce_wide(wide: &[u64; 8], modulus: &[u64; 4]) -> [u64; 4]
Reduces the 512-bit product wide modulo p (the 256-bit modulus argument) via shift-and-subtract.
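The shift-and-subtract idea can be sketched as follows. This is a minimal illustration under stated assumptions, not necessarily the original implementation. It assumes little-endian limb order (index 0 is the least significant limb) and a nonzero modulus. The helpers ge and sub_assign are hypothetical names introduced here. The remainder starts at zero. For each bit of the input, from most significant to least, the remainder is doubled, the bit is brought in, and p is subtracted once if the result is at least p.

```rust
/// Sketch of shift-and-subtract reduction.
/// Assumption: limbs are little-endian (index 0 = least significant).
fn reduce_wide(wide: &[u64; 8], modulus: &[u64; 4]) -> [u64; 4] {
    assert!(modulus.iter().any(|&l| l != 0), "modulus must be nonzero");
    let mut r = [0u64; 4];
    // Walk the 512 input bits from most significant to least significant.
    for i in (0..512).rev() {
        let bit = (wide[i / 64] >> (i % 64)) & 1;
        // r = (r << 1) | bit. After the loop, `carry` holds the bit
        // shifted out of the top limb (bit 256 of the doubled value).
        let mut carry = bit;
        for limb in r.iter_mut() {
            let next = *limb >> 63;
            *limb = (*limb << 1) | carry;
            carry = next;
        }
        // Before the shift r < p, so the doubled value is below 2p and
        // a single subtraction restores r < p. When carry is set, the
        // wrapping subtraction still yields the correct 256-bit result.
        if carry == 1 || ge(&r, modulus) {
            sub_assign(&mut r, modulus);
        }
    }
    r
}

/// Hypothetical helper: returns true if a >= b (little-endian limbs).
fn ge(a: &[u64; 4], b: &[u64; 4]) -> bool {
    for i in (0..4).rev() {
        if a[i] != b[i] {
            return a[i] > b[i];
        }
    }
    true
}

/// Hypothetical helper: a -= b modulo 2^256, propagating borrows upward.
fn sub_assign(a: &mut [u64; 4], b: &[u64; 4]) {
    let mut borrow = false;
    for i in 0..4 {
        let (d1, b1) = a[i].overflowing_sub(b[i]);
        let (d2, b2) = d1.overflowing_sub(borrow as u64);
        a[i] = d2;
        borrow = b1 || b2;
    }
}
```

For example, calling reduce_wide with wide = [10, 0, 0, 0, 0, 0, 0, 0] and modulus = [7, 0, 0, 0] returns [3, 0, 0, 0]. The loop runs 512 iterations regardless of input, but the conditional subtraction branches on the data, so this sketch is not constant-time.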