This technique (called speculative decoding) has become essential for enterprises trying to reduce inference costs and ...