Fastest Inference for Generative AI