Improve Latency and Throughput with Weight-Activation Quantization in FP8