Huawei’s Zurich Computing Systems Laboratory has released SINQ (Sinkhorn Normalization Quantization), an open-source quantization method that reduces the memory requirements of large language models ...
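
As a rough illustration of why quantization lowers memory requirements, the sketch below estimates the storage needed for a model's weights at different bit-widths. It is generic arithmetic, not SINQ itself: the function name `weight_memory_gib`, the 7-billion-parameter model size, and the chosen bit-widths are all hypothetical values picked for illustration, not figures from the release.

```python
def weight_memory_gib(num_params: int, bits_per_weight: float) -> float:
    """Approximate storage for model weights alone, in GiB.

    Ignores activations, KV cache, and the per-group scale/offset
    metadata that real quantized formats also store.
    """
    return num_params * bits_per_weight / 8 / 1024**3


# Hypothetical model size, assumed for illustration only.
NUM_PARAMS = 7_000_000_000

for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: {weight_memory_gib(NUM_PARAMS, bits):.2f} GiB")
```

Under these assumptions, going from 16-bit to 4-bit weights cuts weight storage to a quarter. In practice the saving is somewhat smaller, because quantized formats carry extra scaling metadata alongside the low-bit values.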