Peking University introduces the first large-scale model built on complex-valued weights: 2-bit quantization, inference using only addition, and deployable on mobile phones.
Currently, large models are storage and computation intensive during inference because their weights are stored in FP16, which takes up considerable space. The Peking University team proposed the iFairy scheme, which quantizes the model weights to the complex-valued set {+1, -1, +i, -i}. Since there are only four possible values, each weight can be encoded in just 2 bits instead of FP16's 16, shrinking the stored weights to 1/8 of their original size.
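To see why such a model can run inference with only additions, consider what happens when an activation is multiplied by one of the four quantized values: multiplying a complex number a + bi by +1, -1, +i, or -i only negates and/or swaps its real and imaginary parts, so no actual multiplication is required. The sketch below is a hypothetical illustration of this idea, not the official iFairy code; the function and variable names are assumptions made for clarity.

```python
# Hypothetical sketch: 2-bit complex weights and multiplication-free "multiply".
# Not the official iFairy implementation; names and codes are illustrative.

# 2-bit codes for the four quantized weight values {+1, -1, +i, -i}
CODES = {0b00: 1 + 0j, 0b01: -1 + 0j, 0b10: 0 + 1j, 0b11: 0 - 1j}

def apply_weight(code: int, x: complex) -> complex:
    """Multiply activation x by a quantized weight using only
    negation and swapping of the real/imaginary parts."""
    a, b = x.real, x.imag
    if code == 0b00:   # weight = +1 :  x
        return complex(a, b)
    if code == 0b01:   # weight = -1 : -x
        return complex(-a, -b)
    if code == 0b10:   # weight = +i :  i*x = -b + a*i
        return complex(-b, a)
    if code == 0b11:   # weight = -i : -i*x =  b - a*i
        return complex(b, -a)
    raise ValueError("invalid 2-bit weight code")

# Sanity check: the sign-flip/swap shortcut matches true complex multiplication
x = 3.0 + 4.0j
for code, w in CODES.items():
    assert apply_weight(code, x) == w * x
```

Because every weight application reduces to sign flips and swaps, a dot product over these quantized weights becomes a sequence of additions and subtractions, and four 2-bit weights pack into a single byte, which is where the 1/8 storage figure comes from.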