Huawei launches AI inference innovation technology UCM

Date: August 12, 2025
On August 12, Huawei officially released UCM, a new AI inference technology. It is understood that UCM is an inference acceleration suite centered on the KV Cache. It integrates multiple cache acceleration algorithms and hierarchically manages the KV Cache data generated during inference, with the aim of expanding the inference context window, delivering high-throughput, low-latency inference, and reducing the inference cost per token. The technology has already been piloted at China UnionPay in three business scenarios, "Customer Voice," "Marketing Planning," and "Office Assistant," to accelerate AI inference in financial services, and is reported to have achieved results.
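To illustrate what hierarchical management of KV Cache data can mean in general, the following is a minimal Python sketch of a two-tier cache. It is not UCM's API or Huawei's implementation: the class `TieredKVCache`, its methods, the two tiers, and the least-recently-used demotion policy are all assumptions made for illustration.

```python
from collections import OrderedDict


class TieredKVCache:
    """Toy two-tier cache (hypothetical, not UCM): a small fast tier
    backed by a larger slow tier."""

    def __init__(self, fast_capacity):
        if fast_capacity < 1:
            raise ValueError("fast_capacity must be at least 1")
        self.fast_capacity = fast_capacity
        self.fast = OrderedDict()  # least recently used entry first
        self.slow = {}             # stand-in for a larger, slower tier

    def put(self, prefix, kv_block):
        """Store a KV block in the fast tier, demoting old entries if full."""
        self.slow.pop(prefix, None)  # keep each prefix in one tier only
        self.fast[prefix] = kv_block
        self.fast.move_to_end(prefix)
        while len(self.fast) > self.fast_capacity:
            old_prefix, old_block = self.fast.popitem(last=False)
            self.slow[old_prefix] = old_block  # demote instead of discarding

    def get(self, prefix):
        """Return the cached block, or None if it must be recomputed."""
        if prefix in self.fast:
            self.fast.move_to_end(prefix)
            return self.fast[prefix]
        if prefix in self.slow:
            block = self.slow.pop(prefix)
            self.put(prefix, block)  # promote on reuse
            return block
        return None


# Keys are token-id prefixes; values stand in for cached key/value tensors.
cache = TieredKVCache(fast_capacity=2)
cache.put((1, 2), "kv-A")
cache.put((1, 2, 3), "kv-B")
cache.put((4, 5), "kv-C")      # demotes (1, 2) to the slow tier
hit = cache.get((1, 2))        # found in the slow tier and promoted
miss = cache.get((9, 9))       # not cached anywhere
```

In this sketch a block that falls out of the fast tier is kept in the slow tier rather than dropped, so a later request with the same prefix can reuse it without recomputation. That is the general idea behind trading cheaper storage for longer contexts and lower cost per token.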