MiniMax M2.5 scores about 80% on SWE-bench and runs at nearly 100 tokens per second, helping teams deploy faster models on tighter budgets.
Sparse matrix-matrix multiplication (SpMM) is a crucial kernel in various applications, including sparse deep neural networks [1]–[6], graph analytics [7], triangle counting [8], and linear algebra ...
Abstract: Downlink precoding in massive multiple-input multiple-output (MIMO) systems involves high-dimensional sparse matrix calculations, which pose challenges for existing architectures.
Chinese AI startup MiniMax, headquartered in Shanghai, has sent shockwaves through the AI industry today with the release of ...