
TurboSparse Limitations: The Impact of 150B Token Recovery Training

5 Mar 2026

While achieving 90% sparsity, TurboSparse models were trained on only about 1% of the tokens used for Llama, with further recovery training expected.


TurboSparse: Faster LLMs via dReLU Activation

5 Mar 2026

Boost LLM speeds by 2–5x with TurboSparse. Use dReLU to reach 90% sparsity in Mistral and Mixtral models without losing performance.


dReLU Sparsification: High-Performance 90% Sparsity for Next-Gen LLMs

4 Mar 2026

Learn how this breakthrough makes large language models (LLMs) more accessible and environmentally friendly.


TurboSparse Mobile: 22x Faster Mixtral Inference on PowerInfer-2

4 Mar 2026

Learn how PowerInfer-2 leverages extreme sparsity for a 22.2x speedup over llama.cpp.


TurboSparse Inference: 4.6x Faster LLM Decoding via Hybrid GPU-CPU Computing

4 Mar 2026

Achieve up to 2.28x speedup on pure CPU and 4.64x in hybrid GPU-CPU environments compared to llama.cpp baselines.


TurboSparse: Elite Inference Speed via dReLU Sparsity

3 Mar 2026

Achieve 2–5x faster LLM decoding on RTX 4090 and mobile devices using TurboSparse. Experience 97% parameter sparsity without performance loss.


TurboSparse Inference Speedup: PowerInfer Integration for Real-Time LLM Decoding

3 Mar 2026

Learn how neuron-level predictor modules and expert routing enable practical inference acceleration for Mixtral-47B.


TurboSparse Efficiency: Achieving 97% Parameter Sparsity in Mixtral-47B

3 Mar 2026

Discover how TurboSparse-Mistral-7B and Mixtral-47B leverage ReLUfication to reach up to 90% neuron inactivity, reducing active parameters to just 3%.


TurboSparse-LLM Performance: Outperforming Mixtral and Gemma with Extreme Sparsity

28 Feb 2026

Discover how ReLU-based intrinsic sparsity maintains accuracy with significant FLOPs reduction.