llama: use f16 mask for FA to save VRAM by am17an · Pull Request #23764 · ggml-org/llama.cpp

r/LocalLLaMA · 0 upvotes

Type: article

Added: May 29, 2026
