llama-swap (2)
Taming Qwen Overthinking with GBNF Grammars (Apr 28, 2026)
RTX 3090 Power Limit: Finding the Sweet Spot for Local LLM Inference (Apr 28, 2026)