AI 14
- rocket-cli: a Rocket.Chat MCP server with a local FTS5 brain
- Design Fluency Meets the Knowledge Graph
- Turbo3 + MTP: Merging Two llama.cpp Forks
- Qwen 3.6 Dense vs MoE on Local Stack: what MTP actually delivers
- Qwen 3.6 27B with Native MTP on llama.cpp
- Running Qwen 3.6 35B MoE on an RTX 3060 12GB via -ncmoe
- 1.5× Faster Agentic Coding with MTP on Qwen 3.6 27B
- Club 3090 vs My Llama Setup
- Stopping a VLM-Driven Test From Leaking My Password
- Running NVIDIA Nemotron 30B with Vision on a 24 GB GPU
- Nemotron 30B on 24 GB: Benchmarks and a Quantization Quirk
- Qwen3.6-27B on an M1 Max: when the laptop config matches the 3090
- Taming Qwen Overthinking with GBNF Grammars
- RTX 3090 Power Limit: Finding the Sweet Spot for Local LLM Inference