AI 14

rocket-cli: a Rocket.Chat MCP server with a local FTS5 brain Jun 10, 2026
Design Fluency Meets the Knowledge Graph May 26, 2026
Turbo3 + MTP: Merging Two llama.cpp Forks May 15, 2026
Qwen 3.6 Dense vs MoE on Local Stack: what MTP actually delivers May 13, 2026
Qwen 3.6 27B with Native MTP on llama.cpp May 13, 2026
Running Qwen 3.6 35B MoE on an RTX 3060 12GB via -ncmoe May 9, 2026
1.5× Faster Agentic Coding with MTP on Qwen 3.6 27B May 8, 2026
Club 3090 vs My Llama Setup May 4, 2026
Stopping a VLM-Driven Test From Leaking My Password May 2, 2026
Running NVIDIA Nemotron 30B with Vision on a 24 GB GPU Apr 29, 2026
Nemotron 30B on 24 GB: Benchmarks and a Quantization Quirk Apr 29, 2026
Qwen3.6-27B on an M1 Max: when the laptop config matches the 3090 Apr 29, 2026
Taming Qwen Overthinking with GBNF Grammars Apr 28, 2026
RTX 3090 Power Limit: Finding the Sweet Spot for Local LLM Inference Apr 28, 2026