Extending Issuer to Pull Requests: Same Pattern, Different Data

After building an LLM-powered issue manager, the natural next step was PRs — syncing, diffing, reviewing, and posting feedback with the same local-first pattern.

May 8, 2026 Tooling, Automation

Running Issuer at Scale: Closing 42 Issues Without Losing Control

A sequel to the issuer architecture post: how SQLite state, config.yaml, kanban workflow, and LLM-assisted judgment made large-scale issue triage safe.

May 7, 2026 Tooling, Automation

issuer: A Local SQLite-Backed GitHub Issue Manager Powered by LLMs

issuer is a local Python CLI that uses SQLite and LLMs to triage GitHub issues — sync, analyze, review, close.

May 7, 2026 Tooling, Automation

When Ghostty starts spitting escape codes at every keypress

A Ghostty tab started printing raw escape sequences on every keypress and even on mouse movement. Here is what causes it, why iTerm2 never showed it, and a six-line zshrc hook that heals it.

May 6, 2026 Tooling, Terminal

Club 3090 vs My Llama Setup

A direct RTX 3090 benchmark comparing my llama.cpp plus llama-swap Qwen3.6-27B setup against the club-3090 vLLM tools-text solution.

May 4, 2026 AI, Local LLMs

Stopping a VLM-Driven Test From Leaking My Password

A vision-language-model-driven UI test typed my password into the username field. Fixing it took eye-icon prompts, before/after-click diffs, canary-byte typing, and uncovering two latent bugs.

May 2, 2026 AI, Tooling

Running NVIDIA Nemotron 30B with Vision on a 24 GB GPU

How to fit NVIDIA's Nemotron-3-Nano-30B multimodal model with vision support on a single RTX 3090 — benchmarks, VRAM tricks, and the surprising resolution behavior.

Apr 29, 2026 AI, Hardware

Nemotron 30B on 24 GB: Benchmarks and a Quantization Quirk

Downloading and benchmarking NVIDIA's Nemotron-3-Nano-30B on a single RTX 3090 — including the confusing discovery about Unsloth Dynamic quants.

Apr 29, 2026 AI, Hardware

Qwen3.6-27B on an M1 Max: when the laptop config matches the 3090

Benchmarked Qwen3.6-27B Q4_K_M on an M1 Max via llama-cpp-turboquant. The same prod config that runs on my 3090 was already optimal — but llama-cli's interactive auto-mode nearly ate my disk.

Apr 29, 2026 AI, Hardware

Taming Qwen Overthinking with GBNF Grammars

Qwen 3.6 overthinks in free-form mode, wasting thousands of tokens. Constraining thinking with a GBNF grammar reduced think-token consumption by 7x without losing code quality.

Apr 28, 2026 AI, Tooling
