Extending Issuer to Pull Requests: Same Pattern, Different Data
After building an LLM-powered issue manager, the natural next step was PRs — syncing, diffing, reviewing, and posting feedback with the same local-first pattern.
A sequel to the issuer architecture post: how SQLite state, config.yaml, kanban workflow, and LLM-assisted judgment made large-scale issue triage safe.
issuer is a local Python CLI that uses SQLite and LLMs to triage GitHub issues — sync, analyze, review, close.
A Ghostty tab started printing raw escape sequences for every keypress and mouse move. Here is what causes it, why iTerm2 never showed it, and a six-line zshrc hook that heals it.
A direct RTX 3090 benchmark comparing my llama.cpp plus llama-swap Qwen3.6-27B setup against the club-3090 vLLM tools-text solution.
A vision-language-model-driven UI test typed my password into the username field. Fixing it took eye-icon prompts, before/after-click diffs, canary-byte typing, and uncovering two latent bugs.
How to fit NVIDIA's Nemotron-3-Nano-30B multimodal model with vision support on a single RTX 3090 — benchmarks, VRAM tricks, and the surprising resolution behavior.
Downloading and benchmarking NVIDIA's Nemotron-3-Nano-30B on a single RTX 3090 — including the confusing discovery about Unsloth Dynamic quants.
Benchmarked Qwen3.6-27B Q4_K_M on an M1 Max via llama-cpp-turboquant. The same prod config that runs on my 3090 was already optimal — but llama-cli's interactive auto-mode nearly ate my disk.
Qwen 3.6 overthinks in free-form mode, wasting thousands of tokens. Constraining thinking with a GBNF grammar reduced think-token consumption by 7x without losing code quality.