Stop Crashing and Start Cooking with vLLM on AMD and Lemonade Server
Author(s): Cody Sandahl
Originally published on Towards AI.

How I Fixed vLLM on Strix Halo and Got 3x Better Batch Throughput with Qwen3.5

I used to wonder why “curiosity killed the cat,” but now I know that those curious cats probably forgot …