Deepseek V4 Flash / Personal Archive

The last time I used an open-weight model was back when Deepseek R1 came out. That felt like a pivotal moment for LLMs, where we could have a massive amount of intelligence for a fraction of the price. Since then, however, I haven’t had much time to play with any open-weight models, since I’m barely able to keep up with the major lab releases (your Geminis, GPTs and Claudes).

Recently, though, I had an opportunity to try out the Deepseek V4 Flash model, partially because it has the most generous limits in my OpenCode Go plan. I have some thoughts, but the succinct way to put them is: “This model is really freaking good.”

One workflow I use often is plan and implement: I use a “smart” model to plan (GPT-5.5, Opus 4.6) and a “fast” model to implement (GPT-5.4-mini, Haiku 4.5). The benefit of this approach is that it allows me to fail fast and redo my plan if the implementation goes wrong. Recently, I swapped out GPT-5.4-mini for Deepseek V4 Flash in my workflow, and boy, I haven’t noticed any drop in quality. If anything, my experience points to this model being faster and better, flying through very complicated plans in under two minutes.
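For anyone who hasn’t tried this workflow, here is a rough sketch of the loop in Python. It isn’t how OpenCode does it internally; the function name `plan_and_implement`, the `check` callback and the prompt wording are all made up for illustration, and a “model” here is just any function that takes a prompt and returns text.

```python
from typing import Callable, Tuple

# Assumption: a "model" is any callable mapping a prompt string to a text reply.
# Wire in whatever client you actually use; nothing here is a real API.
Model = Callable[[str], str]


def plan_and_implement(
    task: str,
    planner: Model,        # the "smart" model
    implementer: Model,    # the "fast" model
    check: Callable[[str], bool],  # hypothetical pass/fail check, e.g. run the tests
    max_attempts: int = 3,
) -> Tuple[str, str, int]:
    """Plan with one model, implement with another, and re-plan on failure."""
    feedback = ""
    for attempt in range(1, max_attempts + 1):
        # The planner sees the task plus any note about the previous failed plan.
        plan = planner(f"Task: {task}\n{feedback}Write a step-by-step plan.")
        # The implementer only sees the plan, which keeps its job small and fast.
        result = implementer(f"Follow this plan exactly:\n{plan}")
        if check(result):
            return plan, result, attempt
        # Fail fast: feed the failed plan back and ask for a revised one.
        feedback = f"The previous plan failed:\n{plan}\nRevise it.\n"
    raise RuntimeError(f"No working implementation after {max_attempts} attempts")
```

Swapping the implementer is then a one-argument change, which is why trying a different fast model in this setup costs so little.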

One of my favourite ways to track model capabilities is Simon Willison’s “pelican riding a bike” challenge. Since I haven’t seen him publish a blog post about this model yet, I ran my own test, and the results speak for themselves.