offsite postmortem.

./danielkim.sh

36 hours. 0 code written. $10k in prizes.

What we learned trying to win a hackathon during an offsite.

We just wrapped an intense 48-hour hackathon + offsite combo, running Cerebras' in-house AI coding agents and other SOTA coding harnesses on hard, real-world coding problems.

The results speak for themselves: Cerebras employees took 4 out of the top 6 spots, including first and second place, at the GTM Hackathon in Utah.

We dove deep into the developer experience, agent capabilities, and the future of AI coding.

  • 1st Place ($10k Winner): Isaac, Keilyn, and Josh (Isaac's siblings) – Beauti-pool, which runs physical mail campaigns for pool-cleaning services by analyzing geospatial data to find specific features like roofs and pools.

  • 2nd Place: Brandon Kang – Cerby, a Tamagotchi-like entity to gamify token consumption and drive retention/usage.

  • Finalist: Sebastian – An AI Voice Assistant designed for bike shop operations.

  • Finalist: Pearl – Sales Saver, a unified dashboard to prioritize sales activities.

Here’s a breakdown of what we learned on the front lines.

1. Developer Experience Drives User Preferences

People choose their tools based on "vibes" and onboarding friction.
Not benchmarks.
Most people don't even choose a specific model; they go with the default.

  • The vast majority of non-power users don't have bespoke settings; they go with the default settings in Copilot/Cursor.
    The tools are approachable and onboarding is straightforward.
    The masses aren't chasing the bleeding edge if it means a steep learning curve.

  • For power users, giving up the visual safety net of an IDE is a massive leap of faith.
    However, once a TUI/CLI proves it can reliably handle long-running reasoning threads and orchestration, its speed and minimalism become an addictive advantage.
    The mental overhead drops, and productivity soars.

  • Version Control should be a first principle in AI Coding.
    During one session, an agent-made change wiped out Daniel's entire work, and it was unrecoverable because there was no proper source-of-truth tracking for AI edits.

    This highlighted a critical gap.

    Because of it, we've already integrated Git worktrees natively into our tools, ensuring we provide a safety net where users can easily navigate, accept, or revert changes.
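
The pattern is straightforward to wire up with plain `git worktree`; a minimal sketch (the helper names here are ours for illustration, not the API of any particular tool):

```python
import subprocess
from pathlib import Path

def git(repo: Path, *args: str) -> str:
    """Run a git command inside the given repo and return its stdout."""
    out = subprocess.run(["git", *args], cwd=repo, check=True,
                         capture_output=True, text=True)
    return out.stdout

def open_agent_worktree(repo: Path, name: str) -> Path:
    """Give the agent an isolated checkout on its own branch, so its
    edits never touch the user's working tree directly."""
    scratch = repo.parent / f"{repo.name}-agent-{name}"
    git(repo, "worktree", "add", "-b", f"agent/{name}", str(scratch))
    return scratch

def discard_agent_worktree(repo: Path, scratch: Path, name: str) -> None:
    """Reject the agent's edits: remove the worktree and its branch.
    Accepting them would instead be a merge of agent/<name>."""
    git(repo, "worktree", "remove", "--force", str(scratch))
    git(repo, "branch", "-D", f"agent/{name}")
```

Reviewing the agent's work is then just a diff against the base branch, and nothing the agent does can clobber uncommitted work in the user's main checkout.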

2. Intelligence Trumps Speed for Complex Tasks

GLM 4.7 is incredibly fast, but on the hardest problems, speed isn't a silver bullet.
It's still a "measure twice, cut once" world.

  • Reasoning > Velocity:
    Models like Claude 3.5 Sonnet often outperform faster models purely through reasoning depth.
    If a model lacks the intuition to solve an edge case, the developer ends up back in the IDE fixing it manually.

  • Plan Mode is a Game Changer:
    Moving from "pray and paste" to a dedicated Plan Mode allows you to stress-test logic before a single line of code is written.

    Using a high-intelligence model to architect the plan, then letting fast Cerebras inference one-shot the execution, maximizes accuracy while minimizing token waste.
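
The split is easy to express in code; a minimal sketch, where `planner` and `executor` are placeholders for real model calls (a slow, high-reasoning model and a fast Cerebras-hosted one, respectively):

```python
from typing import Callable

def plan_then_execute(task: str,
                      planner: Callable[[str], list[str]],
                      executor: Callable[[str], str]) -> list[str]:
    """Two-model split: the planner stress-tests the logic and breaks the
    task into steps before any code is written; the fast executor then
    one-shots each step."""
    steps = planner(task)
    return [executor(step) for step in steps]

# Dummy stand-ins for real model calls, just to show the shape:
def toy_planner(task: str) -> list[str]:
    return [f"{task}: step {i + 1}" for i in range(3)]

def toy_executor(step: str) -> str:
    return f"implemented {step}"

results = plan_then_execute("add auth", toy_planner, toy_executor)
```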

3. Efficient Orchestration is the Future of Autonomy

The ability for agents to tackle complex tasks via sub-agents and sustained reasoning threads is transformative.

  • Harnesses can impact end-user experience as significantly as model quality:
    While an intelligence gap exists between OSS and frontier models, a robust "harness" bridges it.

    By designing a system that orchestrates tasks and employs a divide-and-conquer strategy, you can get frontier-level results from faster, specialized models.

  • Concurrency is the Ultimate Speed Boost:
    Even with fast GPUs, serial execution is a bottleneck.

    We saw GPU-based assistants finish faster simply because they were optimized to do multiple things at once.

    To see the real impact of fast inference, we need to standardize on planning models that kick off parallel threads.
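
Since agent work is mostly I/O-bound (waiting on model responses), even a plain thread pool captures the win; a minimal sketch with a stubbed sub-agent call standing in for a real model request:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def sub_agent(task: str) -> str:
    """Stub for an I/O-bound sub-agent call (e.g. a model request)."""
    time.sleep(0.2)
    return f"{task}: done"

def run_serial(tasks: list[str]) -> list[str]:
    """One task at a time: total wait is the sum of all waits."""
    return [sub_agent(t) for t in tasks]

def run_concurrent(tasks: list[str]) -> list[str]:
    """Fan independent tasks out: total wait is roughly the longest one."""
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        return list(pool.map(sub_agent, tasks))

tasks = ["lint", "tests", "docs", "types"]

start = time.perf_counter()
serial = run_serial(tasks)          # four 0.2 s waits back to back
serial_time = time.perf_counter() - start

start = time.perf_counter()
concurrent = run_concurrent(tasks)  # the four waits overlap
concurrent_time = time.perf_counter() - start
```

Same results, a fraction of the wall-clock time; the orchestration layer, not the model, decides whether that headroom gets used.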

4. Strategy Requires Hyper-Focus

Building a tool "for everyone" is a trap.

You cannot cater to non-technical audiences while also building high-skill-floor features like agent swarms.

We should strive for specific tools for specific personas.

For example, Claude Code for devs and Claude Cowork for general users.

5. In-Person Work Builds Unbreakable Trust

Collaborating in person reminds you that teammates are more than just avatars on Slack.

Whether it's a 6:00 AM run or late-night karaoke, hacking together in a shared space builds a level of trust that makes working together fun and engaging.

High-intensity environments build high-intensity bonds.
