AI makes us code faster
— and makes code harder to trust

GenAI is shifting software development away from mechanical execution toward work that actually requires human judgment: understanding requirements, making architectural decisions, ensuring the code solves the right problem. The developer role moves from writing code to conducting — defining intent, reasoning through complexity, reviewing outcomes.

But there's a catch. The volume of AI-generated code far exceeds what a human can quickly consume. The code works, until it doesn't — and when something breaks, it's much harder to trace why. "Vibe coding" feels productive until you hit a wall you can't debug your way out of.

The deeper issue is structural. Traditional software development is built around object-oriented thinking, where behavior lives inside functions and classes you can point to. GenAI-assisted development doesn't fit that model. Behavior becomes distributed, contextual, and emergent — shaped by the model, the prompt, conversation history, and available tools. A small change in wording can lead to a very different outcome.

How might we help teams design and control system behavior?

This question framed the entire Kiro product. The answer wasn't better prompts — it was better control surfaces. That meant rethinking workflows entirely, not just the UI.

6 primary iterations over 9 months

We explored a wide range of concepts at different fidelities, working immersively with our engineering team — who are also our users. We prototyped across different interaction models, from traditional chat interfaces to more structured workflow approaches, testing various levels of automation versus user control.

We ran user studies, interviews, and participatory design work sessions with AWS Heroes and the broader developer community. Each iteration incorporated feedback and tightened our understanding of what developers actually needed.

Design iterations across 9 months

From a natural language idea
to production-ready code

Idea → A simple prompt in chat. The agent helps generate a markdown file where the user and agent work together — detailing, refining, iterating — until it's ready to become requirements.

Requirements → The agent generates a technical design covering architecture, data flow, interfaces, data models, error handling, and test strategy. This is the translation layer between what the user wants to build and what the system actually needs to include.

Tasks → An execution plan linked back to the original requirements. This is also the primary interface where the agent surfaces execution status as it works through each step.

Code → With spec in place, the agent produces viable code. The developer then continues with vibe coding — iterating and refining — until the output is robust and production-ready.

Core interaction loop from idea to code
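To make the flow concrete, here is an illustrative sketch of the task artifact this loop produces. The file name, checkbox style, and requirement references are assumptions about the general shape, not the exact Kiro format:

```markdown
<!-- tasks.md (hypothetical structure) -->
## Implementation Plan

- [ ] 1. Create the data model for user sessions
      _Requirements: 1.2, 1.3_
- [ ] 2. Implement the session API endpoints
      _Requirements: 2.1_
- [ ] 3. Add error handling and retry logic
      _Requirements: 2.3, 3.1_
```

Because each task links back to a numbered requirement, the agent can surface execution status per step and the developer can trace any piece of generated code back to the intent it serves.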

Spec-Driven Development

Instead of jumping from idea straight to code, spec-driven development creates a structured middle layer where developers and agents build shared understanding first. Specs externalize intent — making it explicit, structured, inspectable, and shared. Rather than guessing what the AI will do, you define what you want, and the spec becomes the source of truth both parties refer back to. Fewer rewrites, more predictable outcomes, better collaboration.

When we introduced spec-driven development, many users weren't sure when to use it versus just vibe coding. That made the entry point a feature-education moment as much as an interaction-design one. Going forward, the intent is to move toward intent detection — where the agent recognizes what the user is trying to build and recommends the right approach automatically.

Spec step 1 — select a feature to spec
Spec step 2 — requirements generation
Spec step 3 — technical design
Spec step 4 — task list

Vibe Coding for Refinement

Spec handles complexity. Vibe coding handles everything else. Once you have a structured foundation, you can talk to your codebase naturally — "make this faster," "add error handling here," "what if we tried a different approach?" — without needing a whole plan first.

Supervised mode lets users approve agent-generated code step by step, pausing at the file level before moving forward. Users can also step in anytime mid-execution. The next iteration will pause at logical groups of edits rather than individual files, reducing friction while keeping the human in the loop.

Supervised mode interaction demo

Control surfaces for agentic development

Three key features guide the AI agent, automate workflows, and provide context more effectively through better context-window management.

Kiro steering feature interface for guiding AI behavior with project-specific rules
Steering
Kiro hooks feature interface for automating agent actions based on IDE events
Hooks
Kiro powers feature interface for extending capabilities with MCP servers and documentation
Powers

Steering Rules

Persistent instructions that teach Kiro how your team works. Write your conventions once — "use TypeScript," "follow our naming standards" — and Kiro applies them automatically across every interaction. Consistent code, no repetition, and the same agent behavior for everyone on the team.
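As an illustrative sketch — the file path and wording here are assumptions, not the exact Kiro format — a steering rule might be a short markdown file checked into the repo so the whole team shares it:

```markdown
<!-- .kiro/steering/conventions.md (hypothetical path) -->
# Team conventions

- Use TypeScript for all new modules; avoid `any`.
- Follow our naming standards: PascalCase components, camelCase functions.
- Every exported function gets a doc comment and a unit test.
```

Because the file lives alongside the code, the same rules apply to every interaction and every teammate without anyone restating them in chat.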

Steering rules interface
Agent hooks interface

Agent Hooks

Automated triggers that execute predefined agent actions when certain events occur — a file save, a completed task. Instead of manually asking Kiro to run tests or check for issues, you set it up once and it runs in the background. The repetitive overhead disappears; you just code.
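A hook pairs a trigger with an agent action. The schema below is purely illustrative — the event names, fields, and file format are assumptions used to show the shape of the idea, not Kiro's actual configuration:

```json
{
  "name": "Run tests on save",
  "when": { "event": "fileSaved", "pattern": "src/**/*.ts" },
  "then": {
    "prompt": "Run the unit tests covering the saved file and summarize any failures."
  }
}
```

Once defined, the trigger fires in the background on every matching save; the developer never has to remember to ask.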

Kiro Powers

Dynamic tool loading through MCP. Too little context and the agent guesses; too much and it slows down. Kiro Powers solve this by connecting to external tools on demand rather than loading everything upfront. One-click install, minimal setup, and an open ecosystem where anyone can build and share tools.
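MCP servers are typically declared in a small JSON config; the sketch below uses the standard `mcpServers` shape from the Model Context Protocol, though the file location and the specific server shown are assumptions for illustration:

```json
{
  "mcpServers": {
    "fetch": {
      "command": "uvx",
      "args": ["mcp-server-fetch"],
      "disabled": false
    }
  }
}
```

The agent connects to a declared server only when a task calls for its tools, which keeps the context window lean until the capability is actually needed.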

Kiro Powers flow chart
Kiro Powers interface

Why we invested early

Building a design system before the product is fully defined might seem premature, but it was one of the best early investments we made. When you're navigating genuine unknowns, a lightweight system eliminates the micro-decisions — button padding, spacing tokens — so you can focus on the harder questions about how the product actually works.

It also enables parallel work at speed. Once more than one person is designing, a shared system means designers, engineers, researchers, and PMs can all move forward using the same visual language without constant check-ins. And because design tokens map directly to code tokens, engineers could prototype faster and we avoided the typical "rebuild it properly before launch" reckoning.

Kiro post-launch metrics

Spec-driven development adoption and
6-month post-launch metrics

82% of users adopted spec-driven development — significant uptake for a completely new workflow. Satisfaction followed: 83% positive ratings on spec creation, 73% on task creation.

100,000 daily active users is meaningful traction for a new product category. 35,000 paid conversions shows developers finding enough value to pay for more. The number I'm most proud of: 92% positive ratings from power users — the developers who put Kiro to work every day.

Kiro post-launch metrics