Pre-Silicon Sim Harness in Practice: One Respin Avoided
I run security and deployment at StudioX, which means I spend most of my time with customers who cannot afford to be wrong about where their most valuable data lives. Silicon teams are the sharpest version of that. Their RTL is the company. So when a design lead at a fabless customer — I'll call the program Meridian — asked me to walk through what the Pre-Silicon / Target Sim Harness actually looks like on a normal Tuesday, inside their own perimeter, I said yes on one condition: I'd tell it exactly as it happened, including the parts that are unglamorous. Here it is.
The change that would have slipped
The delta was boring, which is the whole point. A designer on the PCIe controller block added a clock gater to claw back a few milliwatts of leakage before a power review. Two lines of RTL. It passed lint. It passed the nightly smoke regression, because the smoke suite wasn't written to probe the reset-deassertion ordering around that gater. In the old world this change would have ridden the merge train straight toward tape-out, and the divergence it opened — the gated clock changed when a downstream FIFO saw its reset release relative to the model — would have surfaced in the bring-up lab on first silicon. On the C-stepping. Eleven weeks and a mask set later.
Meridian had wired the harness as a StudioX Mission into their merge flow a month earlier. When the designer pushed, the Mission picked up the commit as a plain-language goal — check this change against the target before it's cleared for sign-off — and went to work. Nobody filed a ticket. Nobody booked emulator time by hand. The designer kept working on something else.
The Tuesday, minute by minute
What made it land for the design lead wasn't the verdict. It was watching the Mission reason in the observations rail, live, the same way you'd read a colleague thinking out loud. The Change Triage Agent classified the delta as a reset-domain change touching the PCIe controller. The reasoning core, holding that finding in its working memory, then had the Sandbox Agent stand up a targeted simulation environment for exactly that block. It reached the customer's simulator and job scheduler through instant MCP servers: no integration project, no waiting on the shared emulator queue. The Divergence Agent ran the scoped stimulus and reported back the precise thing it was asked for: three cycles where the modelled reset deassertion disagreed with the target's, with the signal named and the cycle number given for each. The History Agent matched it against the customer's own respin knowledge base and surfaced a nearly identical divergence from a prior program that had, in fact, caused a respin. The Policy Agent checked it against sign-off criteria and marked it a blocker.
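The Divergence Agent's report, a list of cycles and signals where model and target disagree, can be sketched in a few lines. This is a minimal illustration only, not the harness's implementation: the trace format (signal name mapped to one sampled value per cycle), the signal names `fifo_rst_n` and `pcie_clk_en`, and the sample values are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Divergence:
    cycle: int
    signal: str
    model_value: int
    target_value: int


def find_divergences(model_trace, target_trace):
    """Compare two per-cycle traces and list every disagreement.

    Each trace maps a signal name to a list of sampled values, one per cycle.
    Only signals present in both traces are compared.
    """
    divergences = []
    for signal in sorted(model_trace.keys() & target_trace.keys()):
        pairs = zip(model_trace[signal], target_trace[signal])
        for cycle, (model_value, target_value) in enumerate(pairs):
            if model_value != target_value:
                divergences.append(
                    Divergence(cycle, signal, model_value, target_value)
                )
    return sorted(divergences, key=lambda d: (d.cycle, d.signal))


# Hypothetical data: the model releases the FIFO reset at cycle 3,
# while the gated-clock target releases it at cycle 6.
model = {
    "fifo_rst_n": [0, 0, 0, 1, 1, 1, 1, 1],
    "pcie_clk_en": [1, 1, 1, 1, 1, 1, 1, 1],
}
target = {
    "fifo_rst_n": [0, 0, 0, 0, 0, 0, 1, 1],
    "pcie_clk_en": [1, 1, 1, 1, 1, 1, 1, 1],
}

report = find_divergences(model, target)
for d in report:
    print(f"cycle {d.cycle}: {d.signal} model={d.model_value} target={d.target_value}")
```

With these made-up traces the report names `fifo_rst_n` at three consecutive cycles, which is the shape of finding described above: each disagreement carries its signal and its cycle number.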
The moment it mattered: the decision queue
Here is the part I care about most, wearing my security hat. The Mission did not quietly do anything consequential. It doesn't have the authority to, by design. The checking is read-only: it stood up a sandbox and produced a verdict, and that's the boundary. When the verdict came back a blocker, the Mission ended with an approval request. The platform turned that request into a decision-queue row and emailed the two named reviewers a magic-link approve/reject URL. The change to sign-off status waited on a human clicking. The verification lead opened the row, read the trace, including the matched historical respin the History Agent had pulled, and rejected the promotion, sending the gater back for a reset-sequencing fix. Total elapsed time from merge to a human holding a grounded verdict: about three hours, most of which was the sandbox actually running.
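The gate described here, where a verdict only becomes a status change after a named human decides, can be sketched as follows. Everything in this example is hypothetical: `DecisionRow`, `signoff_cleared`, the field names, the change ID, and the reviewer names are illustrative stand-ins, not the platform's schema or API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import List, Optional, Tuple


class Decision(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class DecisionRow:
    """One row in a hypothetical decision queue; doubles as an audit record."""
    change_id: str
    verdict: str
    evidence: List[str]
    reviewers: Tuple[str, ...]
    decision: Decision = Decision.PENDING
    decided_by: Optional[str] = None
    decided_at: Optional[datetime] = None

    def decide(self, reviewer: str, approve: bool) -> None:
        # Only a named reviewer may decide, and only once.
        if reviewer not in self.reviewers:
            raise PermissionError(f"{reviewer} is not a named reviewer")
        if self.decision is not Decision.PENDING:
            raise ValueError("this row has already been decided")
        self.decision = Decision.APPROVED if approve else Decision.REJECTED
        self.decided_by = reviewer
        self.decided_at = datetime.now(timezone.utc)


def signoff_cleared(row: DecisionRow) -> bool:
    """Sign-off status changes only on an explicit human approval."""
    return row.decision is Decision.APPROVED


# The automated check ends by creating a pending row; it never clears sign-off.
row = DecisionRow(
    change_id="pcie-clock-gater",  # hypothetical identifier
    verdict="blocker",
    evidence=["divergence report", "matched historical respin"],
    reviewers=("verification_lead", "design_lead"),
)
assert not signoff_cleared(row)

# A named reviewer reads the evidence and rejects the promotion.
row.decide("verification_lead", approve=False)
print(row.decision.value, row.decided_by, row.decided_at.isoformat())
```

The design choice worth noting is that the pending state is the default and nothing but `decide` can leave it, so the record of who decided, when, and on what evidence is produced as a side effect of the decision itself.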
That decision-queue row is also, not incidentally, an audit artifact. Who approved what, on the basis of which evidence, at which timestamp — it's all there, in the customer's own database, in their own cluster. For a team that has to answer to a quality process and, increasingly, to customers who audit their suppliers' sign-off discipline, that trail is worth as much as the catch itself.
What the numbers said after a quarter
Meridian ran the harness across their active blocks for a quarter and then did the accounting. It caught two divergences that their existing regressions had not been shaped to find — the clock-gater above, and a FIFO-depth change that shifted a backpressure corner. On their own history, at least one of those two would have escaped to silicon. Call it one respin avoided: roughly eleven weeks of schedule and a seven-figure mask-and-NRE line item, against a checking cost measured in engineer-hours per change and a modest slice of simulation capacity. The design lead's own summary was blunter than any slide I'd have made: "It found the one I would have signed off on myself."
Two things made it deployable rather than just clever, and both are squarely my department. First, it lives entirely inside their perimeter: the Mission, the agents, the knowledge bases, the LLM, and every byte of RTL run in the customer's cluster. Nothing about the design crosses a boundary. Second, wiring the toolchain took hours, not a quarter. Each simulator and scheduler was registered as an MCP server that the Sandbox Agent discovers at runtime, so there was no per-tool integration project to fund and staff before the first check could run.
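The register-once, discover-at-runtime pattern can be illustrated with a plain in-process registry. To be clear about the assumptions: this is not the MCP protocol and not StudioX's API. `ToolRegistry`, the capability labels, the tool names `sim_a` and `sched_a`, and the handlers are all invented for the sketch.

```python
from typing import Callable, Dict, List


class ToolRegistry:
    """Hypothetical stand-in for a set of tool servers an agent can query."""

    def __init__(self) -> None:
        self._tools: Dict[str, Dict[str, Callable[..., str]]] = {}

    def register(self, capability: str, name: str, handler: Callable[..., str]) -> None:
        # Registering a tool is the only per-tool wiring step.
        self._tools.setdefault(capability, {})[name] = handler

    def discover(self, capability: str) -> List[str]:
        # The caller asks by capability and learns tool names at runtime.
        return sorted(self._tools.get(capability, {}))

    def call(self, capability: str, name: str, **kwargs) -> str:
        return self._tools[capability][name](**kwargs)


registry = ToolRegistry()
registry.register("simulator", "sim_a", lambda block: f"simulated {block}")
registry.register("scheduler", "sched_a", lambda job: f"queued {job}")

# The caller has no hard-coded tool names; it uses whatever is registered.
simulators = registry.discover("simulator")
result = registry.call("simulator", simulators[0], block="pcie_controller")
print(simulators, result)
```

Adding a second simulator in this sketch is one more `register` call; the discovering code does not change, which is the property the paragraph above attributes to the real setup.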
If you're weighing this, read "the business case" for the leadership framing and "how it works" for the architecture underneath. But the practitioner's version fits on a sticky note: give the seam between model and silicon an owner that checks every change, watch it reason in the open, and keep a human on the trigger for anything that matters. Meridian did, and a Tuesday that would have detonated a quarter later ended with a rejected merge and a designer back at his desk before lunch.