A field guide
The AI tools sold for “complex compliance forms” almost never reach the form your compliance team actually has to fill.
The category page for this topic is dominated by tools that draft answers in vendor security questionnaires: SIG, CAIQ, NIST 800-171, modern web RFPs. Those products are real and they are good at the layer they address. But every regulated enterprise I have looked inside has a second, harder layer: the form that lives in Epic, Fiserv, Jack Henry, Guidewire, SAP GUI, an AS/400 emulator, or a twenty-year-old Win32 app the regulator audits directly. Those forms have no API, no DOM, no documented webhook. This page is about how an AI tool actually fills them, and why the question looks completely different once you know which form you mean.
Two layers, one phrase
Strip the hype out of the category and there are two distinct problems hiding under one phrase.
Layer one is the questionnaire. A prospect sends a SIG. A regulator asks for an attestation in a standardized template. A customer wants a CAIQ filled out before they will sign a master agreement. These artifacts are text-answer-shaped, and a model with access to the right knowledge base can draft excellent answers. Vanta AI, Drata, Sprinto, Inventive AI, Responsive, and Spellbook all live here, and they all do real work. If your bottleneck is “our security team is buried under questionnaires,” one of them probably solves that.
Layer two is the system of record. A claims rep opens Guidewire and types the contents of a first notice of loss into a 14-field form across three tabs. A bank associate opens Fiserv DNA and creates a new commercial account through a 17-screen onboarding flow. A patient access coordinator opens Epic and registers a new patient through a sequence of modals where field B will not save until field A passes a validation rule that depends on the patient class field three modals back. These forms are the ones the compliance audit eventually inspects, because they are where the regulated data lives. They cannot be filled by a tool that only knows how to talk to HTML.
Why layer-one tools cannot fill layer-two forms
The questionnaire AI tools share an architecture: a retrieval layer over a knowledge base, a model that drafts answers, and an HTML or API integration that writes those answers into a SaaS questionnaire portal. That third stage is the one that does not generalize. The screen where a hospital admits a patient, where a bank opens a commercial loan, or where an insurer assigns an adjuster is not a web form. There is nothing for the integration to talk to. Extending a layer-one tool down here is like pointing a website screen-scraper at a desktop accounting program: there is no DOM to read and no endpoint to call, so the only surface the tool knows how to drive simply is not there.
The fix is not better selectors. It is a different surface entirely: the Windows UI Automation accessibility tree, the same tree a screen reader consumes. Virtually every Win32 control a compliance officer has ever audited (Epic, Cerner, Fiserv, Jack Henry, Guidewire, Oracle EBS, SAP GUI, AS/400 emulators) exposes nodes in that tree, because that is how Windows assistive technologies have worked since UI Automation shipped with Windows Vista. A form-fill runtime that reads and writes through that tree inherits nearly two decades of plumbing that other categories of tool do not get to use.
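To make that surface concrete, here is a minimal sketch, in TypeScript, of the shape a node on that tree presents to an automation runtime. The field names are illustrative, not the real Windows API; the point is that every control carries an id, a type, a name, and a rectangle that something other than pixels can address.

```ts
// Hypothetical shape of a Windows UI Automation node as a form-fill
// runtime sees it. Field names are illustrative, not the real UIA API.
interface Bounds { x: number; y: number; width: number; height: number }

interface UiaNode {
  automationId?: string;  // stable id assigned by the app developer, if any
  controlType: string;    // "Edit", "Button", "Tab", ...
  name?: string;          // the visible text a screen reader would speak
  bounds: Bounds;         // on-screen rectangle
  children: UiaNode[];
}

// Depth-first walk over the live tree: the basic move every
// accessibility-based runtime makes before it can act on a control.
function findNode(root: UiaNode, pred: (n: UiaNode) => boolean): UiaNode | undefined {
  if (pred(root)) return root;
  for (const child of root.children) {
    const hit = findNode(child, pred);
    if (hit) return hit;
  }
  return undefined;
}
```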
What it actually takes to fill one field
The smallest unit of work in a system-of-record form-fill is a single field. Trace what happens when the runtime types one value into one control. The whole product is built around the fact that this loop has to be deterministic, observable, and auditable, even when the surrounding form is not.
One field, end to end
The single most important step in that diagram is “Resolve locator,” because that is where the runtime decides which node on the live tree corresponds to the field the recording captured months earlier. Skipping that step (or doing it with image matching or absolute coordinates, the way old-style RPA does) is what makes form-fill brittle in regulated apps that ship a new build every quarter.
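A sketch of that loop, in TypeScript with hypothetical helper names (the real loop is deterministic Rust). The read-back verification at the end is an assumption about what "observable" means here, not a documented behavior:

```ts
// A sketch of the per-field loop. resolveLocator stands in for the
// four-strategy cascade described in the next section; setValue stands
// in for EditPattern.SetValue with the keystroke fallback. The final
// read-back check is an assumption, not a documented behavior.
interface Locator { automationId?: string; windowHandle?: number; text?: string }
interface RecordedStep { targetElement: Locator; whatWasTyped: string }
type UiaNode = { automationId?: string; name?: string };
type StepResult =
  | { step: RecordedStep; status: "ok" }
  | { step: RecordedStep; status: "needs_rerecording" }
  | { step: RecordedStep; status: "verify_failed"; observed: string };

declare function resolveLocator(l: Locator): Promise<UiaNode | undefined>;
declare function setValue(n: UiaNode, v: string): Promise<void>;
declare function readBack(n: UiaNode): Promise<string>;

async function fillField(step: RecordedStep): Promise<StepResult> {
  const node = await resolveLocator(step.targetElement);
  if (!node) return { step, status: "needs_rerecording" }; // all four strategies missed

  await setValue(node, step.whatWasTyped);
  const observed = await readBack(node); // did the control actually accept the value?
  return observed === step.whatWasTyped
    ? { step, status: "ok" }
    : { step, status: "verify_failed", observed };
}
```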
The anchor: type_into_element and a four-strategy match
The MCP tool that does the actual typing is type_into_element. It is emitted from apps/desktop/src-tauri/src/mcp_converter.rs (lines 2298 and 2460), takes a structured locator and a string, and either calls EditPattern.SetValue on the matching UIA node or, when the node does not implement EditPattern, falls back to simulating Win32 keystrokes against it. The locator is resolved by a four-strategy cascade in apps/desktop/src-tauri/src/focus_state.rs, which tries, in order: the recorded automation id, the window handle plus bounds, the visible text content, and finally the parent window as a last fallback. The first three strategies absorb the kind of UI tweak that breaks selector-based RPA (a button shifts a row, a panel reorders, a form gains a tab). Only when all four miss does the runtime mark the step for re-recording.
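The strategy order below comes straight from that description; the types and find* helpers are hypothetical stand-ins for the Rust in focus_state.rs:

```ts
// The strategy order mirrors focus_state.rs as described above; the
// types and find* helpers are hypothetical stand-ins for the Rust code.
type UiaNode = { automationId?: string; name?: string };
interface Bounds { x: number; y: number; width: number; height: number }
interface Locator {
  automationId?: string;
  windowHandle?: number;
  bounds?: Bounds;
  text?: string;
  parentWindow?: string;
}

declare function findByAutomationId(id: string): Promise<UiaNode | undefined>;
declare function findByWindowBounds(h: number, b: Bounds): Promise<UiaNode | undefined>;
declare function findByVisibleText(t: string): Promise<UiaNode | undefined>;
declare function findParentWindow(title: string): Promise<UiaNode | undefined>;

async function resolveLocator(rec: Locator): Promise<UiaNode | undefined> {
  // 1. Recorded automation id: stable across most layout changes.
  if (rec.automationId) {
    const hit = await findByAutomationId(rec.automationId);
    if (hit) return hit;
  }
  // 2. Window handle plus bounds: survives id churn.
  if (rec.windowHandle !== undefined && rec.bounds) {
    const hit = await findByWindowBounds(rec.windowHandle, rec.bounds);
    if (hit) return hit;
  }
  // 3. Visible text content: survives a shifted row or reordered panel.
  if (rec.text) {
    const hit = await findByVisibleText(rec.text);
    if (hit) return hit;
  }
  // 4. Parent window as a last fallback before giving up.
  if (rec.parentWindow) return findParentWindow(rec.parentWindow);
  return undefined; // all four missed: the step is queued for re-recording
}
```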
“The production executor crate has zero references to gemini, claude, openai, or any inference library. The model runs once during recording. The runtime is deterministic Rust calling Windows accessibility APIs.”
LLM call sites in crates/executor (verifiable via ripgrep on github.com/mediar-ai/terminator)
That zero is the whole reason this approach works in regulated industries. A compliance officer cannot sign off on a workflow whose action sequence is decided by a frontier model on each run, because two identical inputs can produce two different action sequences. A deterministic runtime that reads its instructions from a TypeScript file checked into source control is the same audit shape as a SQL stored procedure. The reviewer reads the file, the reviewer signs off the file, the runtime executes the file. The model is gone by the time anyone in compliance opens the artifact.
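What such a checked-in file might look like. The eight field names are the ones the recording format uses (see the FAQ below); the `createWorkflow` signature and every value are invented for illustration:

```ts
// A sketch of the audit artifact: a workflow file a reviewer reads,
// diffs, and signs off like a stored procedure. The eight field names
// are the recording format's; createWorkflow's signature and every
// value below are invented for illustration.
declare function createWorkflow(name: string, steps: Array<Record<string, unknown>>): unknown;

export default createWorkflow("guidewire-fnol-entry", [
  {
    step_title: "Open the FNOL intake form",
    user_intent: "Start a first notice of loss for an existing policy",
    what_was_clicked: "New Claim button on the ClaimCenter toolbar",
    what_was_typed: "",
    target_element: { automationId: "btnNewClaim" },
    parent_window: "ClaimCenter - Desktop",
    screenshot_id: "shot_0001",
    side_effect_observed: "FNOL wizard opened on screen 1 of 3",
  },
  {
    step_title: "Enter the loss date",
    user_intent: "Record when the loss occurred",
    what_was_clicked: "Loss Date field on screen 1",
    what_was_typed: "2024-11-02",
    target_element: { automationId: "txtLossDate" },
    parent_window: "ClaimCenter - FNOL Wizard",
    screenshot_id: "shot_0002",
    side_effect_observed: "Date accepted, no validation popup",
  },
]);
```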
The form-fill loop in numbers
Counts from the Mediar monorepo: four strategies in focus_state.rs, zero LLM call sites in crates/executor (run a ripgrep over the crate to verify), eight semantic fields per step in the recording format, and a roughly 50 to 200 millisecond accessibility-tree walk per typed field on a typical Windows session.
Counterargument: when you actually want a layer-one tool
The honest case for a questionnaire-AI product over a system-of-record form-fill product is real and worth naming. If your problem is “a prospect just sent a 380-question SIG and we have to turn it around in five business days,” you do not need Windows automation. You need a model with access to your security knowledge base, an HTML integration into the prospect's questionnaire portal, and a workflow that lets a security analyst review the drafted answers before submission. Vanta AI, Drata, Inventive, Sprinto, and Responsive all do this well. The category they sit in is mature, the answers they produce are accurate when the knowledge base is good, and the time savings are large.
The mistake is buying one of those tools and expecting it to also fill the regulated form in your system of record. It will not, because the surface it integrates against simply does not exist on a Win32 app. The two layers want different products. A reasonable stack is one tool from each layer.
“We had a UiPath build that took five months for the patient registration flow in Epic. The recording approach captured the same flow in a day, and the difference at the field level is whether the runtime is reading an accessibility tree or matching pixels.”
Resolution: pick the tool that fits the form
So what is the honest answer to “what AI tool fills complex compliance forms?” It depends entirely on which form you mean. If the form is a questionnaire, a vendor security assessment, or a modern web RFP, pick a knowledge-base-backed answer-drafting product and let an analyst review the output. If the form lives inside Epic, Cerner, Fiserv, Jack Henry, Guidewire, Oracle EBS, SAP GUI, or any other Win32 app the regulator audits directly, pick a desktop automation runtime that talks to Windows UI Automation, and pick one whose runtime is deterministic so the workflow itself is the audit artifact.
A team that ignores the second layer ends up with very fast vendor questionnaire turnaround and a ten-person operations team still double-keying claims, registrations, or onboarding screens because their AI tool literally cannot reach those screens. A team that ignores the first layer ends up with a beautiful Epic automation and a backed-up queue of SIGs. Neither half on its own is a compliance-form strategy. The right answer is one tool from each layer, and the harder question, almost always, is the second one.
See a regulated form filled in your own environment
Bring one form (Epic, Fiserv, Guidewire, SAP GUI, or any Win32 app) and we will record it live, show the TypeScript file the AI emits, and run the deterministic replay against your test environment in the same call.
Frequently asked questions
What kinds of forms does this category mean by 'complex compliance forms'?
Two very different things. The first is the questionnaire layer at the start of a vendor relationship: SIG, CAIQ, NIST 800-171 assessments, bespoke security questionnaires, RFPs. Those are modern web forms, sometimes Excel attachments, with text answers a model can draft. The second is the regulated data-entry form inside the system of record: a CMS-1500 in Epic, a new-account onboarding screen in Fiserv DNA or Jack Henry SilverLake, a first notice of loss in Guidewire ClaimCenter, a vendor master record in SAP GUI. The same word 'form' covers both, but the tools, the failure modes, and the audit requirements are completely different.
Which of those does Mediar address?
The second one. Mediar is a Windows desktop automation platform. Its job is to fill the form inside the legacy app where the regulated record actually lives. The tooling for security questionnaires (Vanta AI, Drata, Inventive, Responsive, Sprinto) addresses the first layer and does it well. They are complementary, not competitive: the questionnaire bot answers the auditor's request, then the form-fill bot writes the corresponding control evidence into the system of record.
Why is filling a form in Epic harder than filling a Google Form?
Three reasons. First, there is no API for the screen flow: an Epic registration session involves 8 to 15 modal lookup dialogs, validation pop-ups, and field-level autocomplete that the documented APIs do not expose. Second, the controls are Win32 accessibility nodes, not HTML elements: there is no DOM, no querySelector, no MutationObserver. Third, the order matters: field B is gated by a save on field A, and a save can trigger a different popup depending on the patient class. A naive form-fill that ignores this state machine corrupts data in a way the audit team will eventually find.
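One way to picture the third reason, as a sketch rather than Epic's actual behavior (every name here is invented):

```ts
// Sketch of the state machine: field B is gated on a successful save of
// field A, and the save can raise a popup that depends on earlier state.
// Every name here is invented; this is not Epic's actual behavior.
type SaveOutcome = "saved" | "validation_popup" | "rejected";

declare function typeField(id: string, value: string): Promise<void>;
declare function saveAndObserve(): Promise<SaveOutcome>;
declare function dismissPopupFor(patientClass: string): Promise<void>;

async function enterGatedFields(patientClass: string): Promise<void> {
  await typeField("txtFieldA", "value-for-A");
  const outcome = await saveAndObserve(); // which popup appears depends on patientClass
  if (outcome === "validation_popup") {
    await dismissPopupFor(patientClass);
  } else if (outcome === "rejected") {
    // Typing into B now would corrupt the half-saved record.
    throw new Error("field A did not save; aborting before field B");
  }
  await typeField("txtFieldB", "value-for-B"); // safe only once A is confirmed saved
}
```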
How does Mediar actually fill a field in a Win32 app?
Through the Windows UI Automation accessibility framework. The runtime emits a single MCP tool call called `type_into_element`, defined in `apps/desktop/src-tauri/src/mcp_converter.rs` (lines 2298 and 2460). That tool takes a structured locator, finds the matching accessibility node in the live tree, and either invokes its EditPattern.SetValue or falls back to a Win32 keystroke simulation if the control does not implement EditPattern. There are no pixel coordinates and no image matching. The same primitive works in Epic, Cerner, Fiserv, Jack Henry, Oracle EBS, SAP GUI, and any AS/400 terminal emulator that exposes a Windows accessibility tree.
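As a sketch, a single call might carry a payload like this. The tool name is real; the argument names are assumptions mirroring the four locator strategies described in the next answer:

```ts
// A sketch of one type_into_element call. The tool name is real; the
// argument names are assumptions mirroring the four locator strategies
// described in the next answer.
const call = {
  tool: "type_into_element",
  arguments: {
    locator: {
      automation_id: "txtMemberId",         // strategy 1
      window_handle: 0x000a0b2c,            // strategy 2, with bounds
      bounds: { x: 412, y: 268, width: 220, height: 24 },
      visible_text: "Member ID",            // strategy 3
      parent_window: "Epic - Registration", // strategy 4
    },
    text: "MBR-0042-1187",
  },
};
```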
What happens when the field moves five pixels because Epic was upgraded?
The runtime walks a four-strategy match cascade in `apps/desktop/src-tauri/src/focus_state.rs`. It tries the recorded automation id first, then the window handle plus bounds, then the visible text content, then the parent window as a last fallback. Three of those four strategies do not depend on absolute position, so a routine UI tweak (a panel reorders, a button shifts down a row, the form gets a new tab) usually resolves through strategies one, two, or three. Only when all four miss does the step fail, at which point the runtime queues that one step for re-recording. None of this involves a model call at runtime.
Is there an LLM picking the next click while the workflow runs?
No, and this is the entire architectural bet. The production executor crate at `crates/executor` in github.com/mediar-ai/terminator has zero references to gemini, claude, openai, or any inference library. A reviewer can verify that with a single ripgrep. The model only runs once, during the offline recording-processing pass, where it reads the captured event stream and writes a TypeScript file. After that file is checked in, the runtime is deterministic. That is what lets a compliance team sign off on the workflow itself, the same way they would sign off on a SQL stored procedure.
What does 'complex' mean in 'complex compliance forms'?
In this category 'complex' usually means three things at once. Conditional fields (a field appears only if the answer to an earlier field is X). Cross-field validation (the form refuses to save until two fields match a regulatory rule). Multi-screen flow (the form is six screens with a save between each, and a back-navigation that loses the work in screen four). The questionnaire-layer AI tools handle the first kind well because LLMs are good at conditional text. The system-of-record forms have all three at once and the third is the one that breaks naive form-fill: an LLM that types into screen four without observing that screen three failed to save will overwrite the half-saved record.
How do compliance teams audit a workflow Mediar produced?
By reading the TypeScript file. Each step of the recorded workflow is emitted as a `createWorkflow` step with the eight-field semantic record (step_title, user_intent, what_was_clicked, what_was_typed, target_element, parent_window, screenshot_id, side_effect_observed). That file is the audit artifact. A reviewer can diff it like any other code, redline a specific step, run the runtime against a test environment, and capture a deterministic trace. If a step fails the four-strategy match, the runtime emits a failure record into the same trace; nothing is silently retried with a different element.
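Written out as a type, the eight-field record looks roughly like this (field names from the recording format; the value types are assumptions):

```ts
// The eight-field semantic record per step, written out as a type.
// Field names come from the recording format; the value types are
// assumptions.
interface WorkflowStep {
  step_title: string;            // human-readable name a reviewer can redline
  user_intent: string;           // why the step exists, in plain language
  what_was_clicked: string;      // the control, described, not a pixel
  what_was_typed: string;        // the literal value entered, if any
  target_element: object;        // structured locator for the four-strategy match
  parent_window: string;         // window title the step was recorded under
  screenshot_id: string;         // evidence image captured at record time
  side_effect_observed: string;  // what the app did after the action
}

// A failed four-strategy match lands in the same trace as everything
// else; nothing is retried against a different element.
interface StepFailure {
  step: WorkflowStep;
  outcome: "needs_rerecording";
}
```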
Which questionnaire-layer tools pair well with this kind of system-of-record automation?
Vanta AI, Drata, Sprinto, and Inventive AI all auto-draft answers to security questionnaires from a knowledge base. None of them write into the system of record. So the natural pairing is: questionnaire AI handles the SIG or CAIQ that a prospect sends, system-of-record automation handles the SOX, HIPAA, or NYDFS evidence-trail filings that have to be written back into Epic, Fiserv, or SAP. Treating the two as the same product is the most common category mistake teams make when shopping for this.
Is the form-fill runtime open source?
The execution layer is. The Terminator SDK that performs the UIA calls and the four-strategy element resolution is published as `terminator-rs` on crates.io and lives at github.com/mediar-ai/terminator under MIT. The MCP tools (`type_into_element`, `click_element`, `set_value`, etc.) are documented there. The orchestration layer, the cloud workflow runner, the recording pipeline, and the no-code builder are commercial. A team that wants to wire form-fill primitives into their own queue can build directly on Terminator without paying for the cloud product.
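A sketch of what wiring the form-fill primitives into your own queue might look like. The tool names appear in the repo; the `McpClient` wrapper and the queue shape are hypothetical:

```ts
// Sketch of wiring the open-source primitives into a work queue. The
// tool names appear in the repo; McpClient and the queue shape are
// hypothetical.
interface QueueItem { memberId: string; lossDate: string }

declare class McpClient {
  callTool(name: string, args: Record<string, unknown>): Promise<unknown>;
}

async function drainQueue(client: McpClient, items: QueueItem[]): Promise<void> {
  for (const item of items) {
    await client.callTool("type_into_element", {
      locator: { automation_id: "txtMemberId" },
      text: item.memberId,
    });
    await client.callTool("type_into_element", {
      locator: { automation_id: "txtLossDate" },
      text: item.lossDate,
    });
    await client.callTool("click_element", {
      locator: { automation_id: "btnSave" },
    });
  }
}
```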
More on the architecture
Adjacent reading
Where the AI in Mediar AI actually lives (and where it does not)
The model authors a workflow once during recording. The runtime is a Rust binary calling Windows accessibility APIs with zero LLM calls in the hot path.
Meaning of robotic process automation, word by word
How the phrase Blue Prism coined in 2003 ended up describing two incompatible architectures, and which one regulated industries can actually sign off on.
Inside the recording-to-replay primitive
The eight-field semantic record per step, the YAML accessibility-tree snapshot, and how a workflow becomes a deterministic TypeScript file.