Document Processing Agent for SharePoint AI people.

Document Processing Agent Preview

Copilot Studio’s Document Processing agent is a new Power Platform offering (currently in preview) that uses generative AI to automate end-to-end document workflows. It is designed as an autonomous agent within Copilot Studio: you install a managed “Document Processor” agent solution and configure it to watch for incoming documents. The agent then orchestrates a series of flows and AI Builder GPT-4 prompts to extract, validate, and export data, with optional human review.

How does it work?

The agent flows work like a state machine. When a new file arrives (for example, an email attachment), a trigger flow captures it, inserts a record into Dataverse with status = New, and notifies the agent to process it.
The agent’s instructions (GPT-based prompts) then invoke AI Builder actions. For extraction, the agent sends the document to a GPT-4 prompt, which “extracts all relevant information ___ as a JSON document”. The JSON is stored in Dataverse, and the status is set to Processed.
Next, for validation, a second GPT-4 prompt checks the JSON against formatting and business rules. If valid, the status becomes Validated; otherwise, it becomes Manual Review. Validated items trigger an export action that moves data into target apps.
If manual review is needed, the agent invokes a Power Apps Canvas “Validation Station” to let a human correct the data, then resumes export once approved. Throughout, the agent can interact by chat (Copilot interface) to accept manual document submissions or answer document-related queries.
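
The status flow described above can be sketched as a small state machine. This is only an illustration in Python, not the agent's actual implementation: the status values New, Processed, Validated, and Manual Review come from the description above, while the `Exported` status, the `TRANSITIONS` table, and the function names are hypothetical and introduced here for clarity.

```python
from enum import Enum


class Status(Enum):
    NEW = "New"
    PROCESSED = "Processed"
    VALIDATED = "Validated"
    MANUAL_REVIEW = "Manual Review"
    EXPORTED = "Exported"  # assumed terminal status; not named in the article


# Allowed moves between statuses, following the flow described above.
TRANSITIONS = {
    Status.NEW: {Status.PROCESSED},                               # extraction
    Status.PROCESSED: {Status.VALIDATED, Status.MANUAL_REVIEW},   # validation
    Status.VALIDATED: {Status.EXPORTED},                          # export
    Status.MANUAL_REVIEW: {Status.EXPORTED},  # assumption: approval resumes export
    Status.EXPORTED: set(),
}


def next_after_validation(is_valid: bool) -> Status:
    """Pick the status that follows the validation prompt."""
    return Status.VALIDATED if is_valid else Status.MANUAL_REVIEW


def advance(current: Status, target: Status) -> Status:
    """Return the target status if the move is allowed, else raise ValueError."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"Illegal transition: {current.value} -> {target.value}")
    return target


# A document that fails validation and is later approved by a human.
status = advance(Status.NEW, Status.PROCESSED)
status = advance(status, next_after_validation(is_valid=False))
status = advance(status, Status.EXPORTED)
```

Keeping the allowed moves in one table makes it easy to see that a document can never jump from New straight to export without passing through extraction and validation.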

Use cases

The Copilot agent is built for scenarios like those Syntex targets, but with a generative twist:

Invoice and receipt processing: Users can send or upload invoices; the agent will auto-extract fields (vendor, amounts, line items) via GPT-4 prompts. Unlike Syntex’s template-based method, no rigid model training is required – GPT handles layout variations in one go.
Contract management: The agent can parse contracts to extract key clauses or metadata (parties, dates, terms) using LLM understanding. It can even answer questions about contract content in Copilot chat. Preconfigured flows and a validation UI enable deployment of an end-to-end contract workflow with minimal low-code effort.
General document workflows: Copilot’s agent can integrate multiple sources (email, SharePoint, OneDrive) and output to various sinks (Dataverse, Dynamics 365, Teams) in one pipeline. It “seamlessly processes invoices, contracts, and receipts” and more. By leveraging AI, it turns traditionally manual tasks into automated ones (categorizing, extracting, and routing documents) without having to build each flow from scratch.

Licensing & prices

Licensing is credit-based, not per-user: The core agent execution capacity is licensed via Copilot Credits, which are pooled across the entire tenant rather than assigned to each end user who interacts with the agent. (However, a Copilot Studio User License is required for makers who create and manage agents.)

Acquisition Methods: You acquire Copilot Credits via prepaid packs (or subscription capacity) or through a pay-as-you-go meter (which requires an Azure subscription).

Prepaid Pack Size and Price: The common prepaid subscription pack contains 25,000 Copilot Credits and is priced at approximately $200/pack/month (paid yearly).

There is also a pay-as-you-go option: you can use the agent and pay for the credits you consume each month without an upfront commitment. In short, Copilot’s cost model is usage-based, like Syntex’s new model, but measured in credits. (Note: Microsoft 365 Copilot user licenses (≈$30/user/month) include some limited Copilot Studio capability.) Copilot agents require a Power Platform environment with Dataverse (which includes default capacity) and any necessary connectors (e.g., SharePoint, Outlook).
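
To get a feel for the prepaid numbers, here is a rough back-of-the-envelope estimate in Python. The pack size and price are the figures quoted above, and the 2 credits per call is the figure quoted in the Drawbacks section. The number of calls per document is an assumption (one extraction prompt plus one validation prompt), and the sketch ignores any AI Builder credit consumption and anything covered by existing licenses. At these figures a credit works out to roughly $0.008.

```python
import math

PACK_CREDITS = 25_000     # credits per prepaid pack (figure from this article)
PACK_PRICE_USD = 200.0    # approximate price per pack per month (from this article)
CREDITS_PER_CALL = 2      # per-call figure quoted under Drawbacks; may change
CALLS_PER_DOCUMENT = 2    # assumption: one extraction call + one validation call


def estimate_monthly_packs(documents_per_month: int) -> dict:
    """Estimate credits, whole packs, and cost for a monthly document volume."""
    credits = documents_per_month * CALLS_PER_DOCUMENT * CREDITS_PER_CALL
    packs = math.ceil(credits / PACK_CREDITS)  # packs are bought whole
    return {"credits": credits, "packs": packs, "cost_usd": packs * PACK_PRICE_USD}


print(estimate_monthly_packs(10_000))
# prints {'credits': 40000, 'packs': 2, 'cost_usd': 400.0}
```

Treat the result as an order-of-magnitude guide only, since preview pricing and per-feature credit rates can change.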

Great feedback came in from LinkedIn about credits when you already have a Copilot license. Here is my answer:

1) Use within M365 Apps: Interactions by a user with an M365 Copilot license are not billed against your organization’s Copilot Credits pool. The user’s M365 Copilot license covers this usage.

2) Use on External Channels (e.g., website, Facebook, WhatsApp): All interactions, regardless of the end-user’s M365 license status, consume Copilot Credits from your prepaid pack or PAYG meter.
In short, you only need to purchase credits (prepaid or PAYG) for external-facing agents or when using certain advanced, high-credit-cost features.

Please take this with a pinch of salt, as it’s in preview, but that was my understanding. Licensing with Power Platform is always tricky.

Advantages

Generative AI extraction: By using GPT-4 via AI Builder, the agent can handle highly variable documents with little setup. You often only need to upload one sample document during configuration – the model learns from the prompt and requires no explicit field labeling. This beats Syntex for unpredictable layouts.

End-to-end automation: The agent is a turnkey solution. It includes prebuilt flows (triggers from email/SharePoint, extraction, validation, export) and a user-friendly “Validation Station” for manual checks. You don’t need to build each step separately.

Broad integration: Because it runs in Copilot Studio, the agent can monitor many inputs (mail, Teams chats, OneDrive, SharePoint) and output to any Dataverse table or system. It also offers a chat interface (Copilot) for on-demand document processing via conversation.

AI Builder + LLM power: Combines Azure’s large language models (via AI Builder GPT prompts) with Power Platform’s reliability. It can, for instance, not only extract values but also validate business logic in context.

Rapid deployment (preview feature): Microsoft provides a configuration wizard – upload a sample, define schema/rules via UI, and the agent is ready.

Drawbacks

Preview limitations: Currently, the Document Processor agent is in public preview and requires a special “preview features” environment. It’s not yet generally available, and features may change. Many features are not working yet (explained later).

New licensing model: Organizations must invest in Copilot Studio capacity (credit packs or meter). Even if you have SharePoint AI Document Processing (former Syntex/Premium), you’ll need Copilot Credits to run this agent. The dual charging (Copilot credits and any AI Builder credit consumption) can be complex.

Requires Power Platform: You need a Dataverse environment and Copilot Studio access. It’s heavier than a simple SharePoint add-on. You also need to ensure Power Platform premium resources (and maybe premium connectors) are licensed.

Less transparent processing: While flexible, GPT-based extraction can sometimes “hallucinate” or miss context if not carefully validated. Reliance on AI also means you pay for each API call (2 credits per call).

Extractors: Based on my testing, extractors cannot be edited; you must rely solely on the fields suggested by the model.

Setup

Go to copilotstudio.microsoft.com, sign in, and select Create. You’ll see the Document Processor (Preview) Managed Solution listed there—click on it to launch the wizard and begin installing the Agent.

On the next page, the wizard prompts you to upload a document for analysis and information extraction. SharePoint AI specialists will notice that this process is quite like the SharePoint Freeform and Layout AI Models.

However, one key limitation I observed is that the Document Processing Agent doesn’t handle OCR very well. It does work to a degree, but as shown in the image, the “OK” text wasn’t captured in any extracted column. In the advanced settings, you can modify the extraction instructions, but in my tests the missing text was still not extracted, regardless of the adjustments.

On the next screen, you can define as many rules as necessary to validate the extracted data. You can turn rules on or off, edit existing ones, or create new ones. Since my company is based in the EU, I adjusted the currency format to follow the European number style.
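
To illustrate what a “European number style” rule has to accept, here is a minimal Python sketch. The wizard rules themselves are configured through the UI, so this code is not something the agent exposes; the pattern and the function name are assumptions made for illustration, and the sketch assumes the common convention of a dot as thousands separator and a comma before two decimal digits.

```python
import re
from decimal import Decimal

# Accepts grouped ("1.234,56") or ungrouped ("1234,56") amounts with
# a comma as the decimal separator and exactly two decimal digits.
EU_AMOUNT = re.compile(r"(\d{1,3}(\.\d{3})*|\d+),\d{2}")


def parse_eu_amount(text: str) -> Decimal:
    """Convert a European-style amount to Decimal; raise ValueError otherwise."""
    cleaned = text.strip()
    if not EU_AMOUNT.fullmatch(cleaned):
        raise ValueError(f"Not a European-style amount: {text!r}")
    # Drop thousands separators, then switch the decimal comma to a dot.
    return Decimal(cleaned.replace(".", "").replace(",", "."))


print(parse_eu_amount("1.234,56"))  # prints 1234.56
```

A value such as "1,234.56" (US style) is rejected, which is the kind of mismatch a validation rule like this would route to manual review.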

Assign the approver(s) and add a message for the email notification. You can also switch the Approvals Connector directly from this screen.

Finally, the last configuration page in the wizard focuses on setting up the data sources. As of now, only two options are functional:

When a new email arrives in a shared mailbox
When a new email arrives

The SharePoint option (when an item is created) is currently disabled in the wizard and cannot be used. This means that, for the time being, Exchange-based triggers are the only data source options available directly from the wizard. Keep in mind that this may change in future updates as Microsoft expands Copilot Studio’s capabilities.

Once you enter your email and display name, the wizard is complete, and you can start using your agent based on the configuration you set up. Honestly, creating the agent follows a simple “next-next-finish” approach—nothing complex is required. That said, additional options and settings become available once you access the agent pane.

I’m not a Copilot Studio expert and don’t claim to be one, so this guide might not cover every single detail of agent creation and management. However, as a SharePoint AI professional, this is essentially everything you need to know for now. It’s straightforward, simple, and easy to get started!

The first thing to do once your agent is deployed? Testing, of course! 😊

Click the Test button and upload an invoice to see the document being processed. It may take a few seconds, but all the values will be extracted automatically. I was genuinely impressed by the speed, accuracy, and overall quality of the results.

That said, a quick note: I tested this using structured data from a large, multibillion-dollar company like Uber. I haven’t tried it yet with unstructured content, such as a homemade invoice—but that’s a story for another article.

One thing I’ve been curious about is where the data is actually stored. With a SharePoint background, you’d expect all data to live in a SharePoint Document Library—but in Copilot Studio, that’s not the case.

It turns out there’s a table called “Document Processing Agent Document Field” that’s automatically created for you. All the extractors are added as columns in this table. By clicking the “+” sign, you can add existing columns to your view and see the extracted values directly.

This effectively opens a Pandora’s box of possibilities, because every piece of data stored in Dataverse is not siloed—it becomes a fully accessible, structured entity that can serve as a foundation for virtually any other component within the Power Platform ecosystem.

For example:

Power Apps: You can build custom apps that directly read, display, or manipulate the extracted data. The columns representing your extractors in the “Document Processing Agent Document Field” table can be used in galleries, forms, and dashboards.
Power Automate: Flows can be triggered based on changes or additions to these records. For instance, once an invoice is processed and stored in Dataverse, you could automatically route it for approval, notify finance teams, or integrate it with external systems via connectors.
Power BI: The data can be modelled and visualized for analytics. You can generate reports on extraction accuracy, volume of processed documents, or trends over time.
Dataverse Relationships: Since Dataverse supports relationships between tables, you could link your processed documents to customers, projects, or other business entities, enabling richer contextual insights.
AI and Automations: Any AI model or Copilot integration within the platform can leverage this structured data for downstream tasks, like predictive analysis, anomaly detection, or automating repetitive tasks.

In short, once your document data resides in Dataverse, it becomes a centralized, standardized source of truth, ready to power apps, flows, reports, and intelligent automation across the entire Power Platform. This transforms simple document extraction into a strategic enabler for enterprise-wide workflows and analytics.
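
Outside the Power Platform tools, the same records can also be read through the Dataverse Web API. The sketch below only builds the request and does not send it. The `/api/data/v9.2/` path and the OData headers follow the standard Dataverse Web API conventions, but the organization URL, the entity set name, and the column names are placeholders: the real logical names of the “Document Processing Agent Document Field” table are not given in this article and must be looked up in your environment.

```python
from urllib.parse import quote
from urllib.request import Request


def build_fields_request(org_url: str, entity_set: str, columns: list[str],
                         token: str, top: int = 10) -> Request:
    """Build (but do not send) a Dataverse Web API GET request."""
    select = quote(",".join(columns), safe=",")
    url = (f"{org_url.rstrip('/')}/api/data/v9.2/{entity_set}"
           f"?$select={select}&$top={top}")
    return Request(url, headers={
        "Authorization": f"Bearer {token}",  # token from Microsoft Entra ID
        "Accept": "application/json",
        "OData-MaxVersion": "4.0",
        "OData-Version": "4.0",
    })


# Placeholder values: replace with your environment URL and real logical names.
request = build_fields_request(
    "https://contoso.crm.dynamics.com",
    "hypothetical_documentfields",
    ["hypothetical_vendor", "hypothetical_total"],
    token="<access-token>",
)
print(request.full_url)
```

Passing the request to `urllib.request.urlopen` would return the rows as JSON, assuming the token is valid and the names match your environment.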

Teams and SharePoint Integration

Like any other agent, this one can be published across multiple channels—ranging from websites to SharePoint, WhatsApp, Slack, or even Facebook. If you choose to stay within the Microsoft 365 ecosystem, you can publish it to Teams as well. Doing so brings all the agent’s benefits directly into Teams: meeting summaries and transcripts, alerts for open issues or unresolved questions, and enhanced collaboration across your organization.

Once deployed, you can click “See agent in Teams” to open it directly in Teams. In my case, a new chat was created with the agent’s name. When I uploaded an invoice, the extracted values appeared almost instantly; the upload was at 13:15, and the extraction occurred at 13:15 as well. Pretty amazing!

I went further and added a new trigger: When a new channel message is added.

However, no matter what I tried, the flow didn’t accept the uploaded invoice—likely because this feature is still in preview. I’ll update this section of the article if new functionality is released or if my tenant suddenly decides to cooperate 😊.

What Is the Validation Station in Power Apps for the Document Processing Agent?

In simple terms, the Validation Station is the human review interface inside Power Apps that lets you check, correct, and confirm the data extracted from documents by an AI Builder Document Processing model (formerly known as Form Processing).

When you upload a document — say, an invoice or a purchase order — the AI model extracts key information like amounts, dates, and reference numbers. But AI isn’t perfect, and that’s where the Validation Station steps in. It allows a human reviewer to:

See the original document (PDF or image)
Review the data the AI extracted
Correct any errors or missing values
Approve or reject the extracted data before it moves further down the pipeline

In other words, it’s the “human-in-the-loop” safety net that ensures data accuracy before automation takes over.
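
Conceptually, the review step overlays a person's corrections on top of the AI-extracted values and records a decision. The Python sketch below illustrates that idea only; the Validation Station is a Power Apps canvas app, and the function name, the field names, and the "Approved"/"Rejected" labels here are hypothetical.

```python
def apply_review(extracted: dict, corrections: dict, approved: bool) -> dict:
    """Overlay a reviewer's corrections on the AI-extracted fields."""
    # Track which fields the reviewer actually changed.
    changed = sorted(k for k, v in corrections.items() if extracted.get(k) != v)
    return {
        "fields": {**extracted, **corrections},  # corrections win over AI values
        "changed": changed,
        "status": "Approved" if approved else "Rejected",
    }


# Example: the reviewer fixes a misread total (letter O instead of zero).
result = apply_review(
    extracted={"vendor": "Uber", "total": "12,5O"},
    corrections={"total": "12,50"},
    approved=True,
)
print(result["fields"]["total"], result["changed"], result["status"])
# prints 12,50 ['total'] Approved
```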

How It Fits into the Document Processing Flow

Here’s what a typical workflow looks like:

Document upload: A user (or Power Automate flow) uploads a file to be processed.
AI extraction: The Document Processing Agent extracts structured data using your AI Builder model.
Validation: The extracted data is passed to the Validation Station screen in Power Apps.
Human review: A person validates or corrects the data as needed.
Data submission: Once validated, the clean data is stored — often in Dataverse, SharePoint, or sent off for approval in another system.

This ensures every piece of data entering your business processes has been verified by either AI or a human reviewer — ideally, both.

Why It Matters

AI models are powerful, but in real-world scenarios — such as poor image quality, handwritten text, or unusual document layouts — they can misread details. The Validation Station bridges that gap by giving humans a friendly, visual interface to fine-tune the AI’s output.

This not only improves data accuracy but also creates a feedback loop. The corrections made in the Validation Station can help retrain or refine your model over time, making it smarter with each iteration.

The Validation Station is in the deployed solution under Apps, and as shown in the following image, you can either approve or reject the outcome.

As far as I understand it today, it only works with emails, even if you deploy the agent to SharePoint, where all SharePoint users can use agents natively to boost their productivity. As it’s in preview, I assume the SharePoint path simply doesn’t work yet.

Conclusion

The Document Processing Agent represents a significant leap forward in Microsoft’s vision for AI-powered document automation within the Power Platform. Even in its preview form, it showcases how far we’ve come from traditional manual extraction to an integrated, intelligent workflow that ties together Copilot Studio, AI Builder, Dataverse, and Power Automate.

Yes, there are still some limitations — OCR support is not optimal, SharePoint triggers aren’t fully functional yet, and the current data sources rely mostly on Exchange. But the foundation is solid, and it’s clear that Microsoft is building toward a future where every business document can be understood, processed, and validated seamlessly across the Microsoft 365 ecosystem.

Once data lands in Dataverse, it becomes so much more than text — it’s a structured, actionable asset. It can power Power Apps dashboards, trigger Power Automate flows, feed Power BI analytics, or even serve as context for Copilot interactions. Add the Validation Station, and you now have a robust “human-in-the-loop” safety net that ensures quality and compliance before automation takes over.

In short, the Document Processing Agent isn’t just another preview feature — it’s a glimpse into the future of enterprise AI within Microsoft 365. For now, it’s simple, intuitive, and already useful. In a few iterations, it might just redefine how organizations handle documents altogether.

So go ahead — test, tweak, and experiment. The magic happens when automation meets accuracy, and the Power Platform is getting remarkably close to perfecting that balance.

Hope this helps. Next up: a comparison between Syntex, AI Builder, and the Document Processing agent.

Renewed Revolution!

Gokan's Studio