Overview

Staff-side bulk intake replaces single-document uploads for high-volume scenarios — mailed receipt packets, faxed treatment plans, scanner feeds, and ZIP archives. The feature introduces the InDocumentSet WorkItem as a batch container that orchestrates splitting, OCR, classification, field extraction, and routing to the correct target WorkItem (VcClaim, FrClaim, ExpenseBill, etc.).

Gated by tenant feature flag. Bulk intake is enabled per-tenant via App.InDocumentSet.EnableBulkUpload. When disabled, documents go through the legacy single-file intake path.

End-to-End Flow

From upload to close, every document traverses the orchestrator — per-doc stage status is written to RoutingResultJson so the batch workspace reflects real-time progress. Terminal states are Closed (all docs resolved), Failed (set-level exhaustion), or ReviewRequired (awaiting reviewer action).

[Figure: end-to-end flow]
Upload sources: Files / Multi-select (PDF, TIFF, PNG, DOCX); ZIP Archive (server-side extract); Scanner Feed (adapter interface); Email / Fax (future).
Intake: InDocumentSet created (state: Received); child InDocuments attached; orchestrator enqueued.
Orchestrator pipeline:
  Split: Separator / AI / Hybrid → PageRange[]
  OCR: Azure DI, text + layout
  Classify: Type + Category, confidence 0–100
  Extract: QueryFields / Regex / CustomModel / Layered
  Route: DSL lookup + score vs threshold
Decision: top score ≥ threshold and unique? (rule: AutoLinkScoreThreshold, RequireUniqueMatchToAutoLink)
  Yes, unique → Auto-linked: InDocument → Processed; FK set to target WorkItem
  No / ambiguous / low confidence → Review Required: ranked candidates surfaced; reviewer approves / picks alt / manual-links / rejects
  Retry exhausted → Failed: stage retries exhausted; reason persisted on set
End: Set Closed (all docs resolved)
End-to-end: upload source → orchestrator stages → auto-link or review → Set Closed. Dashed line shows failure exit from any stage after retry exhaustion.
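
The auto-link decision in the flow can be sketched as follows. This is a minimal illustration, not the product's implementation: the Candidate type, the decide_route function, and the reading of "unique" as "exactly one candidate at or above the threshold" are assumptions; only AutoLinkScoreThreshold and RequireUniqueMatchToAutoLink come from the flow above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    work_item_id: str  # hypothetical identifier of a target WorkItem
    score: int         # hypothetical match score; the real scale is not stated here


def decide_route(candidates: List[Candidate],
                 auto_link_score_threshold: int,
                 require_unique_match: bool) -> Optional[str]:
    """Return the WorkItem id to auto-link, or None to send the document to review."""
    # Keep only candidates whose score is at or above the threshold.
    qualifying = [c for c in candidates if c.score >= auto_link_score_threshold]
    if not qualifying:
        return None  # no match or low confidence: Review Required
    if require_unique_match and len(qualifying) > 1:
        return None  # ambiguous (assumed meaning of "unique"): Review Required
    # Otherwise link to the highest-scoring qualifying candidate.
    return max(qualifying, key=lambda c: c.score).work_item_id
```

Under these assumptions, a single candidate at the threshold auto-links, while two candidates above it fall back to review whenever the uniqueness requirement is on.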

What You Can Upload

The Pipeline

Each uploaded set enters a Hangfire-backed orchestrator that drives it through five states: the starting Received state, then Splitting, Classifying, Extracting, and Routing. The batch workspace shows progress in real time.

Received → Splitting → Classifying → Extracting → Routing →
  (ReviewRequired | Closed | Failed)
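
One way to read the state flow above is as a transition table. The sketch below is illustrative only: SetState, ALLOWED_TRANSITIONS, and advance are hypothetical names. Allowing Failed from every processing state follows the figure caption ("failure exit from any stage"). Reviewer resolution and reprocessing are not modeled.

```python
from enum import Enum
from typing import Dict, FrozenSet


class SetState(Enum):
    RECEIVED = "Received"
    SPLITTING = "Splitting"
    CLASSIFYING = "Classifying"
    EXTRACTING = "Extracting"
    ROUTING = "Routing"
    REVIEW_REQUIRED = "ReviewRequired"
    CLOSED = "Closed"
    FAILED = "Failed"


# Forward arrows from the flow line, plus a Failed exit from each processing state.
ALLOWED_TRANSITIONS: Dict[SetState, FrozenSet[SetState]] = {
    SetState.RECEIVED: frozenset({SetState.SPLITTING}),
    SetState.SPLITTING: frozenset({SetState.CLASSIFYING, SetState.FAILED}),
    SetState.CLASSIFYING: frozenset({SetState.EXTRACTING, SetState.FAILED}),
    SetState.EXTRACTING: frozenset({SetState.ROUTING, SetState.FAILED}),
    SetState.ROUTING: frozenset(
        {SetState.REVIEW_REQUIRED, SetState.CLOSED, SetState.FAILED}
    ),
    # End states of the flow line; no outgoing transitions in this sketch.
    SetState.REVIEW_REQUIRED: frozenset(),
    SetState.CLOSED: frozenset(),
    SetState.FAILED: frozenset(),
}


def advance(current: SetState, target: SetState) -> SetState:
    """Return the new state, or raise ValueError if the move is not allowed."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"Illegal transition: {current.value} -> {target.value}")
    return target
```
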
Stage | What It Does | Configurable
Split | Breaks multi-page PDFs into sub-documents. Strategies: Separator Page, AI Boundary Detection, Hybrid (AI first, separator fallback), None. | Per-tenant default + per-rule override.
OCR | Text extraction via Azure Document Intelligence. Skipped when AI is off. | Gated by App.InDocumentOCR feature.
Classify | Assigns InDocumentType with a 0–100 confidence. (v1: the AI classifier does not populate InDocumentCategory; reviewers pick the subcategory in the UI when needed.) | See the Document Classification page.
Extract | Pulls identifier fields (claim #, victim name, DOB, service date, provider TIN). Strategies: QueryFields, CustomModel, Regex, Layered, None. | Per-rule ExtractionStrategy + ExtractionConfigJson.
Route | Looks up candidate target WorkItems using a DSL query; ranks by score; auto-links only on unambiguous unique match above threshold. | Per-rule identifier keys + thresholds + RequireUniqueMatchToAutoLink.
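
The Hybrid split strategy ("AI first, separator fallback") could look roughly like the sketch below. Everything here is assumed for illustration: the function names, the 1-based inclusive PageRange tuples, the AI detector being a callable that returns sub-document start pages, the fallback firing when the detector raises or returns nothing, and separator pages being dropped from the output.

```python
from typing import Callable, List, Optional, Sequence, Tuple

PageRange = Tuple[int, int]  # (first_page, last_page), 1-based and inclusive


def ranges_from_starts(starts: Sequence[int], page_count: int) -> List[PageRange]:
    """Turn sub-document start pages into inclusive page ranges."""
    # Page 1 always starts a sub-document; ignore out-of-range starts.
    valid = sorted(s for s in set(starts) | {1} if 1 <= s <= page_count)
    ends = [s - 1 for s in valid[1:]] + [page_count]
    return list(zip(valid, ends))


def split_by_separator(pages: Sequence[str],
                       is_separator: Callable[[str], bool]) -> List[PageRange]:
    """A separator page closes the current sub-document and is itself dropped."""
    ranges: List[PageRange] = []
    start: Optional[int] = None
    for number, text in enumerate(pages, start=1):
        if is_separator(text):
            if start is not None:
                ranges.append((start, number - 1))
            start = None
        elif start is None:
            start = number
    if start is not None:
        ranges.append((start, len(pages)))
    return ranges


def hybrid_split(pages: Sequence[str],
                 ai_detect_starts: Callable[[Sequence[str]], Optional[List[int]]],
                 is_separator: Callable[[str], bool]) -> List[PageRange]:
    """Try AI boundary detection first; fall back to separator pages."""
    try:
        starts = ai_detect_starts(pages)
    except Exception:
        starts = None  # treat a detector failure as "no result"
    if starts:
        return ranges_from_starts(starts, len(pages))
    return split_by_separator(pages, is_separator)
```

For example, with pages `["a1", "a2", "SEP", "b1"]` and a detector that returns nothing, the separator fallback yields the ranges (1, 2) and (4, 4).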

Review Workflow

Documents that don't auto-link land in a reviewer queue with a ranked candidate list. Two layout modes:

Each review surfaces a matched-rule chip so reviewers see which routing rule triggered the extraction. Actions: approve the top candidate, pick a different candidate, manual-link to any WorkItem, reject, or reprocess.

No-Match Behavior

When no candidate passes the threshold, the rule's NoMatchBehavior decides what happens:

Reprocess

Any document or whole set can be reprocessed — prior routing results snapshot into an append-only history on RoutingResultJson, so you can see how classification and scoring evolved across attempts. Useful when rules are updated or AI features are toggled.
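
The append-only history can be pictured with the sketch below. The JSON shape ("current", "history", "attempt") and the record_routing_result helper are assumptions made for illustration; the actual layout of RoutingResultJson is not described here.

```python
import json
from typing import Any, Dict, Optional


def record_routing_result(routing_result_json: Optional[str],
                          new_result: Dict[str, Any]) -> str:
    """Store a new routing result, snapshotting the previous one into history."""
    if routing_result_json:
        doc = json.loads(routing_result_json)
    else:
        doc = {"current": None, "history": []}  # hypothetical initial shape

    # Append-only: earlier history entries are never edited or removed.
    if doc["current"] is not None:
        doc["history"].append(doc["current"])

    stamped = dict(new_result)
    stamped["attempt"] = len(doc["history"]) + 1
    doc["current"] = stamped
    return json.dumps(doc)


# Usage: a first pass, then a reprocess after a rule change.
first = record_routing_result(None, {"type": "Receipt", "topScore": 62})
second = record_routing_result(first, {"type": "Receipt", "topScore": 91})
```

After the second call, "current" holds the latest attempt and "history" holds the first, so a reviewer can compare how the score changed between attempts.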

Reliability & AI Off

Permissions

VBO-only, granular:

Metrics Dashboard

Six widgets at /app/main/vcp/bulk-intake/metrics: