W05 Paperwork Workflow: Agentic file workflow

Email Attachment Intake

A versioning workflow with email context, revised invoice attachments, old references, and non-invoice screenshots.

Transparency boundary

This page shows the files the model can inspect. It does not show the hidden oracle solution.

Excluded: ground_truth.json, expected_artifacts.json, manual readings, and calibration notes.

What This Tests

superseded invoice, revised attachment, old payment reference, proforma distractor, chat screenshot as evidence

Files Shown

4 generated image files and 6 text/CSV/task files.

All files are synthetic benchmark material.

Generated Images

Scans and visual attachments visible to the model
incoming/attachments/IMG_2048.png
incoming/attachments/chat_hint.png
incoming/attachments/doc-7.png
incoming/attachments/invoice_new.png

Task And Source Files

No oracle solution files
incoming/README_TASK.md
# Paperwork Workflow Case 05: Email Attachment Intake

Case ID: P3-WORK-05

You are auditing a synthetic email attachment intake folder. The folder contains generated document images, a thread note, payment data, vendor data, and a purchase order list.

Create these final artifacts:
- `work/attachment_index.json`
- `work/final_document_set.json`
- `work/normalized/INV-2204-R1.txt`
- `audit_result.json`
- `proof.txt`

Rules:
- Identify the active final document set.
- Use the email thread and the chat screenshot as evidence.
- Ignore superseded invoices, proforma documents, and hint screenshots as payment documents.
- Do not modify anything in `incoming/`.
- Return no advice. This is benchmark bookkeeping, not financial, tax, or legal advice.
- `proof.txt` must contain only the final numeric `proof_code`.

Final document logic:
- The first `INV-2204` attachment is superseded because the thread and chat hint say it had the wrong VAT.
- The revised attachment from May 8 is the final invoice: `INV-2204-R1`.
- `PF-2205` is a proforma invoice and is not a payment invoice.
- The bank payment may still use the old reference `INV-2204`; map it to `INV-2204-R1` only when the revised gross amount matches.
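
The mapping rule in the last bullet can be sketched as a small Python helper. This is only an illustration: the function name, the `REVISION_MAP` table, and the choice to compare absolute amounts (because `payment_export.csv` records outgoing payments as negative values) are assumptions, not part of the task files. It covers only the old-reference case.

```python
from typing import Optional

# Hypothetical lookup table: old payment reference -> revised (final) invoice ID.
REVISION_MAP = {"INV-2204": "INV-2204-R1"}


def map_payment_reference(reference: str, paid_cents: int,
                          revised_gross_cents: int) -> Optional[str]:
    """Return the final invoice ID a payment maps to, or None if it does not map."""
    final_id = REVISION_MAP.get(reference)
    if final_id is None:
        # The reference is not a known superseded invoice number.
        return None
    # Assumption: the export stores outgoing payments as negative cents,
    # so the magnitude is compared against the revised gross total.
    if abs(paid_cents) != revised_gross_cents:
        return None
    return final_id
```

The revised gross total has to be read from the revised invoice image; the helper takes it as an argument and does not assume any particular value.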

Use these document IDs for the attachment index and ignored-document lists:
- old invoice image: `INV-2204`
- revised invoice image: `INV-2204-R1`
- proforma image: `PF-2205`
- chat screenshot: `CHAT-MAY-08`

`work/attachment_index.json` schema:

```json
{
  "case_id": "P3-WORK-05",
  "attachments": [
    {
      "attachment_path": "",
      "document_id": "",
      "document_type": "",
      "decision": ""
    }
  ]
}
```

Allowed `document_type` values:
- `invoice`
- `proforma`
- `chat_hint`

Allowed `decision` values:
- `superseded`
- `final`
- `ignored`
- `evidence_only`

`work/final_document_set.json` schema:

```json
{
  "case_id": "P3-WORK-05",
  "final_invoice_ids": [],
  "superseded_invoice_ids": [],
  "ignored_document_ids": [],
  "payment_mapped_from": "",
  "payment_mapped_to": ""
}
```

The normalized invoice file must use exactly these eight lines:

```text
invoice_id=...
replaces_invoice_id=...
vendor_id=...
vendor_name=...
po_id=...
gross_total_cents=...
payment_reference=...
payment_match=...
```

`payment_match` must be exactly `true` or `false`.
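
A minimal sketch of rendering the eight-line file in the required key order follows. The function name is hypothetical, and ending the file with a single trailing newline is an assumption that the specification does not state.

```python
FIELD_ORDER = [
    "invoice_id",
    "replaces_invoice_id",
    "vendor_id",
    "vendor_name",
    "po_id",
    "gross_total_cents",
    "payment_reference",
    "payment_match",
]


def render_normalized_invoice(fields: dict) -> str:
    """Render exactly eight key=value lines in the order required by the task."""
    if set(fields) != set(FIELD_ORDER):
        raise ValueError("fields must contain exactly the eight required keys")
    lines = []
    for key in FIELD_ORDER:
        value = fields[key]
        if key == "payment_match":
            if not isinstance(value, bool):
                raise ValueError("payment_match must be a bool")
            # The spec requires the lowercase literals true / false.
            value = "true" if value else "false"
        lines.append(f"{key}={value}")
    # Assumption: one trailing newline after the eighth line.
    return "\n".join(lines) + "\n"
```

Passing `payment_match` as a Python `bool` avoids accidentally writing `True` or `False` with a capital letter.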

`audit_result.json` must contain exactly these keys:
- case_id
- approved_invoice_ids
- review_invoice_ids
- reject_invoice_ids
- ignored_document_ids
- total_approved_gross_cents
- warnings_by_invoice
- evidence
- proof_code

`warnings_by_invoice` must include every final real invoice ID. Use an empty array when an invoice has no warnings.

Allowed warning codes:
- payment_missing
- payment_amount_mismatch
- inactive_vendor
- missing_po
- superseded_invoice
- non_payment_document

Approval rules:
- Approve only final invoices from active vendors with an open matching PO and an exact payment match.
- Put final invoices with missing payment, amount mismatch, missing PO, or inactive vendor into review.
- Do not approve superseded invoices or proforma documents.
- `ignored_document_ids` must include superseded invoices, proforma documents, and evidence-only screenshots.
- `total_approved_gross_cents` is the sum of approved final invoice gross totals only.
- `evidence` must list the files used to decide the final document set and payment mapping, with paths relative to the workspace.
- In `audit_result.json`, `evidence` should list only files that support the approved final invoice and payment mapping. Do not list `README_TASK.md` as evidence, and do not list every attachment just because it exists.
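
The approve-or-review decision for a final invoice can be sketched as below. The function name and its parameters are hypothetical. Treating any PO status other than `open` (including no PO at all, passed as `None`) as `missing_po` is an assumption drawn from the "open matching PO" wording. The sketch covers final invoices only; superseded and proforma documents never reach it.

```python
from typing import List, Optional, Tuple


def classify_final_invoice(vendor_status: str, po_status: Optional[str],
                           payment_found: bool,
                           amount_matches: bool) -> Tuple[str, List[str]]:
    """Return ("approved" | "review", warning codes) for one final invoice."""
    warnings = []
    if vendor_status != "active":
        warnings.append("inactive_vendor")
    # Assumption: a PO that is absent or not open counts as missing.
    if po_status != "open":
        warnings.append("missing_po")
    if not payment_found:
        warnings.append("payment_missing")
    elif not amount_matches:
        warnings.append("payment_amount_mismatch")
    bucket = "approved" if not warnings else "review"
    return bucket, warnings
```

The returned warning list can be stored directly under the invoice ID in `warnings_by_invoice`; an approved invoice yields the required empty array.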

Proof code formula:

`proof_code = total_approved_gross_cents + numeric_token_for_final_invoice_ids + 97 * ignored_document_count + 503 * payment_revision_mapping_count`

For this case, the numeric token for `INV-2204-R1` is `22041`.

`payment_revision_mapping_count` is `1` when a payment with old reference `INV-2204` is correctly mapped to final invoice `INV-2204-R1`; otherwise it is `0`.
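
The formula translates directly into a short Python function. The function name is hypothetical, and the input values in the usage note below are illustrative only; they are not taken from this case.

```python
def proof_code(total_approved_gross_cents: int,
               final_invoice_token: int,
               ignored_document_count: int,
               payment_revision_mapping_count: int) -> int:
    """Compute the proof code from the four terms of the README formula."""
    return (
        total_approved_gross_cents
        + final_invoice_token
        + 97 * ignored_document_count
        + 503 * payment_revision_mapping_count
    )
```

For example, with an illustrative approved total of 1000 cents, the token 22041, two ignored documents, and one revision mapping, the result is 1000 + 22041 + 194 + 503 = 23738. Writing `str(proof_code(...))` to `proof.txt` keeps the file to the numeric value only.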
incoming/email_thread.txt
Subject: Harbor Office Supply attachment cleanup

2026-05-06 10:12 Mira:
The first Harbor Office Supply invoice was attached as INV-2204. Please hold it for now. The VAT looks wrong.

2026-05-08 09:07 Mira:
Please ignore the first INV-2204 attachment. Wrong VAT on that attachment.

2026-05-08 09:11 Jon:
Use the revised attachment from May 8. Revised one is INV-2204-R1.

2026-05-08 09:18 Jon:
The bank reference may still show the old invoice number, but the amount should match the revised total.

2026-05-08 09:30 AP intake:
The proforma document is for quote tracking only. It is not a payment invoice.
incoming/payment_export.csv
date,description,amount_cents,reference
2026-05-09,HARBOR OFFICE SUPPLY PAYMENT,-24990,INV-2204
2026-05-09,UNRELATED COFFEE SUPPLIES,-1840,RCPT-778
incoming/purchase_orders.csv
po_id,vendor_id,gross_limit_cents,status
PO-8801,V-410,26000,open
PO-8802,V-410,12000,draft
incoming/vendor_master.csv
vendor_id,vendor_name,tax_id,status
V-410,Harbor Office Supply,TX-410,active
V-411,Harbor Office Supply Old Record,TX-OLD,inactive
model_prompt.md
Audit the synthetic email attachment intake folder in `incoming/`.

Write these files:
- `work/attachment_index.json`
- `work/final_document_set.json`
- `work/normalized/INV-2204-R1.txt`
- `audit_result.json`
- `proof.txt`

Use `incoming/README_TASK.md` as the full task specification. Use the generated image attachments, the email thread, `payment_export.csv`, `vendor_master.csv`, and `purchase_orders.csv`.

Important:
- Identify the final revised invoice, not just the first invoice-like image.
- The payment may use an old reference if the amount matches the revised invoice.
- Do not modify anything under `incoming/`.
- This is benchmark bookkeeping, not financial, tax, or legal advice.
- `proof.txt` must contain only the numeric proof code.