assay-pdf
BetaOpen-source GWG 2022 benchmark kit for PDF preflight engines.
assay-pdf generates a deterministic ~175-file GWG 2022 test corpus, runs it through a uniform harness against callas pdfToolbox, Enfocus PitStop Server, and lintPDF, and scores TP/FP/FN/TN per rule, per variant, per engine. Reproducible markdown / HTML accuracy reports replace the vendor-gated GWG 2015 Compliancy Test Suite.
- 39 rules × 23 GWG 2022 variants → ~175 PDFs (23 positive baselines, 152 negative failure modes)
- Byte-deterministic corpus generation from a seed; SHA-256 verified vendor assets
- Uniform harness runs callas pdfToolbox, Enfocus PitStop Server, and lintPDF
- TP/FP/FN/TN scoring per rule, per variant, per engine
- Every generated PDF verified against PDF/X-4 (ISO 15930-7) with verapdf