Testing and trust

Trust in an evidence tool has to be earned mechanically, not claimed. This page describes how SBOMFlow is tested, what the checks actually verify, how the product behaves when inputs go wrong, and how you can re-run the verification yourself. All figures below were produced by running the listed commands against the current source tree; re-run them to check us.

Current verified figures

| Check | Result |
| --- | --- |
| Offline test suite (make test) | 1,300 tests, passing (15 skips are optional-extra paths, reported as skips, never hidden) |
| Required runtime dependencies | 0 (standard library only; optional extras are exactly that) |
| Offline demo (make demo) | Full evidence pack from a bundled example, no network |
| Warning catalog | 178 stable codes, every one documented; the docs page is generated from the catalog and a test fails on drift |
| Default artifacts per run | 14, each structurally validated |
| Clean-environment install | Wheel builds and installs into a fresh virtual environment with no network and no dependencies, and the installed CLI runs the demo |
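
The warning catalog entry above describes a page that is generated from the catalog, with a test that fails on drift. A minimal sketch of that kind of check follows. `CATALOG`, `render_catalog`, `check_drift`, and the sample codes are hypothetical names chosen for illustration; they are not SBOMFlow's actual catalog or generator.

```python
# Hypothetical warning catalog: stable code -> one-line description.
CATALOG = {
    "W001": "Malformed JSON manifest",
    "W002": "Archive member exceeds size limit",
}


def render_catalog(catalog: dict[str, str]) -> str:
    """Render the catalog as a docs page, sorted by code so output is stable."""
    lines = ["Warning catalog", ""]
    for code in sorted(catalog):
        lines.append(f"{code}: {catalog[code]}")
    return "\n".join(lines) + "\n"


def check_drift(catalog: dict[str, str], committed_page: str) -> list[str]:
    """Return a list of problems; an empty list means docs match the catalog."""
    problems = []
    for code, description in catalog.items():
        if not description.strip():
            problems.append(f"{code} has no documentation")
    if render_catalog(catalog) != committed_page:
        problems.append("docs page differs from generated output")
    return problems
```

In a test suite, `check_drift` would be called with the committed page text and asserted to return an empty list, so a new code without documentation or a hand-edited page fails the build.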

What the test suite actually covers

  • Parser tests — positive fixtures for every supported manifest, lockfile, and embedded build format, plus negative and malformed fixtures for each.
  • Malformed-input behaviour — broken JSON/YAML/CSV/archives must produce a cataloged warning with the exact path, never a crash and never silence.
  • Resource-bound tests — oversized archive members, deep nesting, and decompression limits are enforced and surfaced as warnings.
  • Identity tests — version epochs, vendor suffixes, prereleases, range specifiers, container digests, and same-name/different-origin components must survive unchanged; a structural-garbage guard is tested against both garbage and legitimate unusual values.
  • Determinism tests — repeat runs under a fixed --as-of must be byte-identical, including the evidence-bundle ZIP.
  • Boundary tests — reviewer status defaults, VEX justification enforcement, self-approval rejection, gate/VEX consistency, and draft watermarks are asserted directly.
  • Integrity tests — the audit-log hash chain, artifact hashes, and cross-artifact references are validated, and corrupted or tampered outputs must fail validation.
  • Error-contract tests — every CLI error code renders with a fix and a docs link, and every code has a matching section in the error reference.
  • Docs and site checks — every relative documentation link must resolve, and the public build is checked against a fail-closed allowlist so internal material cannot be published.
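
The determinism requirement above extends to the evidence-bundle ZIP, which is harder than it sounds because ZIP members normally carry wall-clock timestamps and platform-specific metadata. The sketch below shows one common way to get byte-identical archives with the standard library. It is an assumption about technique, not a description of SBOMFlow's bundler; `build_bundle` and `FIXED_DATE_TIME` are hypothetical names.

```python
import io
import zipfile

# Fixed timestamp for every member (1980-01-01 is the earliest a ZIP can store).
# A real tool might derive this from its as-of date instead.
FIXED_DATE_TIME = (1980, 1, 1, 0, 0, 0)


def build_bundle(files: dict[str, bytes]) -> bytes:
    """Build a ZIP whose bytes depend only on member names and contents."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as bundle:
        for name in sorted(files):  # stable member order
            info = zipfile.ZipInfo(name, date_time=FIXED_DATE_TIME)
            info.compress_type = zipfile.ZIP_DEFLATED
            info.create_system = 3            # pin "Unix" regardless of host OS
            info.external_attr = 0o644 << 16  # pin file permissions
            bundle.writestr(info, files[name])
    return buffer.getvalue()
```

A determinism test in this style builds the bundle twice from the same inputs and asserts that the two byte strings are equal.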

Synthetic stress testing (TestBuz)

Beyond unit tests, SBOMFlow is exercised against TestBuz: a fully synthetic, deterministic corpus of ten distinct connected-device manufacturer archetypes — from a 19-person sensor company to multinational network, maritime, metering, EV-charging, medical, robotics, and industrial estates — with realistic build systems, supplier evidence, release rooms, deliberately malformed files, and planted operational inconsistencies.

Runs against this corpus are checked by an executable acceptance contract (complete workflow coverage across all ten businesses) and a per-fact semantic evaluation that fails on any silent omission, false merge, false positive, missing provenance, or overclaim.
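
To make "per-fact" concrete, here is a simplified sketch covering two of the failure classes named above, silent omission and false positive. The `Fact` shape, the `evaluate` and `passed` functions, and the sample values are all hypothetical; the real evaluation lives in testbuz/run_semantic_evaluation.py and may be structured quite differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Fact:
    """A hypothetical planted fact: one component expected in one product."""
    product: str
    component: str
    version: str


def evaluate(expected: set[Fact], reported: set[Fact], warned: set[Fact]) -> dict[str, list[Fact]]:
    """Classify differences between planted facts and tool output.

    A planted fact that is neither reported nor covered by a warning counts
    as a silent omission; a reported fact that was never planted counts as
    a false positive.
    """
    return {
        "silent_omission": sorted(expected - reported - warned),
        "false_positive": sorted(reported - expected),
    }


def passed(result: dict[str, list[Fact]]) -> bool:
    """The evaluation passes only if every failure class is empty."""
    return not any(result.values())
```

The design point is that a warning counts as an honest outcome: a fact the tool could not extract but flagged is not an omission, while one that disappears without a trace fails the run.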

Be clear about what this is and is not:

  • It is synthetic. Every company, product, and file is fictional; sample advisories are labelled non-real.
  • It is not real-customer validation, and it is not proof of conformity or universal format compatibility.
  • Its purpose is reproducible engineering stress testing — large realistic inputs, hostile file shapes, and honest scoring of what was detected, warned, or left to human review.

How SBOMFlow behaves when things go wrong

| Situation | Behaviour |
| --- | --- |
| Malformed input file | Cataloged warning with the exact path; the scan continues |
| Recognized but unsupported evidence | Hashed into the artifact manifest and surfaced with a stable warning: visible, never silent |
| Warnings you cannot accept in CI | --strict[=codes] exits 5 after writing all artifacts, so evidence is never lost to a policy |
| Enforced gate blocks a release | Exit 1 with the exact blocking reason recorded; every artifact is still produced |
| Output directory corrupted or edited | sbomflow validate fails with exit 4 and names the mismatch |
| Audit log tampered | The hash chain breaks; validation fails and new decisions are refused |
| Reviewer tries to approve their own submission | Rejected; separation of duties is enforced |
| Human VEX decision | Applies only with a valid justification and only to its scope; invalid not_affected is downgraded and stays visible |
| Symlink escaping the scan root | Refused with a warning; outside content is never read into evidence |
| Optional tool or extra missing | Explicit "skipped" note; baseline operation continues |
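
The audit-log behaviour above relies on a hash chain: each entry commits to the hash of the one before it, so editing any earlier entry invalidates everything after it. The sketch below illustrates the general technique only. The entry layout, the `GENESIS` value, and the function names are assumptions, not SBOMFlow's on-disk format.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with a canonical payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def verify(log: list[dict]) -> bool:
    """Recompute every link; any edit to an earlier entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


def append(log: list[dict], payload: dict) -> None:
    """Append an entry, refusing to extend a log that fails verification."""
    if not verify(log):
        raise ValueError("audit log failed verification; refusing new entry")
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"prev": prev_hash, "payload": payload,
                "hash": entry_hash(prev_hash, payload)})
```

Verifying before every append is what turns tamper detection into the "new decisions are refused" behaviour: a broken chain cannot be silently extended.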

Verify it yourself

All of these run offline from a source checkout:

```sh
make test                    # full offline suite
make demo                    # end-to-end offline evidence pack
make wave4-check             # importer/integration gauntlet
make validate-artifacts      # structural validation of generated outputs
make check-docs-links        # every relative docs link resolves
make check-docs-publication  # fail-closed public-content leak check
make install-check           # clean-venv wheel install + installed CLI run
```

The TestBuz acceptance and semantic evaluation run from the repository with python3 testbuz/run_acceptance.py --output <fresh-dir> and python3 testbuz/run_semantic_evaluation.py --output <fresh-dir>.

What we do not claim

No test suite proves the absence of defects, and a synthetic corpus cannot guarantee correct handling of every future customer input. SBOMFlow does not claim perfect accuracy, universal format support, or legal conformity, and any claim on this site that cannot be traced to running code and tests is a bug. Real, sanitised customer artifacts are the standard we validate against next.
