Roadmap¶
A running inventory of what the workspace covers today, gaps surfaced during use, and the near-term plan. Pre-1.0, no deployed users; every item is a judgment call about priority, not a commitment.
Current state¶
Workspace surface¶
| Package | Role |
|---|---|
dhis2w-client |
Async HTTP client, pluggable auth, typed responses via generated models. Retry policy, task awaiter, connection-pool tuning all first-class. |
dhis2w-codegen |
/api/schemas → pydantic emitter + OAS spec-patches framework (synthesises Jackson discriminators upstream DHIS2 omits) |
dhis2w-core |
Plugin runtime + shared services (profiles, CLI errors, task watch, client context) |
dhis2w-cli |
Typer root that discovers every plugin (first-party + entry-point) |
dhis2w-mcp |
FastMCP server that mounts the same plugins |
dhis2w-browser |
Playwright session helpers (auth through the DHIS2 login form) |
CLI surface¶
Sixteen top-level domains: analytics, apps, browser, data, dev, doctor, files, maintenance, messaging, metadata, profile, route, system, user, user-group, user-role. Each plugin shares a service.py between the CLI and MCP sides; the same typed call from both surfaces.
dhis2 metadata has the full workflow surface:
- Core CRUD:
list/get/patch(RFC 6902) /rename(bulk name/shortName/description add + strip prefix/suffix,--dry-run) /retag(bulk ref-field + enum rewrites: categoryCombo, optionSet, legendSets, aggregationType, domainType) /share(bulk apply one sharing block to many UIDs, with--public-access/--user-access UID:access/--user-group-access UID:access, stdin UID input via-,--dry-run) /merge-bundle(import a saved JSON bundle file into a target profile — sibling to the source-profilemergeverb). - Cross-resource search:
dhis2 metadata search <query>— fans out three concurrent/api/metadata?filter=<field>:ilike:<q>calls (id,code,name) and merges by UID. Full UID, partial UID, business code, or name fragment all flow through one verb. - Bundle operations:
export/import/diff(file-vs-file and file-vs-live) with per-resource filters + dangling-reference warning on export;diff-profilesfor staging-vs-prod drift. - Authoring sub-apps:
options get / find / syncfor OptionSet sync;attribute get / set / delete / findfor cross-resource AttributeValue workflows;program-rule get / vars-for / validate-expression / where-de-is-used;sql-view list / get / execute / refresh / adhoc;viz list / get / create / clone / delete;dashboard list / get / add-item / remove-item;map list / get / create / clone / delete;legend-sets list / get / create / clone / delete; four fullX / XGroup / XGroupSetauthoring triples with canonical DHIS2 naming —organisation-units/organisation-unit-groups/organisation-unit-group-sets(plusorganisation-unit-levelsfor per-depth rename),data-elements/data-element-groups/data-element-group-sets,indicators/indicator-groups/indicator-group-sets, andcategory-options/category-option-groups/category-option-group-sets; plus theprogram-indicators+program-indicator-groupspair (DHIS2 has noprogramIndicatorGroupSet). Aggregate data-set surface:data-sets list / get / create / add-element / remove-element / delete+sections list / get / create / add-element / remove-element / reorder / delete. Authoring flip side of maintenance runs:validation-rules {list,show,create,delete}+validation-rule-groups+predictors {list,show,create,delete}+predictor-groups. Tracker-schema authoring complete end-to-end:tracked-entity-attributes+tracked-entity-types(with TETA linkage) +programs {list,show,create,rename,add-attribute,remove-attribute,add-to-ou,remove-from-ou,delete}+program-stages {list,show,create,rename,add-element,remove-element,reorder,delete}. Category-dimension authoring complete end-to-end:categories {list,show,create,rename,add-option,remove-option,delete}+category-combos {list,show,create,rename,add-category,remove-category,wait-for-cocs,delete,build}(thebuildverb is the one-pass create-or-reuse helper for the full stack, fed a JSONCategoryComboBuildSpec) + read-onlycategory-option-combos {list,show,list-for-combo}.
dhis2 doctor runs ~100 checks on a live instance (20 metadata-health probes + 81 DHIS2 integrity checks + BUGS tripwires).
MCP surface¶
334 tools across 13 plugin groups (analytics_*, apps_*, customize_*, data_*, doctor_*, files_*, maintenance_*, messaging_*, metadata_* (230), profile_*, route_*, system_*, user_*). Every CLI command has an MCP tool equivalent and vice versa; both share one typed service call.
Typed models shipped¶
Via /api/schemas codegen (generated/v{40,41,42,43,44}/schemas/):
- 100+ metadata resources (DataElement, DataSet, OrganisationUnit, Indicator, Program, …) with full CRUD accessors including the RFC 6902
patch(uid, ops)method - 77+
StrEnums for CONSTANT properties (ValueType, AggregationType, DataElementDomain, …) - A shared
Referencewith bothidandcodefields
Via /api/openapi.json codegen (generated/v{N}/oas/, currently populated on v42 + v43):
- Every
components/schemasentry — 562 classes + 260 StrEnums + 104 aliases on v42; 984 classes on v43. - Consumers in
dhis2w-client:envelopes.py,auth_schemes.py,aggregate.py,system.py,maintenance.py, andgenerated/v42/tracker.pyare all thin shims over the OAS output. - Emitter is deterministic + version-scoped;
dhis2 dev codegen oas-rebuild --version v{N}regenerates from the committedopenapi.jsonwithout network. - Spec-patches framework for known-upstream OAS gaps (
dhis2w_codegen.spec_patches). Each patch is idempotent + carries abugs_refpointer; the rebuild log names which gap was worked around. Current patches:*AuthSchemediscriminators (BUGS.md #14 — still unfixed in v43).
Remaining hand-written in dhis2w-client (by design):
WebMessageResponsesubclass +DataIntegrityReport/DataIntegrityResult/Me/Notification— helper methods and client-side convenience shapes that aren't in OpenAPI.AnalyticsMetaData— typed parser helper overGrid.metaData(a baredict[str, Any]on the wire).Grid/GridHeadercome straight from the OAS codegen.TrackerBundle— thePOST /api/trackerenvelope isn't in OpenAPI under that name. Thin wrapper on OAS tracker models.PeriodType+RelativePeriodStrEnums (24 period frequencies + 45 rolling windows; upstream Java enums the OpenAPI schema doesn't expose — see BUGS.md #28).
Typing posture¶
The four-PR typing sweep (#71-#74) plus the codegen discriminator synthesis (#76) eliminated every dict[str, Any] signature that crosses module boundaries outside the explicit HTTP-boundary carveouts. Every service-layer function returns a typed pydantic model; MCP tools dump at the edge via _dump_model; CLI handlers dump for JSON output or Rich tables. The CLAUDE.md "no dict[str, Any] across module boundaries" rule is enforced workspace-wide.
Runtime features¶
--profile/-pglobal override +~/.config/dhis2/profiles.tomlor./.dhis2/profiles.tomlauto-discovery--debug/-dglobal flag → stderr HTTP trace lines viadhis2w_client.httplogger--watch/-won job-kicking commands (analytics refresh,maintenance dataintegrity run) + standalonemaintenance task watchwith Rich progress UI--jsonopt-in on every write command; concise one-line summary by default- Typed
Dhis2ApiError.web_messageparses the envelope on 4xx so the CLI surfacesconflicts[]/importCount/rejectedIndexes[]detail - Client-side UID generation (
generate_uid,generate_uids); no/api/system/idround-trip - External plugin loading via
importlib.metadata.entry_points(group="dhis2.plugins")— seeexamples/plugin-external/for a minimal runnable reference - Retry policy with exponential backoff + jitter +
Retry-Afterheader honouring. Idempotent-only by default; opt in for POST/PATCH per policy. Threads throughDhis2Client(retry_policy=...)andopen_client(profile, retry_policy=...). - Library-level task awaiter —
client.tasks.await_completion(task_ref)blocks until DHIS2 reportscompleted=True;iter_notificationsfor streaming renderers. - Connection-pool tuning —
Dhis2Client(http_limits=httpx.Limits(...))/open_client(profile, http_limits=...)for sizing against the real DHIS2 capacity. - Data-integrity streaming iterator —
client.maintenance.iter_integrity_issues(...)yieldsIntegrityIssueRows (issue + owning check's name / displayName / severity) as a flat stream. - Files plugin —
dhis2 files documents {list,get,upload,upload-url,download,delete}+dhis2 files resources {upload,get,download}. - System metadata cache — TTL-bounded per-client in-memory cache on
client.systemforinfo()/default_category_combo_uid()/setting(key). 300 s default TTL. - Bulk delete on
client.metadata—delete_bulk(resource_type, [uids])+delete_bulk_multi({...})wrapPOST /api/metadata?importStrategy=DELETE. client.metadata.search— cross-resource UID / code / name search; three concurrent/api/metadata?filter=<field>:ilike:<q>calls merged client-side with UID dedup. TypedSearchResults(query, hits: {resource: [SearchHit, ...]}, total).client.visualizations—VisualizationSpectyped builder (chart type, data elements, indicators, periods, relative periods, legend set, placement overrides) +create_from_spec / clone / list / deleteaccessor.RelativePeriodStrEnum covers the 45 rolling windows upstream OpenAPI exposes as boolean flags.client.maps—MapSpec+MapLayerSpectyped builder withindicators,legend_set, thematic / boundary / facility layer kinds; parallels the viz accessor.client.dashboards—DashboardSlot+add_item/remove_itemon the dashboards accessor, no round-trip of the whole dashboard.- Streaming data-value-set import —
client.data_values.stream(source, content_type=...)feeds httpx's chunked transfer directly from aPath,bytes, sync / async iterable, or async generator. JSON / XML / CSV / ADX. - Streaming analytics export —
client.analytics.stream_to(destination, *, params, endpoint="/api/analytics.json")pipes httpx's chunked response straight to disk viaaiter_bytes. - Multi-instance metadata diff —
dhis2 metadata diff-profiles <a> <b> -r <resource>exports two registered profiles concurrently and diffs them structurally.
Seed fixture¶
The committed e2e dump (infra/v42/dump.sql.gz) mirrors DHIS2 Play's Sierra Leone immunization demo with workspace-local additions: 1332 org units with GeoJSON geometries, 67 data elements, 3 indicators, 3 programs (Child Programme + Antenatal = tracker; Supervision visit = event), 2 datasets, 3 dashboards, 23 visualizations built programmatically via VisualizationSpec + 1 EventVisualization for the supervision program attached to the Immunization data dashboard, 8 maps built via MapSpec, 188k aggregate data values, 500 tracker entities, 12 sample supervision events covering 2024 monthly, 6 program rules + 10 program indicators. Workspace fixtures layered on top (infra/scripts/seed/workspace_fixtures.py): SNOMED_CODE attribute, VACCINE_TYPE option set with 5 fixed-UID options, 3 SqlViews (VIEW / QUERY / MATERIALIZED_VIEW), 2 BCG predictors + PredictorGroup + 2 output DEs, 2 BCG validation rules + ValidationRuleGroup, 4 named OrganisationUnitLevel records (Country / Province / District / Facility), 1 LegendSet (LsDoseBand1) attached to the Measles + Penta-1 monthly column charts. make refresh-and-verify wipes the stack, rebuilds the dump, runs every non-interactive example, and reports a pass/fail summary — 125 pass, 0 fail, 11 skipped (OIDC-login flows including the Playwright-driven variant, browser screenshot captures, long-running analytics jobs, outlier detection, external httpbin dep) on the current main.
CI¶
.github/workflows/ci.ymlrunsmake lint && make test && make docs-buildon every PR.github/workflows/e2e.ymlnightly — full DHIS2 stack + seeded fixtures + slow integration tests
Public distribution (PyPI, tagged releases, CHANGELOG.md) is explicitly out of scope — this workspace is internal-only.
Docs¶
- Auto-generated CLI reference (
docs/cli-reference.md, ~3700 lines from the Typer app) + MCP reference (docs/mcp-reference.md, 313 tools across 13 groups from the FastMCP server). Both regenerated on everymake docs-build. - Narrative tutorials:
docs/guides/cli-tutorial.md,docs/guides/client-tutorial.md,docs/guides/visualizations.md(step-by-step viz + dashboard composition). - Examples index (
docs/examples.md) catalogues 167 runnable examples (74 client, 54 CLI, 39 MCP) with descriptions + cross-links to concept docs. Tracker-schema authoring examples (steps 1 / 2 / 3 underexamples/cli/tracker_*.sh) round-trip the full chain end-to-end. - Architecture docs cover every plugin, the client, auth, profiles, codegen, typed schemas, plugins runtime, external plugins, MCP, versioning, browser automation.
BUGS.md— 29 upstream DHIS2 quirks with livecurlrepros + v43 re-audit status.
Test coverage¶
872 tests run via make test (903 collected including 31 slow-marked for the nightly integration stack). Unit + CliRunner + respx-mocked HTTP. Slow tests exercise live-stack workflows (--watch job polling, Playwright PAT creation, dashboard screenshot capture, Playwright-driven OIDC login). make coverage runs branch-coverage locally + on every CI run (produces coverage.xml as an artifact), fails CI if the run drops under 70% (current baseline 73%).
Detailed test gaps + the planned next moves are in Testing roadmap below.
Upstream quirks tracked¶
38 entries in the repo-root BUGS.md. Recent additions cover the seed / workflow cycle: DataSet Hibernate flush ordering (#23), Person-TET built-in name collisions (#24), /api/.../metadata leaking computed fields (#25), admin OU scope cached per session (#26), fresh-install flakiness on first metadata import (#27), RelativePeriods OAS schema shape (#28), /api/metadata ignoring rootJunction (#29 — the reason metadata search has to fan out N requests instead of one), App Hub versions[*].created returning epoch-millis ints instead of ISO-8601 strings (#30), and the predictor-expression parser rejecting uppercase aggregators (#31 — forces avg() / sum() lowercase even though DHIS2 docs use uppercase).
Gaps surfaced during use¶
Authoring surfaces¶
The organisation-unit PR (#174) set a template — canonical DHIS2 resource names, hand-written accessors, per-item membership shortcuts, no *Spec. The triples sweep (#174 / #175 / #176 / #180 / #181), aggregate data-set surface (#185), validation-rule + predictor CRUD (#186), the full tracker-schema stretch (TET + TEA #188, Program + PTEA + OU #189, ProgramStage + PSDE #194), and the category-dimension stack (Category #205, CategoryCombo + read-only CategoryOptionCombo #208, the one-pass CategoryComboBuilder helper #209) have all landed on top of it. No metadata-authoring gaps remain on the main workflow paths.
Optional ProgramStageSection grouping (rarely used in practice) is still unauthored; reach for metadata patch for it. That's the only known absence and it stays parked unless a concrete caller surfaces.
OIDC / OAuth2 polish¶
- Token refresh is tested in code but undocumented for end users.
Local OIDClogin-page button is non-functional for browser clicks (CLI-onlyredirect_url); no per-provider "hide from login UI" flag in DHIS2 v42 — documented indocs/architecture/auth.md.- Bearer-to-JSESSIONID path for browser workflows on OIDC profiles is unverified (flagged in
authenticated_sessiondocstring).
Near-term plan (next 3–5 PRs)¶
Latest cycle closed the category-dimension strategic option (Category #205, CategoryCombo + read-only CategoryOptionCombo #208, the one-pass CategoryComboBuilder create-or-reuse helper #209) plus the smaller metadata merge-bundle verb (#206). With every authoring path on the main workflow now covered, the codegen emitters fully regen-stable, and bulk verbs (rename / retag / share) shipped on top of patch_bulk / apply_sharing_bulk, the obvious tactical sweep is complete.
The near-term slate is once again open. One item has been carried over without action across the last few cycles: a multi-version CI integration matrix (pure YAML / docker-compose, no Python). Deferred until it becomes the highest-leverage thing to do. The *Spec-class audit also previously sat here; it is now resolved (keep VisualizationSpec / MapSpec + MapLayerSpec / LegendSetSpec + LegendSpec; the rule for when a spec is justified is documented on api/legend-sets.md).
The natural next direction is one of:
- Pick one of the two remaining strategic options below and commit to a multi-PR body of work (data approval workflow, or audit log reader).
- Promote a medium-term tactical item (CLI startup latency, property-based DSL tests) for a focused 1-PR cycle.
Carried over:
- Multi-version CI integration matrix — slow-marked tests run against v42 only; committed codegen for v40 / v41 / v43 / v44 is never exercised against a live stack. Stand up compose-managed stacks in the nightly workflow (
e2e.yml) and run the slow suite against each. Catches codegen drift before it reaches users. Pure YAML + docker-compose work; no Python changes.
Demoted / parked:
apps snapshotexample + CI hook — the feature works, just therestore --dry-rundemo still isn't inexamples/cli/apps.sh. Low value without an active need.ProgramStageSectiongrouping — rarely used in practice;metadata patchcovers the occasional need. Promote if a concrete caller surfaces.
BUGS.md #15 (undiscriminated JobConfiguration.jobParameters + WebMessage.response unions) stays off the near-term list: the sibling-field discriminator pattern doesn't fit the AuthScheme-style spec-patches approach, and the scheduler plugin isn't an active workflow. Revisit when someone hits a real-world need.
Strategic options (pick one before the next cycle)¶
Two independent directions — the right order depends on where the pain is. Each would be a multi-PR body of work.
1. Data approval workflow plugin¶
/api/dataApprovals + /api/dataApprovalLevels + /api/dataApprovalWorkflows cover multi-level aggregate approval (district → zone → ministry sign-off). Common in humanitarian + government reporting pipelines. Surface:
dhis2 dataapproval status <ds> <pe> <ou>— which level is this cell at?APPROVED_HERE / APPROVED_ABOVE / UNAPPROVED_READY / UNAPPROVED_WAITING.dhis2 dataapproval approve / unapprove / accept / unaccept— the four write verbs.dhis2 dataapproval bulk-status <ds> <pe>— every org unit for one dataset-period, exit-on-incomplete mode for CI.- Typed
DataApprovalStatusenum + level-aware state machine.
2. Audit log reader¶
DHIS2's /api/audits/* endpoints track every write by user / timestamp / entity-uid (for DE values, tracker payloads, metadata changes). No wrapper today; integrations that need a "who changed X and when" history have to hand-build URLs.
dhis2 audit data-values --de <uid> [--ou <uid>] [--pe <pe>]— stream every change for a cell.dhis2 audit metadata --klass DataElement --uid <uid>— metadata edit history for one resource.dhis2 audit tracker-entity <uid>— tracker write audit.client.audit.iter_data_values(...)/iter_metadata(...)async iterators for library callers.
Niche but valuable for compliance + forensics use cases.
Medium-term¶
- Multi-version CI matrix — integration tests run against v42 only. Stand up v40 / v41 / v43 / v44 nightly jobs against compose-managed stacks so codegen drift gets caught before release.
- CLI startup latency —
dhis2 --helptakes ~2s to render, which is noticeable on every shell invocation. An explicit attempt in aperf/lazy-oas-initbranch proved where the cost sits + why a clean fix is harder than it looked:python -X importtimeshows ~900 ms goes into pydanticmodel_rebuildacross the 562 OAS-emitted classes, triggered by 16 callers indhis2w-client/*.pydoingfrom dhis2w_client.generated.v42.oas import Xat module top. A lazy PEP 562__getattr__onoas/__init__.pyis correct in isolation but every caller that imports a class by thefrom oas import Xform still triggers the full load. A templates-only change saves ~80 ms; converting every caller tofrom oas.<submodule> import Xsaves ~1 s but breaks FastMCP'slist[Model]tool-return serialisation (MockValSerinstead ofSchemaSerializer— pydantic's forward-ref resolver can't find siblings when only one submodule has been loaded, and skipping the eagermodel_rebuild()loop leaves the schemas deferred). Branch was closed rather than merged. A proper fix needs: (a) per-class on-demandmodel_rebuild()when the class is first touched for validate/serialize, (b) a namespace provider pydantic can consult lazily (maybe via_types_namespace+ a dict proxy that imports submodules on-demand), or © accept that OAS load is unavoidable on any real command and focus on lazy plugin discovery + the Typer app-construction cost instead. Target:dhis2 --helpunder 400 ms. - Property-based testing on filter / order DSL parsing.
Long-term / exploratory¶
- Further
dhis2w-browserworkflows, layered onauthenticated_session: Maintenance app driving (actions that don't have REST), Org-unit-tree drag-drop edits. Dashboard creation is covered by the RESTDashboardsAccessor.add_item; layout drag-drop is UI-only but deferred until a concrete need appears. - Scheduled jobs plugin (
/api/jobConfigurations) — blocked on BUGS.md #15 (undiscriminatedjobParameters+WebMessage.responseunions). Revisit when the OAS discriminator is fixed upstream, or when a concrete scheduling workflow forces us to hand-roll typed payloads for the common job types. - Interactive aggregate-data-entry TUI —
dhis2 data entry <ds> <pe> <ou>launches a terminal spreadsheet bound to one data set × period × org unit. Questionary or textual for the UI; posts viaclient.data_values.streamon save. Powerful offline-capable data-entry fallback when the UI is down.
Testing roadmap¶
The unique shape of this project — we generate code from a moving REST API, then hand-write CLI / MCP / auth layers on top — dictates the testing surface. Bugs slip in at five layers, each best caught with a different tool.
Layered overview¶
| Layer | What can break | Today | Strongest tool |
|---|---|---|---|
| Static | Type errors, unused imports, dead code | ruff + mypy + pyright (good) | + add deptry for unused / missing deps |
| Unit | Pure logic, parsers, builders | 872 tests, respx-mocked HTTP (good) | + property-based + mutation |
| Codegen | Generator emits wrong code | None (relies on humans reviewing the diff) | Golden snapshots of the generated tree |
| Schema contract | Generated code stops matching live API | None | Wire-vs-model + manifest drift |
| Live integration | End-to-end against real DHIS2 | v42 only, slow tests | Multi-version matrix + per-PR read-only |
| Examples | Documented usage drifts from reality | verify_examples exists but v42-only |
Snapshot outputs + parallel both-version |
| Upstream bugs | Workaround breaks; fix lands and we don't notice | Manual BUGS.md retest |
@pytest.mark.upstream_bug(<id>) lifecycle |
Tier A — high leverage, ~1 PR each¶
A1. Schema contract tests against the live play instances (per-PR, read-only).
Pick ~20 representative resources (DataElement, OrganisationUnit, DataSet, Program, ProgramStage, Indicator, User, …). For each: fetch one real instance from play.im.dhis2.org/dev-2-{42,43}, run it through the generated pydantic model, assert it validates without extra (or with only known-safe extras). The single highest-value addition — catches DHIS2 ship-day API changes before users do, and adds ~5 s to CI.
A2. BUGS.md regression-suite scaffolding.
A custom pytest marker @pytest.mark.upstream_bug("BUGS.md#7", state="present"). Each entry gets two paired tests:
- Workaround test — passes when our shim does its job. Breaks if we accidentally remove the workaround.
- Bug-still-present test — currently passes (asserts the bug is observable). When DHIS2 fixes it, this fails — the signal to delete the workaround.
Couple this with a small dhis2-utils bug-status reporter so the next BUGS retest is just pytest -m upstream_bug --tb=line. Replaces the manual curl spreadsheets.
A3. Multi-version CI matrix (carried-over from the near-term list).
.github/workflows/e2e.yml matrix on dhis2_version: [42, 43]. Blocked today only by the missing v43 e2e dump. Build that once (make refresh-and-verify DHIS2_VERSION=43, ~45 min one-time) and the matrix unlocks itself.
A4. Property-based tests for the parser-shaped code paths. Hypothesis is overkill for happy-path business logic but devastatingly effective for parsers. Targets:
generate_uid— distribution properties (no character bias, all 11 chars, 62-symbol alphabet).- Period parsing (
LAST_3_MONTHS,202403,2024Q1,2024S2,2024W12, …). - Filter DSL (
name:ilike:foo,code:in:[a,b], nestedattributeValues.attribute.id:eq:UID). - JSON Patch RFC 6902 round-trip — apply then invert; the composition should be a no-op.
- URL construction — no double-slashes, correct encoding,
.jsonsuffix on/api/analytics/*(BUGS.md #1).
One PR per parser, ~50 lines of hypothesis strategies + 5 properties each.
A5. Generated-code golden snapshots.
Today, codegen drift produces a giant diff that humans review by eye. Add tests/codegen/test_snapshots.py:
- Load committed
schemas_manifest.json. - Run
emit()into a tmp dir. - Diff against committed
generated/v{N}/. - Fail on any diff.
Combined with the existing dhis2 dev codegen diff CLI: codegen changes that drift the output must be deliberate (and update the snapshot). Catches the "innocent emit.py refactor that silently changes 200 files" class of bug.
Tier B — medium leverage, ~2-3 PRs¶
B1. Mutation testing nightly.
mutmut or cosmic-ray against packages/dhis2w-client/src/ and packages/dhis2w-core/src/plugins/*/service.py. Surface mutations that survive — each survivor is either a missing test or dead code. Run weekly (it's slow); fail when survivor count goes up vs baseline.
B2. Per-package coverage gates.
make coverage is workspace-wide at 70 %. That hides the case where dhis2w-client is at 95 % and a peripheral plugin is at 30 %. Split into per-package thresholds; show a coverage diff in PR comments via codecov / coveralls / a simple gh-action. Pin dhis2w-client higher than the rest since it's the public-API surface.
B3. Tracker write end-to-end test suite.
Tracker is the most error-prone area (envelope shapes, atomic / non-atomic modes, importStrategy semantics, soft-delete behaviour). An integration suite that creates a tracked entity with enrollment + events, updates each via PATCH, deletes them, verifies cleanup. Run nightly against both v42 and v43 to catch tracker-specific drift between versions.
B4. MCP tool catalogue contract test. Walk every tool registered by FastMCP, assert:
- Tool input schema is valid JSON Schema.
- Docstring is non-empty.
- Tool name follows
<plugin>_<resource>_<verb>convention. - Return-type annotation is a
BaseModel.
Stops the MCP surface from quietly degrading (missing docstrings, untyped returns).
B5. Live-instance smoke tests against play, parallel matrix.
Beyond contract tests (A1) — actual dhis2 system whoami, dhis2 metadata list dataElements --limit 5, etc., run against play.im.dhis2.org/dev-2-{42,43} in parallel. Catches "we shipped a release that actually works against real DHIS2."
Tier C — exotic / specialty¶
C1. Snapshot example stdout.
make verify-examples reports PASS / FAIL but doesn't pin output. Add --snapshot mode that records stdout into examples/.snapshots/. CI fails when output drifts unexpectedly. Catches "still passes" examples that produce subtly different / wrong output.
C2. Schema drift watcher (weekly cron).
Cron job that runs dhis2 dev codegen diff against the live play instances. If the committed manifest no longer matches what live reports, post an issue. The "DHIS2 just shipped 2.43.2" early-warning system.
C3. Performance benchmarks + regression detection.
pytest-benchmark for:
- CLI startup time (already on roadmap, ~2 s today, target < 400 ms).
- MCP
list_toolslatency. - Generated-code import time (the 562 OAS classes pydantic-rebuilds).
- Bulk fetch (1 k metadata items).
Store baselines in CI; fail PRs that regress > 20 %.
C4. Hypothesis-driven fuzzer for the OAS generator.
Generate adversarial OpenAPI specs (deeply nested oneOf, missing discriminators, recursive refs). Run oas_emit against them; assert it doesn't crash; collect cases where it does. One-time investment that finds latent oas_emit.py bugs.
C5. Browser / UI tests. Playwright is a runtime dep (for screenshot capture, OIDC login automation), not a test surface. The screenshot output IS the test today — compare PNG to a golden — and that's enough.
What we're explicitly skipping¶
- Load testing. Not a server; the bottleneck is always the upstream DHIS2 instance, not our client. Premature.
- Contract testing via Pact / Schemathesis. The OpenAPI spec is too unreliable (BUGS.md #14, #15, #28 are spec-quality issues). Our own contract tests against live instances pay better.
- Hypothesis-jsonschema for the OAS models. Tempting, but the
extra="allow"shapes spin Hypothesis on impossible negative cases. - Mutation testing on generated code. Mechanically derived; mutations there don't tell us anything we can fix.
Recommended order¶
The first three unblock everything else; A4 / A5 chase quickly behind:
- A3 — multi-version CI matrix (after a one-time v43 e2e dump build). Unblocks v43 in CI, makes B3 / B5 actually possible.
- A1 — live-schema contract tests against play, per-PR. Cheapest highest-leverage thing in this list.
- A2 —
BUGS.mdregression suite scaffolding. Stops the manual BUGS retest cycles. - A4 + A5 — property-based + codegen snapshots. Independent; either order.
Tier B and C defer until A1–A5 are paying off.
Reference: dhis2-java-client¶
Apache-2.0 Java client maintained by the DHIS2 org (dhis2/dhis2-java-client). Targeted comparison against this workspace as of this writing:
Already covered here¶
- Typed
/api/sharing—Sharing,SharingBuilder,ACCESS_*constants,apply_sharing/get_sharinghelpers. Full parity with the Java client's sharing builder. - User administration —
dhis2 user list / get / me / invite / reinvite / reset-password. User-group + user-role plugins covering membership + authority-bundle flows. - Branding / theming —
dhis2 dev customize logo-front/banner/style/set/apply/show+Dhis2Client.customizeaccessor. No equivalent in the Java client. - Auth providers (Basic, PAT, OAuth2); ours is async-first with a typed
AuthProviderProtocol. - Generated resource CRUD across v40/v41/v42/v43/v44 (Java is hand-maintained).
- WebMessageResponse envelope parsing;
.import_count(),.conflicts(),.rejected_indexes(),.task_ref(),.created_uid(). - Full metadata query surface; repeatable
--filter,--order,rootJunction=AND|OR,--page/--page-size,--all,--translate/--locale, everyfieldsselector form. - Metadata bundle export / import / diff + RFC 6902 patch with per-resource filters + dangling-reference warning on export.
- Paging;
list_raw(..., paging=True)returns the pager;list(..., paging=False)walks the full catalog. - Typed filter values on enum fields;
ValueType.NUMBERis aStrEnum, substitutable into filter strings directly. - Client-side UID generation; matches the Java
CodeGeneratoralgorithm exactly. - Typed tracker writes;
TrackerBundle+TrackerTrackedEntity/TrackerEnrollment/TrackerEventmodels forPOST /api/tracker. - Event + enrollment analytics; outlier detection + tracked-entity analytics.
Considered, not adopted¶
- Fluent query builder (
.addFilter(Filter.eq("name", "Penta"))): the Java client wraps DHIS2'sproperty:operator:valuestring syntax in a chainable builder. Deliberately skipped — Python f-strings makef"name:like:{name}"already readable; the builder doesn't buy type safety on the stringly-typed value side; DHIS2's own docs teach the string form.
Worth evaluating later (Java parity)¶
- Domain-specific response types beyond
WebMessageResponse: Java has distinctPagedResponse,Stats,Responsefor different endpoint shapes. We collapse intoWebMessageResponse+ helpers. The OAS codegen already emits the specific shapes (TrackerImportReport,ImportReport, etc.) — swap on-demand when a specific call site hits friction.
Beyond Java parity (already shipped)¶
Items that don't exist in the Java client and now exist here:
- Retry / backoff —
RetryPolicyonDhis2Client+open_clientwith exponential backoff, jitter,Retry-Afterhonoured, idempotent-only by default. - Library-level task awaiter —
client.tasks.await_completion(task_ref, ...)+client.tasks.iter_notifications(...). - Connection-pool tuning —
http_limitskwarg onDhis2Clientandopen_client. - Typed codegen across five DHIS2 versions via schema-driven emission; Java is hand-maintained.
- OAS spec-patches framework — synthesises the Jackson discriminators DHIS2's OpenAPI generator omits (
Route.authet al.). - Data-integrity streaming iterator —
client.maintenance.iter_integrity_issues()yields a flat stream ofIntegrityIssueRows tagged with owning-check metadata. - System metadata cache — TTL-bounded in-memory cache on
client.systemforinfo()/default_category_combo_uid()/setting(key). - Bulk metadata delete —
client.metadata.delete_bulk(resource_type, uids)+delete_bulk_multi({...}). - Cross-resource metadata search —
client.metadata.search(query)returns typedSearchResultsgrouped by resource; handles UID / partial UID / code / name in one verb. - Typed Visualization + Map + Dashboard builders —
VisualizationSpec,MapSpec+MapLayerSpec,DashboardSlot. Chart-type-aware dimension placement, typed data dimensions (DEs + indicators),RelativePeriodenum for rolling windows, legend-set support. CLI + MCP surfaces on top. - Per-viz + per-dashboard PNG capture —
dhis2 browser viz screenshot+dhis2 browser dashboard screenshot+dhis2 browser map screenshot, Chromium-driven via Playwright session helpers. - Typed bulk-save on every generated resource —
client.resources.<resource>.save_bulk(items). Supportsimport_strategy+atomic_mode+dry_run. client.metadata.dry_run(by_resource)— cross-resourceimportMode=VALIDATEentry point.- Streaming analytics export —
client.analytics.stream_to(destination, *, params, endpoint="/api/analytics.json"). - Messaging plugin —
dhis2 messaging {list,get,send,reply,mark-read,mark-unread,delete}+messaging_*MCP tools +client.messagingaccessor. - Validation + predictors workflow —
dhis2 maintenance validation {run,result,validate-expression,send-notifications}+dhis2 maintenance predictors run. - Streaming dataValueSets import —
client.data_values.stream(source, content_type=...). - Multi-instance metadata diff —
dhis2 metadata diff-profilesexports two profiles concurrently + diffs them. - Files plugin — CLI + MCP +
client.filesaccessor over/api/documents+/api/fileResources. - SQL views runner —
client.sql_views+dhis2 metadata sql-view {list, get, execute, refresh, adhoc}. - Tracker authoring workflows —
dhis2 tracker {register, enroll, add-event, outstanding}verbs + the matchingclient.trackerhelpers for operator flows beyond generic CRUD. - Rich conflict renderer —
dhis2 metadata import/dhis2 data aggregate importrender/api/metadataand/api/dataValueSetserror envelopes as a normalisedConflictRowtable (object UID → offending property → server message). - Apps plugin —
dhis2 apps {list, add, remove, update, update --all, reload, snapshot, restore, hub-list, hub-url}+apps_*MCP tools +client.appsaccessor over/api/appsand/api/appHub.update --all --dry-runpreviews available hub updates before installing; bundled core apps update in place.hub-list --searchfilters the catalog client-side.hub-urlread/writes thekeyAppHubUrlsystem setting so self-hosted hubs can be wired via CLI.snapshot --outputpins an instance's app inventory to a portable JSON manifest;restore <manifest>reinstalls every hub-backed entry viainstall_from_hub, with a--dry-runpreview that mirrorsupdate --all --dry-run. - Metadata cross-instance merge —
dhis2 metadata merge <source-profile> <target-profile> --resource ... [--dry-run]orchestrates export+import in one pass, returning typed per-resource export counts plus the target import'sWebMessageResponse. Pairs withdiff-profiles(same resource+filter shape): diff to preview, merge to apply. Sharing blocks are stripped by default to avoid false-positive conflicts from per-instance user/group UIDs. Conflicts on the dry-run and applied paths render through the sharedConflictRowRich table used bymetadata import(#177), so preview output is immediately actionable without reaching for--json | jq. - Canonical
X / XGroup / XGroupSetauthoring triples — sub-apps underdhis2 metadata, one client accessor per resource, following a single canonical-naming rule (lowercase + hyphenate the DHIS2 resource path). Shipped fororganisation-units(#174),data-elements(#175),indicators(#176), andcategory-options(#181), plus theprogram-indicatorspair (#180 — DHIS2 has noprogramIndicatorGroupSet). Each PR adds 15–19 MCP tools, full CLI verbs (list/get/create/ rename-like / per-item membership), and hand-written accessors that return typed generated models. No*Specbuilders — keyword args on the accessor (continues the spec-audit data point). The indicator accessor exposesvalidate_expression(context="indicator"), the program-indicator accessorvalidate_expression(context="program-indicator"), so callers can pre-flight numerator / denominator / expression references before a failed create.category-optionsadditionally shipsset_validity_window(uid, start_date, end_date)for the validity-window knob unique to that resource. - Aggregate data-set authoring —
dhis2 metadata data-sets+sectionssub-apps (#185).DataSetElement+Section.dataElements[]are handled as join tables with round-trip helpers:add_element(ds_uid, de_uid, category_combo_uid=...)carries the per-set CC override;sections.reorder(section_uid, [de_uids])replaces the ordered DE list in one PUT. Docstring calls out the DSE self-ref strip for DHIS2's read/write asymmetry. - Validation-rule + predictor CRUD —
dhis2 metadata validation-rules+predictors+ their groups (#186). Closes the author-then-run gap —dhis2 maintenance validation run/predictors runshipped long ago, but rules + predictors themselves couldn't be authored from CLI. Surface assemblesleftSide/rightSide/generatorExpression sub-objects from plain kwargs. - Bulk RFC 6902 patch —
client.metadata.patch_bulk(resource, [(uid, ops), ...], concurrency=8)+patch_bulk_multi(...)(#187). Client-side fan-out under a semaphore; per-UID failures land inBulkPatchResult.failures(withuid/resource/status_code/message) instead of raising. Building block for future CLI-level bulk verbs. - Bulk sharing —
client.metadata.apply_sharing_bulk(resource_type, uids, sharing)+apply_sharing_bulk_multi(by_resource, sharing)fan out oneSharingBuilderpayload across many UIDs under a concurrency semaphore. CLI surface asdhis2 metadata share <type> [UID...]with--public-access/--user-access UID:access/--user-group-access UID:access(repeatable) + stdin UID input via-sometadata list ... \| jq -r .id \| xargs metadata sharecomposes. Per-UID failures land inBulkSharingResult.failureswith the same row-level table renderer used byrename/retag. - Category dimension authoring (complete end-to-end) —
dhis2 metadata categories(#205) +category-combos+ read-onlycategory-option-combos(#208) + the one-passCategoryComboBuilderhelper (#209). Categories accept ordered--option UIDflags on create + per-itemadd-option/remove-optionshortcuts. CategoryCombos accept ordered--category UIDflags + await-for-cocs --expected Nmatrix-poll barrier handling DHIS2's async COC regeneration (cold-start can take tens of seconds, especially under arm64 emulation). Thecategory-combos build --spec FILEverb walks a declarativeCategoryComboBuildSpec(JSON or stdin) and ensures every CategoryOption -> Category -> CategoryCombo exists; idempotent, returning a typedCategoryComboBuildResultwith per-layer created-vs-reused breakdown. - Bundle-source metadata merge —
dhis2 metadata merge-bundle <target> <bundle.json>(#206) imports a saved JSON bundle into a target profile. Sibling to the source-profilemergeverb; same--strategy/--atomic/--include-sharing/--dry-runknobs. Useful when the bundle came from a savedmetadata export, was hand-crafted, or was produced by a non-DHIS2 tool.MergeResult.source_base_urlisbundle:<path>for traceability. - Tracker-schema authoring (complete end-to-end) —
dhis2 metadata tracked-entity-attributes+tracked-entity-types(#188) covers the leaf resources;tracked-entity-types add-attribute --mandatory --searchableround-trips the TETA join table.dhis2 metadata programs {list, get, create, rename, add-attribute, remove-attribute, add-to-ou, remove-from-ou, delete}(#189) covers the middle layer — WITH_REGISTRATION / WITHOUT_REGISTRATION program flavours, PTEA enrollment form linkage, per-item OU shortcuts.dhis2 metadata program-stages {list, get, create, rename, add-element, remove-element, reorder, delete}(#194) covers the inner layer — each stage's orderedprogramStageDataElements[]join table withcompulsory/displayInReports/allowFutureDate/allowProvidedElsewhereflags. Documents the DHIS2mergeMode=REPLACErequirement on Program + ProgramStage PUT (nested-list removal is additive without it) as a typed client-side workaround. - Codegen + base-client gap closure (#190–#192, #197). Generated
create(item, *, merge_mode, import_strategy, skip_sharing, skip_translation)+update(item, ...)forward the write-flag query params. Every generated resource exposesadd_collection_item(parent_uid, collection, item_uid)/remove_collection_item(...)for per-item POST/DELETE shortcuts. BaseDhis2Clientships typedpost(path, body, model=T)+put(path, body, model=T)wrappers (parallels the existing typedget). Hand-written accessor sweep across 28+24+10 files replaced_put_with_replace/ per-item loops / duplicated_uid_from_webmessagehelpers / single-objectget_raw + model_validate/ pagedlist_allviaresources.X.list(...)with the new surface. ~700 lines of duplication removed with no behavior change. dhis2 metadata rename+metadata retagverbs — bulk CLI verbs on top ofclient.metadata.patch_bulk(#195, #199, #200).renamehandles label-field add / strip prefix + suffix (idempotent both directions — won't double-apply, won't no-op-fail).retaghandles ref-field rewrites (categoryCombo,optionSet,legendSets) + enum field rewrites (aggregationType,domainType). Both take--filter(repeatable, same DSL asmetadata list) +--dry-run+--concurrency. Per-UID failures land in the sharedConflictRowrenderer used bymetadata import, so operators see row-level detail on partial failures.- CI coverage gate + failure threshold (#196, #202).
make coveragereplacesmake testin the CI test step; every run uploadscoverage.xmlas an artifact retained 14 days, and fails the build if coverage drops under 70%. Current baseline 73% (85k statements / 7.5k branches). - Playwright-driven OIDC login —
dhis2 profile login --no-browserprints the auth URL for copy-paste;dhis2w_browser.drive_oauth2_login(profile, user, pw)drives the full flow via Chromium (React login → Spring AS consent → loopback redirect) for CI + headless use cases.examples/cli/profile_oidc_login.sh+examples/client/oidc_login.pyauto-dispatch to the Playwright path whenDHIS2_USERNAME/DHIS2_PASSWORDare in env. - Predictor + validation seed fixtures — the Sierra Leone play42 snapshot now ships 2 BCG predictors (
avg+sumover 3-month windows) + a PredictorGroup + 2 output DEs, plus 2 BCG validation rules + a ValidationRuleGroup that reliably produce violations.dhis2 maintenance predictors run --groupanddhis2 maintenance validation run --grouphave concrete targets out of the box. - Interactive CLI pickers —
dhis2 profile defaultlaunches an arrow-key menu viaquestionary.
Beyond Java parity (not yet)¶
(Empty — major Java-parity gaps are closed.)
Explicit non-goals¶
- Python < 3.13. New typing features (StrEnum, TypeAliasType, PEP 604 unions, PEP 695 generics) justify the bump.
- DHIS2 < v42. Every backport fork splits the code; no deployed users so no reason.
- Flask / argparse / raw stdio MCP loops / hand-rolled TOML parsers; every slot has a chosen standard per the CLAUDE.md hard-requirements list in the repo root.
- A second filter DSL layered on top of DHIS2's
property:operator:valuestring syntax. See the dhis2-java-client comparison above for the rationale. - Synchronous client variant.
asyncthroughout is a hard requirement. dict[str, Any]crossing module boundaries. CLAUDE.md hard rule; enforced workspace-wide as of the typing sweep (#71-#74, #76). New code that proposes dict-in-signature needs explicit justification referencing a specific HTTP-boundary carveout.- Public distribution — no PyPI, no tagged releases, no
CHANGELOG.md. Workspace stays internal. dhis2 program-rule trace/ rule simulator — explicitly declined.
How this file gets updated¶
Greenfield voice; edits describe the current state of the plan, not its history. When a near-term item ships, delete it from the "near-term" list (don't rewrite to "already shipped"). Use the PR's own description for the history; this file is always about what's next.