Roadmap¶
A running inventory of what the workspace covers today, gaps surfaced during use, and the near-term plan. Pre-1.0, no deployed users; every item is a judgment call about priority, not a commitment.
Current state¶
Workspace surface¶
| Package | Role |
|---|---|
| dhis2w-client | Async HTTP client, pluggable auth, typed responses via generated models. Retry policy, task awaiter, connection-pool tuning all first-class. |
| dhis2w-codegen | /api/schemas → pydantic emitter + OAS spec-patches framework (synthesises Jackson discriminators upstream DHIS2 omits) |
| dhis2w-core | Plugin runtime + shared services (profiles, CLI errors, task watch, client context) |
| dhis2w-cli | Typer root that discovers every plugin (first-party + entry-point) |
| dhis2w-mcp | FastMCP server that mounts the same plugins (the full typed surface) |
| dhis2w-mcp-bridge | Single-tool (dhis2_cli) MCP bridge for small on-box models; read-only + host write-guard |
| dhis2w-mcp-router | Search+dispatch MCP router fronting upstream servers (portable ToolSearch); experimental, domain-neutral core, not yet published |
| dhis2w-browser | Playwright session helpers (auth through the DHIS2 login form) |
| dhis2w-bench | Local/cloud LLM benchmark harness (workspace-only, unpublished) |
CLI surface¶
Eighteen top-level domains: analytics, apps, browser, data, dev, doctor, files, maintenance, messaging, metadata, profile, route, schema, security, system, user, user-group, user-role. Each plugin shares a service.py between the CLI and MCP sides, so both surfaces make the same typed call.
d2w metadata has the full workflow surface:
- Core CRUD: list / get / patch (RFC 6902) / rename (bulk name/shortName/description add + strip prefix/suffix, --dry-run) / retag (bulk ref-field + enum rewrites: categoryCombo, optionSet, legendSets, aggregationType, domainType) / share (bulk apply one sharing block to many UIDs, with --public-access / --user-access UID:access / --user-group-access UID:access, stdin UID input via -, --dry-run) / merge-bundle (import a saved JSON bundle file into a target profile; sibling to the source-profile merge verb).
- Cross-resource search: d2w metadata search <query> fans out three concurrent /api/metadata?filter=<field>:ilike:<q> calls (id, code, name) and merges by UID. Full UID, partial UID, business code, or name fragment all flow through one verb.
- Bundle operations: export / import / diff (file-vs-file and file-vs-live) with per-resource filters + dangling-reference warning on export; diff-profiles for staging-vs-prod drift.
- Authoring sub-apps: options get / find / sync for OptionSet sync; attribute get / set / delete / find for cross-resource AttributeValue workflows; program-rule get / vars-for / validate-expression / where-de-is-used; sql-view list / get / execute / refresh / adhoc; viz list / get / create / clone / delete; dashboard list / get / add-item / remove-item; map list / get / create / clone / delete; legend-sets list / get / create / clone / delete; four full X / XGroup / XGroupSet authoring triples with canonical DHIS2 naming: organisation-units / organisation-unit-groups / organisation-unit-group-sets (plus organisation-unit-levels for per-depth rename), data-elements / data-element-groups / data-element-group-sets, indicators / indicator-groups / indicator-group-sets, and category-options / category-option-groups / category-option-group-sets; plus the program-indicators + program-indicator-groups pair (DHIS2 has no programIndicatorGroupSet). Aggregate data-set surface: data-sets list / get / create / add-element / remove-element / delete + sections list / get / create / add-element / remove-element / reorder / delete. Authoring flip side of maintenance runs: validation-rules {list,show,create,delete} + validation-rule-groups + predictors {list,show,create,delete} + predictor-groups. Tracker-schema authoring complete end-to-end: tracked-entity-attributes + tracked-entity-types (with TETA linkage) + programs {list,show,create,rename,add-attribute,remove-attribute,add-to-ou,remove-from-ou,delete} + program-stages {list,show,create,rename,add-element,remove-element,reorder,delete}. Category-dimension authoring complete end-to-end: categories {list,show,create,rename,add-option,remove-option,delete} + category-combos {list,show,create,rename,add-category,remove-category,wait-for-cocs,delete,build} (the build verb is the one-pass create-or-reuse helper for the full stack, fed a JSON CategoryComboBuildSpec) + read-only category-option-combos {list,show,list-for-combo}.
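The cross-resource search fan-out described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the workspace's implementation: `fake_fetch` stands in for the three filtered `/api/metadata` calls, the UIDs are made up, and `SearchHit` / `search` are hypothetical names with an assumed shape.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchHit:
    """Hypothetical hit shape: the resource type that matched, plus UID and name."""

    resource: str
    uid: str
    name: str


# Canned rows standing in for a live DHIS2 instance (made-up UIDs, sketch only).
_FAKE_DB = [
    SearchHit("dataElements", "ANCvisit001", "ANC 1st visit"),
    SearchHit("dataElements", "ANCvisit002", "ANC 2nd visit"),
    SearchHit("indicators", "Cov4ANC0001", "ANC 1 coverage"),
]


async def fake_fetch(field: str, query: str) -> list[SearchHit]:
    """Stand-in for one filtered call: case-insensitive substring match on one field."""
    await asyncio.sleep(0)  # yield to the event loop, as a real HTTP call would
    needle = query.lower()
    getters = {"id": lambda h: h.uid, "code": lambda h: "", "name": lambda h: h.name}
    return [hit for hit in _FAKE_DB if needle in getters[field](hit).lower()]


async def search(query: str) -> dict[str, list[SearchHit]]:
    """Fan out one request per field concurrently, then merge with UID dedup."""
    batches = await asyncio.gather(*(fake_fetch(f, query) for f in ("id", "code", "name")))
    seen: set[str] = set()
    merged: dict[str, list[SearchHit]] = {}
    for batch in batches:
        for hit in batch:
            if hit.uid in seen:  # the same object matched on more than one field
                continue
            seen.add(hit.uid)
            merged.setdefault(hit.resource, []).append(hit)
    return merged


results = asyncio.run(search("anc"))
```

In this toy data every row matches on both its UID and its name, so the dedup step is what keeps each object from appearing twice in the merged result.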
d2w doctor runs ~100 checks on a live instance (20 metadata-health probes + 81 DHIS2 integrity checks + BUGS tripwires).
MCP surface¶
Roughly 304 tools across 13 plugin groups (analytics_*, apps_*, customize_*, data_*, doctor_*, files_*, maintenance_*, messaging_*, metadata_* (~197), profile_*, route_*, system_*, user_*). Counts age with each release; the auto-regenerated MCP reference is the source of truth. Most operational CLI commands have a matching MCP tool; d2w dev, d2w browser, and profile mutations are intentionally CLI-only (see the capability matrix).
There are three MCP surfaces over this tool set — the full server, the single-tool bridge, and the search+dispatch router. The MCP surfaces map compares them and explains how to choose; all three carry a *_READONLY guard.
Typed models shipped¶
Via /api/schemas codegen (generated/v{41,42,43}/schemas/):
- 100+ metadata resources (DataElement, DataSet, OrganisationUnit, Indicator, Program, …) with full CRUD accessors including the RFC 6902 patch(uid, ops) method
- 77+ StrEnums for CONSTANT properties (ValueType, AggregationType, DataElementDomain, …)
- A shared Reference with both id and code fields
Via /api/openapi.json codegen (generated/v{N}/oas/, currently populated on v41, v42, v43):
- Every components/schemas entry: 562 classes + 260 StrEnums + 104 aliases on v42; 984 classes on v43.
- Consumers in dhis2w-client: envelopes.py, auth_schemes.py, aggregate.py, system.py, maintenance.py, and generated/v42/tracker.py are all thin shims over the OAS output.
- Emitter is deterministic + version-scoped; d2w dev codegen oas-rebuild --version v{N} regenerates from the committed openapi.json without network.
- Spec-patches framework for known-upstream OAS gaps (dhis2w_codegen.spec_patches). Each patch is idempotent + carries a bugs_ref pointer; the rebuild log names which gap was worked around. Current patches: *AuthScheme discriminators (BUGS.md #14, still unfixed in v43).
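To make the "idempotent + carries a bugs_ref pointer" contract concrete, here is a minimal sketch of what such a patch could look like. Everything in it is an assumption for illustration: `SpecPatch`, `add_auth_scheme_discriminator`, the `AuthScheme` schema name, and the `type` discriminator property are hypothetical and may not match the real `dhis2w_codegen.spec_patches` API.

```python
from dataclasses import dataclass
from typing import Any, Callable

Spec = dict[str, Any]


@dataclass(frozen=True)
class SpecPatch:
    """Hypothetical patch record: a name, a BUGS.md pointer, and the fix itself."""

    name: str
    bugs_ref: str
    apply: Callable[[Spec], bool]  # returns True only when the spec was changed


def add_auth_scheme_discriminator(spec: Spec) -> bool:
    """Add a discriminator to an (assumed) AuthScheme oneOf union."""
    schema = spec["components"]["schemas"].get("AuthScheme")
    if schema is None or "discriminator" in schema:
        return False  # nothing to patch, or already patched: an idempotent no-op
    schema["discriminator"] = {"propertyName": "type"}
    return True


PATCHES = [
    SpecPatch("auth-scheme-discriminator", "BUGS.md #14", add_auth_scheme_discriminator),
]


def apply_patches(spec: Spec) -> list[str]:
    """Run every patch; return log lines naming the gaps that were worked around."""
    return [f"patched {p.name} ({p.bugs_ref})" for p in PATCHES if p.apply(spec)]


spec: Spec = {"components": {"schemas": {"AuthScheme": {"oneOf": []}}}}
first_log = apply_patches(spec)
second_log = apply_patches(spec)  # the second run finds nothing left to change
```

Returning a changed/unchanged flag from each patch is one way to get both properties at once: re-running is safe, and the log only mentions gaps that were actually present in the spec.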
Remaining hand-written in dhis2w-client (by design):
- WebMessageResponse subclass + DataIntegrityReport / DataIntegrityResult / Me / Notification: helper methods and client-side convenience shapes that aren't in OpenAPI.
- AnalyticsMetaData: typed parser helper over Grid.metaData (a bare dict[str, Any] on the wire). Grid / GridHeader come straight from the OAS codegen.
- TrackerBundle: the POST /api/tracker envelope isn't in OpenAPI under that name. Thin wrapper on OAS tracker models.
- PeriodType + RelativePeriod StrEnums (24 period frequencies + 45 rolling windows; upstream Java enums the OpenAPI schema doesn't expose; see BUGS.md #28).
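As an illustration of why a typed helper over the bare metaData dict is worth hand-writing, a parser along these lines turns the untyped payload into named lookups. This is a sketch only: `MetaDataSketch` is a hypothetical name, and the `items` / `dimensions` layout is the assumed wire shape, not a statement of what AnalyticsMetaData actually exposes.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class MetaDataSketch:
    """Hypothetical typed view over the bare metaData dict of an analytics Grid."""

    names: dict[str, str] = field(default_factory=dict)
    dimensions: dict[str, list[str]] = field(default_factory=dict)

    @classmethod
    def parse(cls, raw: dict[str, Any]) -> "MetaDataSketch":
        # Assumed shape: {"items": {uid: {"name": ...}}, "dimensions": {dim: [uid, ...]}}
        items = raw.get("items") or {}
        names = {uid: str(item.get("name", uid)) for uid, item in items.items()}
        dims = {dim: list(uids) for dim, uids in (raw.get("dimensions") or {}).items()}
        return cls(names=names, dimensions=dims)

    def label(self, uid: str) -> str:
        """Human-readable name for a UID, falling back to the UID itself."""
        return self.names.get(uid, uid)


raw = {
    "items": {"abcDEF12345": {"name": "Penta-1 doses given"}},
    "dimensions": {"dx": ["abcDEF12345"], "pe": ["202401"]},
}
meta = MetaDataSketch.parse(raw)
```

Callers can then render `meta.label(uid)` in a table header without knowing how the payload is nested.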
Typing posture¶
The four-PR typing sweep (#71-#74) plus the codegen discriminator synthesis (#76) eliminated every dict[str, Any] signature that crosses module boundaries outside the explicit HTTP-boundary carveouts. Every service-layer function returns a typed pydantic model; MCP tools dump at the edge via _dump_model; CLI handlers dump for JSON output or Rich tables. The CLAUDE.md "no dict[str, Any] across module boundaries" rule is enforced workspace-wide.
Runtime features¶
- --profile / -p global override + ~/.config/dhis2/profiles.toml or ./.dhis2/profiles.toml auto-discovery
- --debug / -d global flag → stderr HTTP trace lines via dhis2w_client.http logger
- --watch / -w on job-kicking commands (analytics refresh, maintenance dataintegrity run) + standalone maintenance task watch with Rich progress UI
- --json opt-in on every write command; concise one-line summary by default
- Typed Dhis2ApiError.web_message parses the envelope on 4xx so the CLI surfaces conflicts[] / importCount / rejectedIndexes[] detail
- Client-side UID generation (generate_uid, generate_uids); no /api/system/id round-trip
- External plugin loading via importlib.metadata.entry_points(group="dhis2.plugins"); see examples/plugin-external/ for a minimal runnable reference
- Retry policy with exponential backoff + jitter + Retry-After header honouring. Idempotent-only by default; opt in for POST/PATCH per policy. Threads through Dhis2Client(retry_policy=...) and open_client(profile, retry_policy=...).
- Library-level task awaiter: client.tasks.await_completion(task_ref) blocks until DHIS2 reports completed=True; iter_notifications for streaming renderers.
- Connection-pool tuning: Dhis2Client(http_limits=httpx.Limits(...)) / open_client(profile, http_limits=...) for sizing against the real DHIS2 capacity.
- Data-integrity streaming iterator: client.maintenance.iter_integrity_issues(...) yields IntegrityIssueRows (issue + owning check's name / displayName / severity) as a flat stream.
- Files plugin: d2w files documents {list,get,upload,upload-url,download,delete} + d2w files resources {upload,get,download}.
- System metadata cache: TTL-bounded per-client in-memory cache on client.system for info() / default_category_combo_uid() / setting(key). 300 s default TTL.
- Bulk delete on client.metadata: delete_bulk(resource_type, [uids]) + delete_bulk_multi({...}) wrap POST /api/metadata?importStrategy=DELETE.
- client.metadata.search: cross-resource UID / code / name search; three concurrent /api/metadata?filter=<field>:ilike:<q> calls merged client-side with UID dedup. Typed SearchResults(query, hits: {resource: [SearchHit, ...]}, total).
- client.visualizations: VisualizationSpec typed builder (chart type, data elements, indicators, periods, relative periods, legend set, placement overrides) + create_from_spec / clone / list / delete accessor. RelativePeriod StrEnum covers the 45 rolling windows upstream OpenAPI exposes as boolean flags.
- client.maps: MapSpec + MapLayerSpec typed builder with indicators, legend_set, thematic / boundary / facility layer kinds; parallels the viz accessor.
- client.dashboards: DashboardSlot + add_item / remove_item on the dashboards accessor, no round-trip of the whole dashboard.
- Streaming data-value-set import: client.data_values.stream(source, content_type=...) feeds httpx's chunked transfer directly from a Path, bytes, sync / async iterable, or async generator. JSON / XML / CSV / ADX.
- Streaming analytics export: client.analytics.stream_to(destination, *, params, endpoint="/api/analytics.json") pipes httpx's chunked response straight to disk via aiter_bytes.
- Multi-instance metadata diff: d2w metadata diff-profiles <a> <b> -r <resource> exports two registered profiles concurrently and diffs them structurally.
Seed fixture¶
The committed e2e dump (infra/v42/dump.sql.gz) mirrors DHIS2 Play's Sierra Leone immunization demo with workspace-local additions: 1332 org units with GeoJSON geometries, 67 data elements, 3 indicators, 3 programs (Child Programme + Antenatal = tracker; Supervision visit = event), 2 datasets, 3 dashboards, 23 visualizations built programmatically via VisualizationSpec + 1 EventVisualization for the supervision program attached to the Immunization data dashboard, 8 maps built via MapSpec, 188k aggregate data values, 500 tracker entities, 12 sample supervision events covering 2024 monthly, 6 program rules + 10 program indicators. Workspace fixtures layered on top (infra/scripts/seed/workspace_fixtures.py): SNOMED_CODE attribute, VACCINE_TYPE option set with 5 fixed-UID options, 3 SqlViews (VIEW / QUERY / MATERIALIZED_VIEW), 2 BCG predictors + PredictorGroup + 2 output DEs, 2 BCG validation rules + ValidationRuleGroup, 4 named OrganisationUnitLevel records (Country / Province / District / Facility), 1 LegendSet (LsDoseBand1) attached to the Measles + Penta-1 monthly column charts. make refresh-and-verify wipes the stack, rebuilds the dump, runs every non-interactive example end-to-end, and reports a pass/fail summary as the regression gate. Skipped examples are the ones that need a real browser session (OIDC-login flows including the Playwright-driven variant), out-of-process screenshots, very long-running analytics jobs, or external network deps — the make target prints the per-run count.
CI¶
- .github/workflows/ci.yml runs make lint && make test && make docs-build on every PR
- .github/workflows/e2e.yml nightly: full DHIS2 stack + seeded fixtures + slow integration tests
Public distribution is now active: every workspace member (except dhis2w-codegen) publishes to PyPI under its own name. Tags use the vX.Y.Z scheme, and a CHANGELOG.md lives at the repo root. See Releasing to PyPI for the cut workflow.
Docs¶
- Auto-generated CLI reference (docs/cli-reference.md, ~10,300 lines from the Typer app) + MCP reference (docs/mcp-reference.md, roughly 304 tools across 13 groups from the FastMCP server). Both regenerated on every make docs-build; the counts age with each release.
- Narrative tutorials: docs/guides/cli-tutorial.md, docs/guides/client-tutorial.md, docs/guides/visualizations.md (step-by-step viz + dashboard composition).
- Examples index (docs/examples.md) catalogues the canonical v42 example set spread across cli / client / mcp on the v42 tree; v41 + v43 mirror most of them. Per-version totals printed by ls examples/v{41,42,43}/{cli,client,mcp}/ (the source of truth). Tracker-schema authoring examples (steps 1 / 2 / 3 under examples/v42/cli/tracker_*.sh) round-trip the full chain end-to-end.
- Architecture docs cover every plugin, the client, auth, profiles, codegen, typed schemas, plugins runtime, external plugins, MCP, versioning, browser automation.
- BUGS.md: nearly 40 upstream DHIS2 quirks with live curl repros + v43 re-audit status (entry count drifts as new ones land; the file itself is the source of truth).
Test coverage¶
Roughly 1,180 tests collected (uv run pytest --collect-only -q | tail -1 is the source of truth); the mocked tier runs in seconds via make test, and the slow-marked + contract tiers run in make test-slow / make test-contract against a live stack (Playwright PAT creation, dashboard screenshot capture, Playwright-driven OIDC login, contract tests against play.im.dhis2.org/dev-2-{42,43}). Unit + CliRunner + respx-mocked HTTP; integration paths use in-process FastMCP Client against the real plugin tree. make coverage runs branch-coverage locally + on every CI run (produces coverage.xml as an artifact); the per-PR floor is set at 70%.
Detailed test gaps + the planned next moves are in Testing roadmap below.
Upstream quirks tracked¶
Roughly forty entries in the repo-root BUGS.md (the file is the source of truth — grep -c '^- \[#' BUGS.md prints the live count). Recent additions cover the seed / workflow cycle: DataSet Hibernate flush ordering (#23), Person-TET built-in name collisions (#24), /api/.../metadata leaking computed fields (#25), admin OU scope cached per session (#26), fresh-install flakiness on first metadata import (#27), RelativePeriods OAS schema shape (#28), /api/metadata ignoring rootJunction (#29 — the reason metadata search has to fan out N requests instead of one), App Hub versions[*].created returning epoch-millis ints instead of ISO-8601 strings (#30), and the predictor-expression parser rejecting uppercase aggregators (#31 — forces avg() / sum() lowercase even though DHIS2 docs use uppercase). The v43-specific cluster (#33–#38) plus the v41 OAuth2 wire-shape quirk (#39) round out the recent set.
Gaps surfaced during use¶
Authoring surfaces¶
The organisation-unit PR (#174) set a template — canonical DHIS2 resource names, hand-written accessors, per-item membership shortcuts, no *Spec. The triples sweep (#174 / #175 / #176 / #180 / #181), aggregate data-set surface (#185), validation-rule + predictor CRUD (#186), the full tracker-schema stretch (TET + TEA #188, Program + PTEA + OU #189, ProgramStage + PSDE #194), and the category-dimension stack (Category #205, CategoryCombo + read-only CategoryOptionCombo #208, the one-pass CategoryComboBuilder helper #209) have all landed on top of it. No metadata-authoring gaps remain on the main workflow paths.
Optional ProgramStageSection grouping (rarely used in practice) is still unauthored; reach for metadata patch for it. That's the only known absence and it stays parked unless a concrete caller surfaces.
Security plugin: read-surface build-out¶
d2w security ships its first command — settings (the security slice of
/api/systemSettings: password policy, credential expiry, registration, lockout).
Deliberately small and read-only, built to grow one command at a time. The
security plugin page carries a step-by-step
extension recipe (service -> cli -> sweep v41/v43 -> example x3 -> docs -> test, plus
how to add an MCP surface). Candidate next commands, read-only first:
- d2w security whoami: authenticated user + roles + authority count (/api/me; typed Me exists).
- d2w security authorities: effective authorities (/api/me/authorities).
- d2w security password-policy --lint: pass/warn checks over settings against a baseline (sibling of the doctor probe model).
- d2w security sharing-defaults: default public-access / authority-grant settings for new metadata.
Writes (rotating credentials, toggling registration, editing security settings) stay out of scope until a concrete caller needs them.
OIDC / OAuth2 polish¶
- Token refresh is tested in code but undocumented for end users.
- Local OIDC login-page button is non-functional for browser clicks (CLI-only redirect_url); no per-provider "hide from login UI" flag in DHIS2 v42. Documented in docs/architecture/auth.md.
- Bearer-to-JSESSIONID path for browser workflows on OIDC profiles is unverified (flagged in authenticated_session docstring).
Metadata listing consolidation¶
Listing collapsed onto one surface — generic metadata list <type> + the metadata_list MCP tool (see the 2026-06-04 decisions-log entry). Three follow-ups remain:
- Re-expose type-specific list filters + curated columns. The dropped typed lists had ergonomic filters (--domain-type, --program-type, --period-type, viz --type, …) and resource-aware columns. They currently round-trip through the generic --filter <prop>:<op>:<value> DSL. Design how to surface the common ones on the canonical command/tool (named convenience flags? a per-resource filter registry?) before migrating docs/examples, so the rewrites aren't redone.
- Guard the /api/metadata?<resource>=true bundle export against giant payloads. For organisation units this can embed geojson geometry and balloon to a size that can overload the server. Needs a size/field guard (or a refusal with a --fields hint) on metadata export; warrants a BUGS.md entry once characterized with a repro.
- Migrate docs/examples: largely done. The stale references were swept: the removed metadata_<type>_list MCP example calls moved to the generic metadata_list(resource=...), the removed option-sets attribute CLI subgroup to metadata attributes, and user-group / user-role to user group / user role; the showcase doc examples were fixed. infra/scripts/check_example_refs.py (wired into make check-examples + CI) now resolves every example's CLI command paths against the Typer tree and every call_tool name against the live MCP tool set, so this class of drift fails the fast suite instead of surfacing only in nightly e2e.
Small-model bridge: CLI read-surface follow-ups¶
Surfaced by the dhis2w-mcp-bridge gap probes (small local models driving the CLI). Shipped: camelCase discovery + did-you-mean, type list --json, show→get help, the rewritten dhis2_cli docstring (incl. search/usage/field-presets/nested-filters/export-warning + a WRITES primer), single-string-arg tolerance, paging help, --filter nested/in/null help, read-only allowlist for metadata usage/export, the analytics/tracker/aggregate help-text fills, the tracker --program fix, malformed-UID pre-validation on metadata get (BUGS #42), relationship mutators + deletes honor --json, headless route create (--no-auth + clean ValidationError), files documents list --details (no more filename-as-FR-UID), and metadata share accepting the plural type. See docs/notes/small-model-bridge.md + docs/notes/bridge-verification.md. Remaining:
- Removed typed list discoverability: point metadata <subapp> list/show at metadata list <type> / get (hidden redirect commands or epilog).
- Missing authoring verbs (optionSets + userGroups create/delete): shipped as metadata option-sets create/delete + user-group create/delete (build the schema, POST via resources.<accessor>.create, return a typed WebMessageResponse). v41/v42/v43 + tests + examples.
- Inline tracker delete: shipped as data tracker delete / event delete / enrollment delete via delete_tracker_objects (minimal bundle + importStrategy=DELETE). v41/v42/v43 + tests + examples.
- data aggregate get is keyed by dataSet, set by dataElement, so a write can't be verified with the same key; consider a --de filter on get.
Near-term plan (next 3–5 PRs)¶
Latest cycle closed the category-dimension strategic option (Category #205, CategoryCombo + read-only CategoryOptionCombo #208, the one-pass CategoryComboBuilder create-or-reuse helper #209) plus the smaller metadata merge-bundle verb (#206). With every authoring path on the main workflow now covered, the codegen emitters fully regen-stable, and bulk verbs (rename / retag / share) shipped on top of patch_bulk / apply_sharing_bulk, the obvious tactical sweep is complete.
The near-term slate is once again open. The multi-version CI integration matrix (long-standing carry-over) and the *Spec-class audit are both resolved: the matrix runs e2e.yml across dhis2_version: [41, 42, 43] nightly; the spec audit settled on VisualizationSpec / MapSpec + MapLayerSpec / LegendSetSpec + LegendSpec (the rule for when a spec is justified is documented on api/legend-sets.md).
The natural next direction is one of:
- Pick one of the two remaining strategic options below and commit to a multi-PR body of work (data approval workflow, or audit log reader).
- Promote a medium-term tactical item (CLI startup latency, property-based DSL tests) for a focused 1-PR cycle.
- Land A1 (live-schema contract tests against play) — now first in the recommended testing order.
Demoted / parked:
- apps snapshot example + CI hook: the feature works, just the restore --dry-run demo still isn't in examples/v42/cli/apps.sh. Low value without an active need.
- ProgramStageSection grouping: rarely used in practice; metadata patch covers the occasional need. Promote if a concrete caller surfaces.
BUGS.md #15 (undiscriminated JobConfiguration.jobParameters + WebMessage.response unions) stays off the near-term list: the sibling-field discriminator pattern doesn't fit the AuthScheme-style spec-patches approach, and the scheduler plugin isn't an active workflow. Revisit when someone hits a real-world need.
Strategic options (pick one before the next cycle)¶
Two independent directions — the right order depends on where the pain is. Each would be a multi-PR body of work.
1. Data approval workflow plugin¶
/api/dataApprovals + /api/dataApprovalLevels + /api/dataApprovalWorkflows cover multi-level aggregate approval (district → zone → ministry sign-off). Common in humanitarian + government reporting pipelines. Surface:
- d2w dataapproval status <ds> <pe> <ou>: which level is this cell at? APPROVED_HERE / APPROVED_ABOVE / UNAPPROVED_READY / UNAPPROVED_WAITING.
- d2w dataapproval approve / unapprove / accept / unaccept: the four write verbs.
- d2w dataapproval bulk-status <ds> <pe>: every org unit for one dataset-period, exit-on-incomplete mode for CI.
- Typed DataApprovalStatus enum + level-aware state machine.
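A rough sketch of the typed enum and the state logic this plugin would need, limited to the four states named above. The transition table and the `exit_code` helper are assumptions made for illustration; the real DHIS2 approval model has more nuance (levels, acceptance) than this captures.

```python
from enum import Enum
from typing import Iterable


class DataApprovalStatus(str, Enum):
    """The four states named above (a str-mixin Enum; StrEnum also works on 3.11+)."""

    UNAPPROVED_WAITING = "UNAPPROVED_WAITING"
    UNAPPROVED_READY = "UNAPPROVED_READY"
    APPROVED_HERE = "APPROVED_HERE"
    APPROVED_ABOVE = "APPROVED_ABOVE"


# Assumed transition table: which write verbs make sense from each state.
_ALLOWED: dict[DataApprovalStatus, frozenset[str]] = {
    DataApprovalStatus.UNAPPROVED_WAITING: frozenset(),  # lower levels not done yet
    DataApprovalStatus.UNAPPROVED_READY: frozenset({"approve"}),
    DataApprovalStatus.APPROVED_HERE: frozenset({"unapprove", "accept"}),
    DataApprovalStatus.APPROVED_ABOVE: frozenset(),  # locked by a higher level
}


def allowed_verbs(status: DataApprovalStatus) -> frozenset[str]:
    """Verbs a CLI could offer for a cell in this state."""
    return _ALLOWED[status]


def exit_code(statuses: Iterable[DataApprovalStatus]) -> int:
    """Exit-on-incomplete for CI: 0 only when every org unit is approved."""
    approved = {DataApprovalStatus.APPROVED_HERE, DataApprovalStatus.APPROVED_ABOVE}
    return 0 if all(s in approved for s in statuses) else 1
```

Parsing the wire value with `DataApprovalStatus("APPROVED_HERE")` fails loudly on an unknown state, which is the point of typing the enum rather than passing strings around.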
2. Audit log reader¶
DHIS2's /api/audits/* endpoints track every write by user / timestamp / entity-uid (for DE values, tracker payloads, metadata changes). No wrapper today; integrations that need a "who changed X and when" history have to hand-build URLs.
- d2w audit data-values --de <uid> [--ou <uid>] [--pe <pe>]: stream every change for a cell.
- d2w audit metadata --klass DataElement --uid <uid>: metadata edit history for one resource.
- d2w audit tracker-entity <uid>: tracker write audit.
- client.audit.iter_data_values(...) / iter_metadata(...) async iterators for library callers.
Niche but valuable for compliance + forensics use cases.
Medium-term¶
- Multi-version CI matrix: shipped (#236). e2e.yml runs nightly across dhis2_version: [41, 42, 43] matrix.
- Cold-open / import latency (priority: client > MCP > bridge). What matters is how fast a process becomes usable: library scripts (from dhis2w_client import Dhis2Client), each dhis2w-mcp-bridge spawn, and the MCP server boot; not d2w --help. Measured with python -X importtime on the venv interpreter (post-#395):
    - from dhis2w_client import Dhis2Client: ~530 ms (bare import dhis2w_client is already ~3 ms via the package's PEP 562 __getattr__). importtime attributes ~458 ms to dhis2w_client.generated.v42.oas: importing the client pulls generated/v42/resources.py, which does from .schemas.<x> import <Class> at module top for every accessor → all 562 generated schema classes load eagerly, even though a given call touches a handful. This is the top lever in the repo. Fix: lazy schema imports on the resource accessors (PEP 562 __getattr__ on generated/v{N}/resources.py + schemas/__init__.py, or per-accessor deferred import), so client cold-open falls toward the ~60 ms httpx+pydantic floor. Target < 100 ms. Effort: medium; codegen-template change across v41/v42/v43, and must keep FastMCP forward-ref resolution working (the MockValSer / SchemaSerializer trap that sank the earlier perf/lazy-oas-init branch; siblings must still resolve when only one schema submodule is loaded).
    - MCP build_server(): ~2.0 s. _eager_rebuild_tool_return_types()'s model_rebuild() loop (~900 ms) + the 458 ms client/OAS import + fastmcp/mcp framework. Fixing the client lazy-import removes the 458 ms for free; the rebuild stays correct-by-necessity (FastMCP serialises tool returns without prior validation) but could be narrowed to only classes reachable from a registered tool, or deferred to first-serialise per class. One-time on a long-lived server → second priority.
    - import dhis2w_mcp_bridge: ~440 ms. fastmcp (~196 ms) + mcp (~152 ms) framework import; the bridge shells out to d2w so it never touches the OAS tree. Matters because every local-model bridge round cold-opens the process. Less actionable (third-party); defer the fastmcp import until the server starts, or trim to the minimal MCP server surface. Lowest of the three.
    - Stale claim corrected: the old "d2w --help ~2 s, target < 400 ms" no longer holds; post-#395, d2w --version and d2w metadata --help are ~0.6 s. The remaining real cost is the client OAS import above, not Typer/plugin construction.
- Other measured perf items (independent; from a sweep, with file:line + rough effort/payoff):
    - apps update_all is sequential (v42/plugins/apps/service.py:213): N installed apps updated one POST at a time; asyncio.gather them. ~5 min, saves N-1 round-trips.
    - HTTP pool defaults untuned (dhis2w-client/v42/client.py): httpx defaults (100 max / 20 keepalive) cap high-fan-out asyncio.gather callers; document/raise http_limits for bulk workflows.
    - Browser screenshot loops sleep a fixed 2 s/item (v42/plugins/browser/service.py:346,465): poll for render (Playwright wait_for_load_state) instead; ~1-2 s/item on large map/visualisation runs.
    - verify-examples runs strictly sequentially (infra/scripts/verify_examples.py run_suite loop): ~1000 s for ~180 examples against one shared stack. Read-only examples (~60%) could run under a bounded asyncio semaphore (3-4×) with writes kept serial; biggest CI/dev-loop win but medium effort + write-race risk.
    - verify-examples .py examples spawn uv run python: pin the venv interpreter once and invoke it directly to shave per-spawn uv resolution (~30-50 s across the suite).
    - make lint mypy/pyright are non-incremental: add mypy incremental = true (and optionally dmypy) for 3-10× faster local re-lints (CI cold-cache unaffected).
    - CI e2e installs Playwright Chromium on every matrix leg: cache ~/.cache/ms-playwright across the v41/v42/v43 legs (~2-6 min/run).
- Property-based testing on filter / order DSL parsing.
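The lazy-schema-import fix proposed above relies on PEP 562's module-level `__getattr__`. A minimal, self-contained sketch of the mechanism follows. It builds a throwaway module in memory and maps names to stdlib classes purely for demonstration; `make_lazy_module` and the `_LAZY` table are hypothetical, and a real generated `schemas/__init__.py` would map schema class names to their submodules instead.

```python
import importlib
import types

# Hypothetical map: public name -> (module to import on demand, attribute inside it).
_LAZY = {
    "Decimal": ("decimal", "Decimal"),
    "Fraction": ("fractions", "Fraction"),
}


def make_lazy_module(name: str, lazy: dict[str, tuple[str, str]]) -> types.ModuleType:
    """Build a module whose attributes are imported only on first access."""
    module = types.ModuleType(name)

    def _lazy_getattr(attr: str):
        # Called by the import system only when normal attribute lookup fails.
        try:
            target_module, target_attr = lazy[attr]
        except KeyError:
            raise AttributeError(f"module {name!r} has no attribute {attr!r}") from None
        value = getattr(importlib.import_module(target_module), target_attr)
        setattr(module, attr, value)  # cache, so later lookups bypass this hook
        return value

    module.__getattr__ = _lazy_getattr  # PEP 562 hook lives in the module namespace
    return module


schemas = make_lazy_module("lazy_schemas", _LAZY)
loaded_before = "Decimal" in vars(schemas)  # False: nothing imported yet
three = schemas.Decimal("1.5") * 2          # first access triggers the import
loaded_after = "Decimal" in vars(schemas)   # True: now cached on the module
```

The sketch does not address the forward-reference problem noted above: pydantic models that reference sibling classes still need those siblings resolvable when only one submodule has been loaded.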
Long-term / exploratory¶
- Further dhis2w-browser workflows, layered on authenticated_session: Maintenance app driving (actions that don't have REST), Org-unit-tree drag-drop edits. Dashboard creation is covered by the REST DashboardsAccessor.add_item; layout drag-drop is UI-only but deferred until a concrete need appears.
- Scheduled jobs plugin (/api/jobConfigurations): blocked on BUGS.md #15 (undiscriminated jobParameters + WebMessage.response unions). Revisit when the OAS discriminator is fixed upstream, or when a concrete scheduling workflow forces us to hand-roll typed payloads for the common job types.
- Interactive aggregate-data-entry TUI: d2w data entry <ds> <pe> <ou> launches a terminal spreadsheet bound to one data set × period × org unit. Questionary or textual for the UI; posts via client.data_values.stream on save. Powerful offline-capable data-entry fallback when the UI is down.
- dhis2w-chrome: local-LLM browser extension (PII-safe). A Chrome extension that drives DHIS2 from the browser using a local LLM. It fetches the user's on-box OpenAI-compatible endpoint (LM Studio / Ollama on localhost) instead of a cloud API, so patient/tracker data never leaves the machine. The competitive wedge over cloud-based competitor extensions: those are legally unusable for PII deployments (health-data law), where a local extension is the only option. Reuses the local-inference foundation (the ModelBackend story, model selection, the bridge's discovery lessons). Same decision boundary as everywhere else: local for PII, cloud for aggregate. Alternative in-browser routes (WebLLM/transformers.js via WebGPU; Chrome's built-in Prompt API / Gemini Nano) are weaker and parked. A new product surface (the repo has dhis2w-browser for Playwright automation, not an extension); post-1.0.
- Router as the default MCP surface for all clients (cloud + local)? dhis2w-mcp-router (search+dispatch over upstream MCP servers; see surfaces + design) is a strong candidate to become the recommended entry point for everyone, not just small local models. It future-proofs against tool-surface growth (more DHIS2 tools never inflate the model's context, because search is lazy), gives one chokepoint for read-only/host/audit policy, federates multiple servers, and offers typed discovery (validated: gemma-4-26b-a4b-qat drove the full 311-tool surface through it at 16k context). But "for all" is not yet earned. It has real trade-offs vs connecting to the full server directly: a search_tools round-trip per task (extra latency a capable cloud model holding the full payload avoids), a proxy hop + failure point, and a hard dependence on search quality (keyword ranking is crude today; a missed search = a tool the model can't reach, whereas the full surface has no "search missed it" failure mode). The decision is gated on data: the bench-router lane (local models over router vs direct full-mcp vs bridge) plus embeddings-based ranking should settle whether router-for-all holds, or whether the honest answer is "router default for local + growing surfaces; direct full server as the low-latency escape hatch for capable cloud; bridge as the max-simplicity/max-security option." Don't promote it to default-for-all until the numbers say so; same measure-don't-assert discipline as the oracle.
- Multi-backend ModelBackend (Ollama / llama.cpp): the local-model validation harness (packages/dhis2w-bench/src/dhis2w_bench/backend.py) abstracts model lifecycle (list / load / unload / server / chat-url) behind a ModelBackend Protocol, with LmStudioBackend the only implementation today (selectable via MODEL_BACKEND). Adding OllamaBackend (auto-loads on first request, evicts by keep-alive, size via /api/tags) and LlamaCppBackend (one model per llama-server process, or llama-swap) would let the benches drive any local runtime. The inference call is already portable (OpenAI-compatible /v1); only lifecycle differs. This is really a standalone-tool concern: the general bench + ModelBackend + bench-list are DHIS2-agnostic and could extract out of the workspace entirely (only bench-bridge is inherently dhis2w). Interesting side project; parked until a non-LM-Studio runtime is actually in play.
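For readers unfamiliar with the Protocol approach mentioned in the ModelBackend item, the shape could look something like this sketch. The method names, signatures, and the toy `InMemoryBackend` are assumptions for illustration only; the actual Protocol in dhis2w-bench may differ, and the "server" lifecycle step is left out here.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class ModelBackend(Protocol):
    """Lifecycle only; inference itself goes over an OpenAI-compatible /v1 API."""

    def list_models(self) -> list[str]: ...

    def load(self, model: str) -> None: ...

    def unload(self, model: str) -> None: ...

    def chat_url(self) -> str: ...


class InMemoryBackend:
    """Toy backend (hypothetical): tracks loaded models in a set, talks to nothing."""

    def __init__(self, available: list[str], base_url: str = "http://localhost:1234/v1") -> None:
        self._available = list(available)
        self._base_url = base_url.rstrip("/")
        self.loaded: set[str] = set()

    def list_models(self) -> list[str]:
        return list(self._available)

    def load(self, model: str) -> None:
        if model not in self._available:
            raise ValueError(f"unknown model: {model}")
        self.loaded.add(model)

    def unload(self, model: str) -> None:
        self.loaded.discard(model)

    def chat_url(self) -> str:
        return f"{self._base_url}/chat/completions"


def run_with(backend: ModelBackend, model: str) -> str:
    """Bench-style usage: load, resolve the chat endpoint, always unload."""
    backend.load(model)
    try:
        return backend.chat_url()
    finally:
        backend.unload(model)


backend = InMemoryBackend(["model-a", "model-b"])
url = run_with(backend, "model-a")
```

Because the Protocol is structural, an Ollama or llama.cpp backend would only need to provide the same methods; no shared base class is required.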
Testing roadmap
The unique shape of this project (we generate code from a moving REST API, then hand-write CLI / MCP / auth layers on top) dictates the testing surface. Bugs slip in at seven layers, each best caught with a different tool.
Layered overview
| Layer | What can break | Today | Strongest tool |
|---|---|---|---|
| Static | Type errors, unused imports, dead code | ruff + mypy + pyright (good) | + add deptry for unused / missing deps |
| Unit | Pure logic, parsers, builders | ~1,100 tests, respx-mocked HTTP (good) | + property-based + mutation |
| Codegen | Generator emits wrong code | Snapshot tests on the emitted tree pin the diff per PR | + mutation tests on the templates |
| Schema contract | Generated code stops matching live API | @pytest.mark.contract suite hits play.im.dhis2.org/dev-2-{42,43} | Widen to more resources + nightly cron |
| Live integration | End-to-end against real DHIS2 | E2E workflow matrix runs make test-slow against docker stack v41/v42/v43 | Add a read-only per-PR contract pass |
| Examples | Documented usage drifts from reality | make verify-examples (nightly E2E) + check_example_refs.py resolves every example's CLI/MCP reference in fast CI | Snapshot stdout for diff-against-baseline |
| Upstream bugs | Workaround breaks; fix lands and we don't notice | @pytest.mark.upstream_bug pairs bug-still-present + workaround halves | Lifecycle automation: open issue when bug clears |
Tier A: high leverage, ~1 PR each
A1. Schema contract tests against the live play instances (per-PR, read-only). — shipped.
@pytest.mark.contract suite + .github/workflows/contract.yml cover representative resources against play.im.dhis2.org/dev-2-{42,43}. Each test fetches one real instance and runs it through the generated pydantic model, asserting it validates. Catches DHIS2 ship-day API changes before users do. Next iteration: widen the resource set + add a nightly cron alongside the PR-trigger.
A2. BUGS.md regression-suite scaffolding. — shipped.
@pytest.mark.upstream_bug marker pairs bug-still-present + workaround halves; see packages/dhis2w-client/tests/test_upstream_bugs.py. make test-upstream-bugs runs the whole catalogue. Next iteration: lifecycle automation (open a tracking issue when a bug-still-present test starts failing — the signal to delete the workaround).
A3. Multi-version CI matrix — shipped.
.github/workflows/e2e.yml runs nightly across dhis2_version: [41, 42, 43]. Each matrix job pulls the matching infra/v{N}/dump.sql.gz, brings up dhis2/core:{N}, seeds, and runs make test-slow. fail-fast: false so one version's hiccup doesn't cancel the others; per-job concurrency keyed on the matrix value so matrix jobs don't fight over the run-slot.
A4. Property-based tests for the parser-shaped code paths. Hypothesis is overkill for happy-path business logic but devastatingly effective for parsers. Targets:
- generate_uid: distribution properties (no character bias, all 11 chars, 62-symbol alphabet).
- Period parsing (LAST_3_MONTHS, 202403, 2024Q1, 2024S2, 2024W12, …).
- Filter DSL (name:ilike:foo, code:in:[a,b], nested attributeValues.attribute.id:eq:UID).
- JSON Patch RFC 6902 round-trip: apply then invert; the composition should be a no-op.
- URL construction: no double-slashes, correct encoding, .json suffix on /api/analytics/* (BUGS.md #1).
One PR per parser, ~50 lines of hypothesis strategies + 5 properties each.
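As a rough illustration of the generate_uid target, the properties can be checked with a plain randomized loop before any Hypothesis strategies exist. This is only a sketch: the generate_uid below is a hypothetical stand-in, not the workspace implementation, and it assumes UIDs are 11 characters, start with a letter, and draw the rest from the 62-symbol alphanumeric alphabet. A real property test would swap the loop for Hypothesis.

```python
import random
import string

# Assumption: 11-character UIDs, first character a letter, remaining ten
# drawn from the 62-symbol alphanumeric alphabet.
LETTERS = string.ascii_letters
ALPHABET = string.ascii_letters + string.digits
UID_LENGTH = 11


def generate_uid(rng: random.Random) -> str:
    """Hypothetical stand-in for the workspace's generate_uid."""
    first = rng.choice(LETTERS)
    rest = "".join(rng.choice(ALPHABET) for _ in range(UID_LENGTH - 1))
    return first + rest


def check_uid_properties(samples: int = 20_000, seed: int = 0) -> dict[str, int]:
    """Assert shape properties on many UIDs; return tail-character counts."""
    rng = random.Random(seed)
    counts = dict.fromkeys(ALPHABET, 0)
    for _ in range(samples):
        uid = generate_uid(rng)
        assert len(uid) == UID_LENGTH
        assert uid[0] in LETTERS
        assert all(ch in ALPHABET for ch in uid)
        for ch in uid[1:]:
            counts[ch] += 1
    return counts


counts = check_uid_properties()
expected = sum(counts.values()) / len(counts)
# Crude bias check: every symbol stays within 10 % of the uniform expectation.
assert all(abs(n - expected) / expected < 0.10 for n in counts.values())
```

The 10 % tolerance is deliberately loose for this sample size; a tighter statistical test would be a reasonable follow-up.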
A5. Generated-code golden snapshots. — shipped.
packages/dhis2w-codegen/tests/test_snapshots.py loads each committed schemas_manifest.json, runs emit() + emit_from_openapi() into a tmp dir, and asserts byte-for-byte equality against the committed generated/v{N}/ tree. Parameterised over v41 / v42 / v43. CI fails the moment codegen drifts from the committed tree.
A6. Fill plugin coverage gaps (5–7 PRs of test writing). Two whole plugins + half a dozen CLIs are far below the 70 % workspace floor. The workspace gate stays green only because the well-covered codegen + client surface averages it out. Per-package gates (B2) would fail these immediately:
| Plugin / file | Current | Notes |
|---|---|---|
| plugins/aggregate/cli.py | 33 % | Service is at 76 %; the CLI + MCP wrappers around it lack respx-driven coverage. |
| plugins/aggregate/mcp.py | 33 % | Same gap, MCP side. |
| plugins/dev/admin_auth.py | 24 % | Highest-priority: admin Basic-auth bootstrap, tested only by integration. |
| plugins/dev/sample.py | 20 % | 442 LOC across five sub-modules, no respx tests on the sample-data emitters. |
| plugins/dev/pat.py | 40 % | PAT mint / list / revoke through MCP. |
| plugins/dev/oauth2.py | 55 % | OAuth2 client CRUD + Bearer-mint paths. |
| plugins/tracker/cli.py | 22 % | The most surface-area CLI: register + enroll + event + relationship verbs. |
| plugins/profile/cli.py | 25 % | Multi-flow CLI (basic / PAT / OAuth2 / OIDC); most flows exercised only via the end-to-end example suite. |
| plugins/route/cli.py | 37 % | /api/routes lifecycle wrappers. |
| plugins/user/cli.py | 31 % | User CRUD verbs. |
| plugins/user_group/cli.py | 26 % | UserGroup CRUD verbs. |
Each plugin gets one PR: respx-driven happy-path + error-shape tests at the service layer, typer.testing.CliRunner smoke tests for every CLI verb, in-process httpx.AsyncClient integration through the FastMCP server for the MCP wrappers. Estimate: 5–7 PRs total, ~3–4 days of focused work. Worth doing before pinning per-package gates (B2).
Tier B: medium leverage, ~2-3 PRs
B1. Mutation testing weekly.
mutmut or cosmic-ray against packages/dhis2w-client/src/ and packages/dhis2w-core/src/plugins/*/service.py. Surface mutations that survive — each survivor is either a missing test or dead code. Run weekly (it's slow); fail when survivor count goes up vs baseline.
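The "fail when survivor count goes up vs baseline" rule could be sketched as a small ratchet. Everything here is an assumption for illustration: the survivor_gate name, the JSON baseline file shape, and the choice to lower the baseline automatically when the count improves. How the survivor count is extracted from mutmut or cosmic-ray output is left out.

```python
import json
from pathlib import Path


def survivor_gate(baseline_path: Path, current_survivors: int) -> bool:
    """Return True when the run passes, i.e. survivors did not increase.

    Assumption: the baseline is a small JSON file shaped like
    {"survivors": 12}; a missing file means this is the first run.
    """
    if not baseline_path.exists():
        baseline_path.write_text(json.dumps({"survivors": current_survivors}))
        return True
    baseline = json.loads(baseline_path.read_text())["survivors"]
    if current_survivors > baseline:
        return False
    if current_survivors < baseline:
        # Ratchet down so a later regression cannot hide behind the old baseline.
        baseline_path.write_text(json.dumps({"survivors": current_survivors}))
    return True
```

A ratchet keeps the gate honest without manual baseline edits, at the cost of making a temporary improvement permanent.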
B2. Per-package coverage gates.
make coverage is workspace-wide at 70 %. That hides the case where dhis2w-client is at 95 % and a peripheral plugin is at 30 %. Split into per-package thresholds; show a coverage diff in PR comments via codecov / coveralls / a simple gh-action. Pin dhis2w-client higher than the rest since it's the public-API surface.
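A minimal sketch of what per-package gating might look like, assuming coverage percentages have already been collected per package. The THRESHOLDS table, the 90 % figure for dhis2w-client, and the check_package_gates helper are hypothetical; only the 70 % floor and the 95 % / 30 % example come from the text above.

```python
from dataclasses import dataclass

DEFAULT_FLOOR = 70.0  # the workspace-wide floor mentioned in the text

# Hypothetical per-package overrides; the 90 % figure is illustrative only.
THRESHOLDS = {"dhis2w-client": 90.0}


@dataclass(frozen=True)
class GateFailure:
    package: str
    actual: float
    required: float


def check_package_gates(coverage: dict[str, float]) -> list[GateFailure]:
    """Return one GateFailure per package below its own threshold."""
    failures = []
    for package, actual in sorted(coverage.items()):
        required = THRESHOLDS.get(package, DEFAULT_FLOOR)
        if actual < required:
            failures.append(GateFailure(package, actual, required))
    return failures


# The case from the prose: a strong client and a weak peripheral plugin.
report = {"dhis2w-client": 95.0, "peripheral-plugin": 30.0}
for failure in check_package_gates(report):
    print(f"{failure.package}: {failure.actual:.0f} % < {failure.required:.0f} %")
# prints: peripheral-plugin: 30 % < 70 %
```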
B3. Tracker write end-to-end test suite.
Tracker is the most error-prone area (envelope shapes, atomic / non-atomic modes, importStrategy semantics, soft-delete behaviour). An integration suite that creates a tracked entity with enrollment + events, updates each via PATCH, deletes them, verifies cleanup. Run nightly across the matrix to catch tracker-specific drift between versions.
B4. MCP tool catalogue contract test. Walk every tool registered by FastMCP, assert:
- Tool input schema is valid JSON Schema.
- Docstring is non-empty.
- Tool name follows <plugin>_<resource>_<verb> convention.
- Return-type annotation is a BaseModel.
Stops the MCP surface from quietly degrading (missing docstrings, untyped returns).
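Three of the four checks above can be sketched without FastMCP at all. The code below is an assumption-heavy illustration: ResultModel stands in for pydantic's BaseModel so the example stays stdlib-only, the two tool functions are invented, and the regex is one possible reading of the naming convention. JSON Schema validation of tool inputs is omitted.

```python
import re
from typing import Callable, get_type_hints

# Assumption: lowercase snake_case with at least plugin + resource + verb.
TOOL_NAME = re.compile(r"^[a-z0-9]+(_[a-z0-9]+){2,}$")


class ResultModel:
    """Stand-in for pydantic.BaseModel so the sketch stays stdlib-only."""


class ConversationList(ResultModel):
    pass


def messaging_conversation_list() -> ConversationList:
    """List message conversations (hypothetical tool)."""
    return ConversationList()


def whoami():
    return {}


def audit_tool(name: str, fn: Callable[..., object]) -> list[str]:
    """Return the convention violations for one registered tool."""
    problems = []
    if not TOOL_NAME.match(name):
        problems.append("name does not match <plugin>_<resource>_<verb>")
    if not (fn.__doc__ or "").strip():
        problems.append("empty docstring")
    returns = get_type_hints(fn).get("return")
    if not (isinstance(returns, type) and issubclass(returns, ResultModel)):
        problems.append("return annotation is not a model")
    return problems


print(audit_tool("messaging_conversation_list", messaging_conversation_list))
print(audit_tool("whoami", whoami))
```

In a real test the (name, function) pairs would come from walking the server's registered tools rather than from hand-written functions.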
B5. Live-instance smoke tests against play, parallel matrix.
Beyond contract tests (A1): actual d2w system whoami, d2w metadata list dataElements --limit 5, etc., run against play.im.dhis2.org/dev-2-{42,43} in parallel. Confirms that a shipped release actually works against real DHIS2.
Tier C: exotic / specialty
C1. Snapshot example stdout.
make verify-examples reports PASS / FAIL but doesn't pin output. Add --snapshot mode that records stdout into examples/.snapshots/. CI fails when output drifts unexpectedly. Catches "still passes" examples that produce subtly different / wrong output.
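One way the snapshot comparison might work, as a sketch only. The check_snapshot helper, its record flag (playing the role of the proposed --snapshot mode), and the one-file-per-example layout are assumptions, not existing behaviour of make verify-examples.

```python
from pathlib import Path


def check_snapshot(name: str, stdout: str, snapshot_dir: Path, record: bool = False) -> bool:
    """Compare captured stdout against a stored snapshot.

    record=True writes (or overwrites) the baseline instead of comparing.
    A missing baseline in compare mode counts as a failure.
    """
    path = snapshot_dir / f"{name}.txt"
    if record:
        snapshot_dir.mkdir(parents=True, exist_ok=True)
        path.write_text(stdout, encoding="utf-8")
        return True
    if not path.exists():
        return False
    return path.read_text(encoding="utf-8") == stdout
```

Outputs that contain timestamps or generated UIDs would need normalising before comparison, otherwise every run drifts.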
C2. Schema drift watcher (weekly cron).
Cron job that runs d2w dev codegen diff against the live play instances. If the committed manifest no longer matches what live reports, post an issue. The "DHIS2 just shipped 2.43.2" early-warning system.
C3. Performance benchmarks + regression detection.
pytest-benchmark for:
- CLI startup time (already on roadmap, ~2 s today, target < 400 ms).
- MCP list_tools latency.
- Generated-code import time (the 562 OAS classes pydantic-rebuilds).
- Bulk fetch (1 k metadata items).
Store baselines in CI; fail PRs that regress > 20 %.
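The "regress > 20 %" rule reduces to a small comparison over stored timings. The sketch below assumes timings in seconds keyed by benchmark name; find_regressions and the sample numbers (other than the ~2 s CLI startup) are illustrative, and reading results out of pytest-benchmark is left out.

```python
def find_regressions(
    baseline: dict[str, float],
    current: dict[str, float],
    tolerance: float = 0.20,
) -> dict[str, float]:
    """Return {benchmark: fractional slowdown} for runs slower than tolerance."""
    regressions = {}
    for name, base in baseline.items():
        if name not in current or base <= 0:
            continue  # new or unusable baseline entries are skipped, not failed
        slowdown = (current[name] - base) / base
        if slowdown > tolerance:
            regressions[name] = slowdown
    return regressions


baseline = {"cli_startup_s": 2.0, "mcp_list_tools_s": 0.50}
current = {"cli_startup_s": 2.5, "mcp_list_tools_s": 0.55}
print(find_regressions(baseline, current))  # {'cli_startup_s': 0.25}
```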
C4. Hypothesis-driven fuzzer for the OAS generator.
Generate adversarial OpenAPI specs (deeply nested oneOf, missing discriminators, recursive refs). Run oas_emit against them; assert it doesn't crash; collect cases where it does. One-time investment that finds latent oas_emit.py bugs.
C5. Browser / UI tests. Playwright is a runtime dep (for screenshot capture, OIDC login automation), not a test surface. The screenshot output IS the test today — compare PNG to a golden — and that's enough.
What we're explicitly skipping
- Load testing. Not a server; the bottleneck is always the upstream DHIS2 instance, not our client. Premature.
- Contract testing via Pact / Schemathesis. The OpenAPI spec is too unreliable (BUGS.md #14, #15, #28 are spec-quality issues). Our own contract tests against live instances pay better.
- Hypothesis-jsonschema for the OAS models. Tempting, but the extra="allow" shapes spin Hypothesis on impossible negative cases.
- Mutation testing on generated code. Mechanically derived; mutations there don't tell us anything we can fix.
Recommended order
A3 is shipped (e2e.yml matrix runs across dhis2_version: [41, 42, 43] nightly, v43 dump committed at infra/v43/dump.sql.gz). The remaining order:
- A1: live-schema contract tests against play, per-PR. Cheapest highest-leverage thing in this list. Now first.
- A2: BUGS.md regression suite scaffolding. Stops the manual BUGS retest cycles.
- A4 + A5: property-based + codegen snapshots. Independent; either order.
Tier B and C defer until A1–A5 are paying off.
Reference: dhis2-java-client
Apache-2.0 Java client maintained by the DHIS2 org (dhis2/dhis2-java-client). Targeted comparison against this workspace as of this writing:
Already covered here
- Typed /api/sharing: Sharing, SharingBuilder, ACCESS_* constants, apply_sharing / get_sharing helpers. Full parity with the Java client's sharing builder.
- User administration: d2w user list / get / me / invite / reinvite / reset-password. User-group + user-role plugins covering membership + authority-bundle flows.
- Branding / theming: d2w customize logo-front / banner / style / set / apply / show + Dhis2Client.customize accessor. No equivalent in the Java client.
- Auth providers (Basic, PAT, OAuth2); ours is async-first with a typed AuthProvider Protocol.
- Generated resource CRUD across v41, v42, v43 (Java is hand-maintained).
- WebMessageResponse envelope parsing; .import_count(), .conflicts(), .rejected_indexes(), .task_ref(), .created_uid().
- Full metadata query surface; repeatable --filter, --order, rootJunction=AND|OR, --page / --page-size, --all, --translate / --locale, every fields selector form.
- Metadata bundle export / import / diff + RFC 6902 patch with per-resource filters + dangling-reference warning on export.
- Paging; list_raw(..., paging=True) returns the pager; list(..., paging=False) walks the full catalog.
- Typed filter values on enum fields; ValueType.NUMBER is a StrEnum, substitutable into filter strings directly.
- Client-side UID generation; matches the Java CodeGenerator algorithm exactly.
- Typed tracker writes; TrackerBundle + TrackerTrackedEntity / TrackerEnrollment / TrackerEvent models for POST /api/tracker.
- Event + enrollment analytics; outlier detection + tracked-entity analytics.
Considered, not adopted
- Fluent query builder (.addFilter(Filter.eq("name", "Penta"))): the Java client wraps DHIS2's property:operator:value string syntax in a chainable builder. Deliberately skipped: Python f-strings make f"name:like:{name}" already readable; the builder doesn't buy type safety on the stringly-typed value side; DHIS2's own docs teach the string form.
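To make the f-string argument concrete, here is a small sketch of filters built as plain strings and passed as repeatable query parameters. The filter_params helper is hypothetical and exists only for this illustration; it is not part of the client, and the valueType filter is an invented example.

```python
from urllib.parse import urlencode


def filter_params(*filters: str, root_junction: str = "AND") -> list[tuple[str, str]]:
    """Turn property:operator:value strings into repeatable query params."""
    params = [("filter", f) for f in filters]
    params.append(("rootJunction", root_junction))
    return params


name = "Penta"
params = filter_params(f"name:like:{name}", "valueType:eq:NUMBER", root_junction="OR")
query = urlencode(params)  # colons are percent-encoded as %3A
print(query)
```

The string form stays a one-liner at the call site, which is the readability point the bullet above makes against a chainable builder.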
Worth evaluating later (Java parity)
- Domain-specific response types beyond WebMessageResponse: Java has distinct PagedResponse, Stats, Response for different endpoint shapes. We collapse into WebMessageResponse + helpers. The OAS codegen already emits the specific shapes (TrackerImportReport, ImportReport, etc.); swap on-demand when a specific call site hits friction.
Beyond Java parity (already shipped)
Items that don't exist in the Java client and now exist here:
- Retry / backoff: RetryPolicy on Dhis2Client + open_client with exponential backoff, jitter, Retry-After honoured, idempotent-only by default.
- Library-level task awaiter: client.tasks.await_completion(task_ref, ...) + client.tasks.iter_notifications(...).
- Connection-pool tuning: http_limits kwarg on Dhis2Client and open_client.
- Typed codegen across three DHIS2 versions via schema-driven emission; Java is hand-maintained.
- OAS spec-patches framework: synthesises the Jackson discriminators DHIS2's OpenAPI generator omits (Route.auth et al.).
- Data-integrity streaming iterator: client.maintenance.iter_integrity_issues() yields a flat stream of IntegrityIssueRows tagged with owning-check metadata.
- System metadata cache: TTL-bounded in-memory cache on client.system for info() / default_category_combo_uid() / setting(key).
- Bulk metadata delete: client.metadata.delete_bulk(resource_type, uids) + delete_bulk_multi({...}).
- Cross-resource metadata search: client.metadata.search(query) returns typed SearchResults grouped by resource; handles UID / partial UID / code / name in one verb.
- Typed Visualization + Map + Dashboard builders: VisualizationSpec, MapSpec + MapLayerSpec, DashboardSlot. Chart-type-aware dimension placement, typed data dimensions (DEs + indicators), RelativePeriod enum for rolling windows, legend-set support. CLI + MCP surfaces on top.
- Per-viz + per-dashboard PNG capture: d2w browser viz screenshot + d2w browser dashboard screenshot + d2w browser map screenshot, Chromium-driven via Playwright session helpers.
- Typed bulk-save on every generated resource: client.resources.<resource>.save_bulk(items). Supports import_strategy + atomic_mode + dry_run.
- client.metadata.dry_run(by_resource): cross-resource importMode=VALIDATE entry point.
- Streaming analytics export: client.analytics.stream_to(destination, *, params, endpoint="/api/analytics.json").
- Messaging plugin: d2w messaging {list,get,send,reply,mark-read,mark-unread,delete} + messaging_* MCP tools + client.messaging accessor.
- Validation + predictors workflow: d2w maintenance validation {run,result,validate-expression,send-notifications} + d2w maintenance predictors run.
- Streaming dataValueSets import: client.data_values.stream(source, content_type=...).
- Multi-instance metadata diff: d2w metadata diff-profiles exports two profiles concurrently + diffs them.
- Files plugin: CLI + MCP + client.files accessor over /api/documents + /api/fileResources.
- SQL views runner: client.sql_views + d2w metadata sql-views {list, get, execute, refresh, adhoc}.
- Tracker authoring workflows: d2w tracker {register, enroll, add-event, outstanding} verbs + the matching client.tracker helpers for operator flows beyond generic CRUD.
- Rich conflict renderer: d2w metadata import / d2w data aggregate import render /api/metadata and /api/dataValueSets error envelopes as a normalised ConflictRow table (object UID → offending property → server message).
- Apps plugin: d2w apps {list, add, remove, update, update --all, reload, snapshot, restore, hub-list, hub-url} + apps_* MCP tools + client.apps accessor over /api/apps and /api/appHub. update --all --dry-run previews available hub updates before installing; bundled core apps update in place. hub-list --search filters the catalog client-side. hub-url read/writes the keyAppHubUrl system setting so self-hosted hubs can be wired via CLI. snapshot --output pins an instance's app inventory to a portable JSON manifest; restore <manifest> reinstalls every hub-backed entry via install_from_hub, with a --dry-run preview that mirrors update --all --dry-run.
- Metadata cross-instance merge: d2w metadata merge <source-profile> <target-profile> --resource ... [--dry-run] orchestrates export+import in one pass, returning typed per-resource export counts plus the target import's WebMessageResponse. Pairs with diff-profiles (same resource+filter shape): diff to preview, merge to apply. Sharing blocks are stripped by default to avoid false-positive conflicts from per-instance user/group UIDs. Conflicts on the dry-run and applied paths render through the shared ConflictRow Rich table used by metadata import (#177), so preview output is immediately actionable without reaching for --json | jq.
- Canonical X / XGroup / XGroupSet authoring triples: sub-apps under d2w metadata, one client accessor per resource, following a single canonical-naming rule (lowercase + hyphenate the DHIS2 resource path). Shipped for organisation-units (#174), data-elements (#175), indicators (#176), and category-options (#181), plus the program-indicators pair (#180; DHIS2 has no programIndicatorGroupSet). Each PR adds 15–19 MCP tools, full CLI verbs (list / get / create / rename-like / per-item membership), and hand-written accessors that return typed generated models. No *Spec builders; keyword args on the accessor instead (continues the spec-audit data point). The indicator accessor exposes validate_expression(context="indicator"), the program-indicator accessor validate_expression(context="program-indicator"), so callers can pre-flight numerator / denominator / expression references before a failed create. category-options additionally ships set_validity_window(uid, start_date, end_date) for the validity-window knob unique to that resource.
- Aggregate data-set authoring: d2w metadata data-sets + sections sub-apps (#185). DataSetElement + Section.dataElements[] are handled as join tables with round-trip helpers: add_element(ds_uid, de_uid, category_combo_uid=...) carries the per-set CC override; sections.reorder(section_uid, [de_uids]) replaces the ordered DE list in one PUT. Docstring calls out the DSE self-ref strip for DHIS2's read/write asymmetry.
- Validation-rule + predictor CRUD: d2w metadata validation-rules + predictors + their groups (#186). Closes the author-then-run gap: d2w maintenance validation run / predictors run shipped long ago, but rules + predictors themselves couldn't be authored from CLI. Surface assembles leftSide / rightSide / generator Expression sub-objects from plain kwargs.
- Bulk RFC 6902 patch: client.metadata.patch_bulk(resource, [(uid, ops), ...], concurrency=8) + patch_bulk_multi(...) (#187). Client-side fan-out under a semaphore; per-UID failures land in BulkPatchResult.failures (with uid / resource / status_code / message) instead of raising. Building block for future CLI-level bulk verbs.
- Bulk sharing: client.metadata.apply_sharing_bulk(resource_type, uids, sharing) + apply_sharing_bulk_multi(by_resource, sharing) fan out one SharingBuilder payload across many UIDs under a concurrency semaphore. CLI surface as d2w metadata share <type> [UID...] with --public-access / --user-access UID:access / --user-group-access UID:access (repeatable) + stdin UID input via - so metadata list ... | jq -r .id | xargs metadata share composes. Per-UID failures land in BulkSharingResult.failures with the same row-level table renderer used by rename / retag.
- Category dimension authoring (complete end-to-end): d2w metadata categories (#205) + category-combos + read-only category-option-combos (#208) + the one-pass CategoryComboBuilder helper (#209). Categories accept ordered --option UID flags on create + per-item add-option / remove-option shortcuts. CategoryCombos accept ordered --category UID flags + a wait-for-cocs --expected N matrix-poll barrier handling DHIS2's async COC regeneration (cold-start can take tens of seconds, especially under arm64 emulation). The category-combos build --spec FILE verb walks a declarative CategoryComboBuildSpec (JSON or stdin) and ensures every CategoryOption -> Category -> CategoryCombo exists; idempotent, returning a typed CategoryComboBuildResult with per-layer created-vs-reused breakdown.
- Bundle-source metadata merge: d2w metadata merge-bundle <target> <bundle.json> (#206) imports a saved JSON bundle into a target profile. Sibling to the source-profile merge verb; same --strategy / --atomic / --include-sharing / --dry-run knobs. Useful when the bundle came from a saved metadata export, was hand-crafted, or was produced by a non-DHIS2 tool. MergeResult.source_base_url is bundle:<path> for traceability.
- Tracker-schema authoring (complete end-to-end): d2w metadata tracked-entity-attributes + tracked-entity-types (#188) covers the leaf resources; tracked-entity-types add-attribute --mandatory --searchable round-trips the TETA join table. d2w metadata programs {list, get, create, rename, add-attribute, remove-attribute, add-to-ou, remove-from-ou, delete} (#189) covers the middle layer: WITH_REGISTRATION / WITHOUT_REGISTRATION program flavours, PTEA enrollment form linkage, per-item OU shortcuts. d2w metadata program-stages {list, get, create, rename, add-element, remove-element, reorder, delete} (#194) covers the inner layer: each stage's ordered programStageDataElements[] join table with compulsory / displayInReports / allowFutureDate / allowProvidedElsewhere flags. Documents the DHIS2 mergeMode=REPLACE requirement on Program + ProgramStage PUT (nested-list removal is additive without it) as a typed client-side workaround.
- Codegen + base-client gap closure (#190–#192, #197). Generated create(item, *, merge_mode, import_strategy, skip_sharing, skip_translation) + update(item, ...) forward the write-flag query params. Every generated resource exposes add_collection_item(parent_uid, collection, item_uid) / remove_collection_item(...) for per-item POST/DELETE shortcuts. Base Dhis2Client ships typed post(path, body, model=T) + put(path, body, model=T) wrappers (parallels the existing typed get). Hand-written accessor sweep across 28+24+10 files replaced _put_with_replace / per-item loops / duplicated _uid_from_webmessage helpers / single-object get_raw + model_validate / paged list_all via resources.X.list(...) with the new surface. ~700 lines of duplication removed with no behavior change.
- d2w metadata rename + metadata retag verbs: bulk CLI verbs on top of client.metadata.patch_bulk (#195, #199, #200). rename handles label-field add / strip prefix + suffix (idempotent both directions: won't double-apply, won't no-op-fail). retag handles ref-field rewrites (categoryCombo, optionSet, legendSets) + enum field rewrites (aggregationType, domainType). Both take --filter (repeatable, same DSL as metadata list) + --dry-run + --concurrency. Per-UID failures land in the shared ConflictRow renderer used by metadata import, so operators see row-level detail on partial failures.
- CI coverage gate + failure threshold (#196, #202). make coverage replaces make test in the CI test step; every run uploads coverage.xml as an artifact retained 14 days, and fails the build if coverage drops under 70%. Current baseline 73% (85k statements / 7.5k branches).
- Playwright-driven OIDC login: d2w profile login --no-browser prints the auth URL for copy-paste; dhis2w_browser.drive_oauth2_login(profile, user, pw) drives the full flow via Chromium (React login → Spring AS consent → loopback redirect) for CI + headless use cases. examples/v42/cli/profile_oidc_login.sh + examples/v42/client/oidc_login.py auto-dispatch to the Playwright path when DHIS2_USERNAME / DHIS2_PASSWORD are in env.
- Predictor + validation seed fixtures: the Sierra Leone play42 snapshot now ships 2 BCG predictors (avg + sum over 3-month windows) + a PredictorGroup + 2 output DEs, plus 2 BCG validation rules + a ValidationRuleGroup that reliably produce violations. d2w maintenance predictors run --group and d2w maintenance validation run --group have concrete targets out of the box.
- Interactive CLI pickers: d2w profile default launches an arrow-key menu via questionary.
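The bulk patch and bulk sharing items in the list above both describe a client-side fan-out under a semaphore, with per-UID failures collected instead of raised. A minimal sketch of that pattern follows. It is not the workspace implementation: BulkResult, fan_out, and fake_patch are hypothetical names, and the real BulkPatchResult carries more fields than the plain message kept here.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Awaitable, Callable


@dataclass
class BulkResult:
    succeeded: list[str] = field(default_factory=list)
    failures: dict[str, str] = field(default_factory=dict)  # uid -> error message


async def fan_out(
    uids: list[str],
    apply: Callable[[str], Awaitable[None]],
    concurrency: int = 8,
) -> BulkResult:
    """Run apply(uid) for every UID with at most `concurrency` in flight."""
    semaphore = asyncio.Semaphore(concurrency)
    result = BulkResult()

    async def worker(uid: str) -> None:
        async with semaphore:
            try:
                await apply(uid)
            except Exception as exc:  # record the failure instead of raising
                result.failures[uid] = str(exc)
            else:
                result.succeeded.append(uid)

    await asyncio.gather(*(worker(uid) for uid in uids))
    return result


async def fake_patch(uid: str) -> None:
    """Hypothetical stand-in for one PATCH request."""
    await asyncio.sleep(0)
    if uid.startswith("bad"):
        raise ValueError(f"409 conflict on {uid}")


result = asyncio.run(fan_out(["abc", "bad1", "def"], fake_patch, concurrency=2))
print(sorted(result.succeeded), result.failures)
```

Collecting failures per UID lets a caller render one row per failed object, which is what the shared row-level renderer mentioned above relies on.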
Beyond Java parity (not yet)
(Empty — major Java-parity gaps are closed.)
Explicit non-goals
- Python < 3.13. New typing features (StrEnum, TypeAliasType, PEP 604 unions, PEP 695 generics) justify the bump.
- DHIS2 outside v41 / v42 / v43. Older DHIS2 majors and unreleased ones aren't on the support matrix; every backport fork splits the code with no deployed users to justify the split.
- Flask / argparse / raw stdio MCP loops / hand-rolled TOML parsers; every slot has a chosen standard per the CLAUDE.md hard-requirements list in the repo root.
- A second filter DSL layered on top of DHIS2's property:operator:value string syntax. See the dhis2-java-client comparison above for the rationale.
- Synchronous client variant. async throughout is a hard requirement.
- dict[str, Any] crossing module boundaries. CLAUDE.md hard rule; enforced workspace-wide as of the typing sweep (#71-#74, #76). New code that proposes dict-in-signature needs explicit justification referencing a specific HTTP-boundary carveout.
- d2w program-rule trace / rule simulator: explicitly declined.
How this file gets updated
Greenfield voice; edits describe the current state of the plan, not its history. When a near-term item ships, delete it from the "near-term" list (don't rewrite to "already shipped"). Use the PR's own description for the history; this file is always about what's next.