Contributing

The full contribution guide lives in the repository’s CONTRIBUTING.md. This page highlights the parts that matter most when you are improving the published docs or the API reference.

Typing Philosophy

We optimize for a great editor experience and a permissive runtime:

  • TypedDicts in etlplus/*/types.py (for example etlplus/api/types.py) are editor/type-checking hints. They are intentionally total=False (all keys optional) and are not enforced at runtime.

  • Constructors named *.from_obj accept Mapping[str, Any] and perform tolerant parsing and light casting. This keeps runtime permissive while improving autocomplete and static analysis.

  • Prefer Mapping[str, Any] for inputs and plain dict[...] for internal state/returns. Avoid tight coupling to simple alias types.

  • Use Python 3.13 conveniences when helpful: Self return type in classmethods, dict union operators (|, |=), and modern typing.

  • Provide @overload signatures to narrow inputs (e.g., str vs Mapping; or specific TypedDict shapes). Import these shape types only under TYPE_CHECKING to avoid runtime import cycles.

  • Keep behavior backward compatible and permissive: do not reject unknown keys; pass through provider-specific blocks as Mapping[str, Any].

  • Tests: add small, focused unit tests for new constructors and merge logic (happy path plus 1–2 edge cases).
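As a minimal sketch of the first point, the snippet below defines a total=False TypedDict and passes it a loosely shaped dict. The names ExampleConfigMap and describe, and the keys shown, are hypothetical and are not taken from etlplus/api/types.py.

```python
from typing import TypedDict


class ExampleConfigMap(TypedDict, total=False):
    """Hypothetical editor-facing shape; every key is optional."""

    name: str
    timeout: float
    headers: dict[str, str]


def describe(cfg: ExampleConfigMap) -> str:
    """Summarize a config mapping without assuming any key is present."""
    return f"{cfg.get('name', '<unnamed>')} ({len(cfg)} keys)"


# A partial mapping satisfies the shape because total=False.
partial: ExampleConfigMap = {"name": "users"}

# A TypedDict is a plain dict at runtime, so nothing validates keys or
# value types. A type checker may flag the extra key; the runtime does not.
raw = {"name": "users", "vendor_block": {"x": 1}}
summary = describe(raw)
```

The shape improves autocomplete and static analysis, while the runtime path stays permissive about missing and unknown keys.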

Example (type-only import + overload):

from collections.abc import Mapping
from typing import TYPE_CHECKING, Any, Self, overload

if TYPE_CHECKING:
    # Imported for type checkers only, so there is no import cycle at runtime.
    from .types import ExampleConfigMap

class ExampleConfig:
    # Type checkers expect @overload above @classmethod, not below it.
    @overload
    @classmethod
    def from_obj(cls, obj: 'ExampleConfigMap') -> Self: ...

    @overload
    @classmethod
    def from_obj(cls, obj: Mapping[str, Any]) -> Self: ...

    @classmethod
    def from_obj(cls, obj: Mapping[str, Any]) -> Self:
        if not isinstance(obj, Mapping):
            raise TypeError('ExampleConfig must be a mapping')
        # parse fields permissively here
        ...

Documentation Style

Use NumPy-style docstrings for public APIs.

  • Keep full sections for public functions and methods: Parameters, Returns, and Raises when applicable.

  • Do not collapse public API docstrings to one-liners solely because the function is a thin delegator.

  • For deprecated wrappers, include an explicit deprecation sentence near the top, while still keeping the full structured sections.

  • For etlplus/file module-level wrappers (read/write) specifically, preserve full Parameters and Returns sections.

Local Quality Gates

ETLPlus maintains a single supported local quality workflow for contributors. Run:

make fmt
make lint
make doclint
make typecheck
make test

For local Git hooks, ETLPlus uses a staged workflow:

pre-commit install --install-hooks

  • pre-commit runs the fast hygiene and auto-fix hooks before a commit is created.

  • commit-msg validates the commit message format.

  • pre-push runs make check-pre-push, which is the mandatory local push gate for linting, docstring linting, type-checking, and the default non-perf test suite before a branch is pushed.

  • manual includes make check-ci-local, an opt-in local CI-parity pass for contributors who want a heavier pre-PR run covering the default non-perf CI-parity suite, non-HTML docs builds, and distribution validation.

You can invoke the heavier local CI-parity path explicitly with:

pre-commit run make-check-ci-local --hook-stage manual

Or directly through make:

make check-ci-local

Tooling roles:

  • Ruff is the authoritative source-code lint gate for imports, pyupgrade-style rewrites, and core static checks.

  • .ruff.toml is the canonical source for the Python line-length policy, currently 88 characters.

  • Any duplicated width setting in supporting tooling must match .ruff.toml.

  • autopep8 is retained only as a compatibility formatter for the existing make fmt, CI, and pre-commit workflow.

  • pydocstyle and pydoclint remain separate because they enforce the NumPy-style public-docstring contract rather than general source formatting.

  • mypy remains the type gate for the shipped package surface.

  • ETLPlus does not maintain separate Black or Flake8 contributor flows.

  • If an external editor or integration still invokes Flake8, use the repository .flake8 file only as a compatibility shim for overlapping basics such as line length and excludes; Ruff remains canonical.

Testing

Scope and Intent

Use these guidelines to decide where tests live and how to label intent.

  • Meta tests (put under tests/meta/):

    • Guard repository conventions, public-surface promises, documentation tables, and test-layout policy.

    • Keep them fast and deterministic; they should inspect source, docs, metadata, or test structure rather than product runtime behavior.

    • Use the test_m_*.py naming convention.

  • Unit tests (put under tests/unit/):

    • Exercise a single function or class directly in isolation (no orchestration across modules).

    • Avoid network I/O and file system access outside temporary directories; use tmp_path for local files and stubs/mocks for external calls.

    • Keep them fast and deterministic; rely on monkeypatch to stub collaborators.

    • Use the test_u_*.py naming convention.

    • Examples in this repo:

      • Small helpers in etlplus.utils

      • Validation and transform functions

  • Integration tests (put under tests/integration/):

    • Exercise end-to-end flows across modules and boundaries.

    • May use CLI argv and temporary files/directories; stub the network with fakes or mocks.

    • Use the test_i_*.py naming convention.

    • Examples in this repo:

      • CLI main() end-to-end

      • run() pipeline orchestration

      • File connectors

      • API client pagination wiring/strategy

      • Runner defaults for pagination/rate limits

      • Target URL composition

  • E2E tests (put under tests/e2e/):

    • Validate full system-boundary workflows.

    • Keep slower, higher-scope checks here.

    • Use the test_e_*.py naming convention.

  • Smoke tests are an intent marker, not a scope folder:

    • Place smoke tests in tests/unit/, tests/integration/, or tests/e2e/ based on scope.

    • Mark them with @pytest.mark.smoke.

    • tests/smoke/ is a transitional legacy path during migration.

If a test calls etlplus.cli.main() or etlplus.ops.run.run(), it is integration by default.
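For contrast with the integration default above, a unit test in the sense described here might look like the sketch below. The helper count_records is hypothetical and stands in for a small function under test; in the repository the test would live under tests/unit/ in a test_u_*.py module and carry the unit marker.

```python
from pathlib import Path


def count_records(path: Path) -> int:
    """Hypothetical helper under test: count non-blank lines in a file."""
    with path.open(encoding="utf-8") as handle:
        return sum(1 for line in handle if line.strip())


def test_count_records_skips_blank_lines(tmp_path: Path) -> None:
    # tmp_path is pytest's built-in per-test temporary directory fixture,
    # so nothing is written to the repository tree.
    source = tmp_path / "rows.txt"
    source.write_text("a\n\nb\n", encoding="utf-8")

    assert count_records(source) == 2
```

The test calls one function directly, touches only a temporary directory, and asserts a single behavior.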

Test Configuration

  • Each test folder should include a conftest.py for shared fixtures.

  • Use scope markers (meta, unit, integration, e2e) and intent markers (smoke, contract) from pytest.ini.

  • Add @pytest.mark.smoke and @pytest.mark.contract directly on modules/tests where intent applies.

  • Markers are declared in pytest.ini. Avoid introducing ad-hoc markers without adding them there.

  • For optional dependencies, prefer pytest.importorskip("module") so tests skip cleanly when the extra isn’t installed.

  • The default install includes the dependencies used by the built-in file handlers that are covered by the default test matrix.

  • For full local/CI coverage across the remaining scientific and specialty optional formats, run pip install -e ".[dev,file]".

Running Tests

Common commands:

  • Install dependencies:

    • pip install -e ".[dev]" (lightweight local development)

    • pip install -e ".[dev,file]" (CI parity, including the remaining optional formats)

    • uv sync --locked --extra dev (locked contributor environment from uv.lock)

    • uv sync --locked --extra dev --extra file (locked optional-format CI parity)

  • Run everything:

    • pytest

    • make test-full (installs .[dev,file] and runs pytest)

  • Run a specific suite:

    • pytest -m meta

    • pytest -m unit

    • pytest -m integration

    • pytest -m e2e

    • pytest -m smoke

    • pytest -m contract

  • Run a specific file or test:

    • pytest tests/meta/test_m_contract_readme.py

    • pytest tests/unit/file/test_u_file_core.py

    • pytest tests/unit/file/test_u_file_core.py::TestFile::test_roundtrip_by_format

  • Run by keyword:

    • pytest -k "roundtrip"

Common Patterns

  • CLI tests: monkeypatch sys.argv and call etlplus.cli.main(); capture output with capsys.

  • File I/O: use tmp_path or TemporaryDirectory(); never write to the repo tree.

  • API flows: stub EndpointClient or transport layer via monkeypatch to avoid real HTTP.

  • Runner tests: monkeypatch load_config to inject an in-memory Config.

  • Keep tests small and focused; prefer one behavior per test with clear assertions.
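The CLI pattern from the list above can be sketched as follows. The main function here is a stand-in written for this example, and the --ping flag is hypothetical; the real etlplus.cli.main() and its arguments are not shown.

```python
import sys


def main() -> int:
    """Stand-in for a CLI entry point that reads its arguments from sys.argv."""
    args = sys.argv[1:]
    if args == ["--ping"]:
        print("pong")
        return 0
    print("unknown arguments", file=sys.stderr)
    return 2


def test_main_ping(monkeypatch, capsys) -> None:
    # Replace argv for this test only; pytest restores it afterwards.
    monkeypatch.setattr(sys, "argv", ["etlplus", "--ping"])

    exit_code = main()

    # capsys collects everything written to stdout and stderr so far.
    captured = capsys.readouterr()
    assert exit_code == 0
    assert captured.out.strip() == "pong"
    assert captured.err == ""
```

Both monkeypatch and capsys are built-in pytest fixtures, so the test needs no extra setup and leaves sys.argv untouched for other tests.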

Docs build parity

pip install -e ".[dev,file,docs]"
make docs-strict

Community and support