etlplus.ops.transform


Helpers to filter, map/rename, select, sort, aggregate, and otherwise transform JSON-like records (dicts and lists of dicts).

Operation keys in the pipeline may be given either as string names (e.g., "filter") or as members of the PipelineStep enum. Operator and aggregate specs may be given as strings (aliases are accepted), as members of the corresponding OperatorName / AggregateName enums, or as callables.
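
As a rough illustration of how a spec given as a string, an enum member, or a callable might be normalized to one comparison function, here is a minimal standalone sketch. It does not import etlplus: the OperatorName class below is a local stand-in, and the alias table and the resolve_operator helper are hypothetical names chosen for this example, not the library's implementation.

```python
import operator
from enum import Enum


class OperatorName(Enum):
    """Local stand-in for the library enum; the member set is assumed."""
    EQ = 'eq'
    GTE = 'gte'
    LT = 'lt'


# Hypothetical lookup tables, for illustration only.
_FUNCTIONS = {
    OperatorName.EQ: operator.eq,
    OperatorName.GTE: operator.ge,
    OperatorName.LT: operator.lt,
}
_ALIASES = {'==': OperatorName.EQ, '>=': OperatorName.GTE, '<': OperatorName.LT}


def resolve_operator(spec):
    """Return a two-argument comparison function for any supported spec."""
    if isinstance(spec, OperatorName):
        return _FUNCTIONS[spec]
    if isinstance(spec, str):
        key = spec.strip().lower()
        # Try the alias table first, then fall back to the enum value.
        name = _ALIASES.get(key) or OperatorName(key)
        return _FUNCTIONS[name]
    if callable(spec):
        # Callables are used as-is.
        return spec
    raise TypeError(f'unsupported operator spec: {spec!r}')


is_adult = resolve_operator('gte')
print(is_adult(21, 18))  # True
```

Normalizing once, before iterating over records, keeps the per-record work down to a single function call.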

Examples

Basic pipeline with strings:

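from etlplus.ops.transform import transform

# Illustrative sample records
data = [
    {'first_name': 'Ann', 'age': 30},
    {'first_name': 'Bob', 'age': 16},
]
ops = {
    'filter': {'field': 'age', 'op': 'gte', 'value': 18},
    'map': {'first_name': 'name'},
    'select': ['name', 'age'],
    'sort': {'field': 'name'},
    'aggregate': {'field': 'age', 'func': 'avg', 'alias': 'avg_age'},
}
result = transform(data, ops)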

Using enums for keys and functions:

from etlplus.ops import (
    PipelineStep,
    OperatorName,
    AggregateName,
)
from etlplus.ops.transform import transform
ops = {
    PipelineStep.FILTER: {
        'field': 'age', 'op': OperatorName.GTE, 'value': 18
    },
    PipelineStep.AGGREGATE: {
        'field': 'age', 'func': AggregateName.AVG
    },
}
result = transform(data, ops)

Functions

apply_aggregate(records, operation)

Aggregate a numeric field, or count the records in which the field is present.
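
The aggregation step can be pictured with the following standalone sketch. It is not the library's code: sketch_aggregate, the set of function names, the default result key, and the single-entry dict return shape are all assumptions made for illustration.

```python
from statistics import mean

# Hypothetical name-to-function table; the real set of aggregates may differ.
_AGGREGATES = {'sum': sum, 'avg': mean, 'min': min, 'max': max}


def sketch_aggregate(records, operation):
    """Aggregate one field across records and return {result_key: value}."""
    field = operation['field']
    func = operation['func']
    key = operation.get('alias') or f'{func}_{field}'
    if func == 'count':
        # "Count presence": how many records contain the field at all.
        return {key: sum(1 for r in records if field in r)}
    # Keep only real numbers; bool is excluded even though it subclasses int.
    values = [
        r[field] for r in records
        if isinstance(r.get(field), (int, float))
        and not isinstance(r.get(field), bool)
    ]
    # No numeric values means there is nothing to aggregate.
    return {key: _AGGREGATES[func](values) if values else None}


rows = [{'age': 20}, {'age': 30}, {'name': 'x'}]
print(sketch_aggregate(rows, {'field': 'age', 'func': 'avg', 'alias': 'avg_age'}))
```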

apply_filter(records, condition)

Filter a list of records by a simple condition.
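
A condition of the form {'field': ..., 'op': ..., 'value': ...}, as used in the examples above, could be evaluated roughly as follows. This is a standalone sketch, not the library's implementation: sketch_filter and the operator table are hypothetical, and skipping records that lack the field or cannot be compared is an assumed policy.

```python
import operator

# Hypothetical operator names; only 'gte' appears in the documentation above.
_OPS = {
    'eq': operator.eq, 'ne': operator.ne,
    'gt': operator.gt, 'gte': operator.ge,
    'lt': operator.lt, 'lte': operator.le,
}


def sketch_filter(records, condition):
    """Keep the records whose field satisfies the condition."""
    field = condition['field']
    compare = _OPS[condition['op']]
    value = condition['value']
    kept = []
    for record in records:
        if field not in record:
            continue  # assumed: records without the field never match
        try:
            if compare(record[field], value):
                kept.append(record)
        except TypeError:
            continue  # assumed: incomparable values are treated as no match
    return kept


people = [{'name': 'Ann', 'age': 17}, {'name': 'Bob', 'age': 42}, {'name': 'Cy'}]
print(sketch_filter(people, {'field': 'age', 'op': 'gte', 'value': 18}))
```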

apply_map(records, mapping)

Map/rename fields in each record.
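
The pipeline example above uses {'first_name': 'name'} and then selects 'name', which suggests the mapping reads as old name to new name. Under that assumption, the renaming can be sketched in one comprehension. sketch_map is a hypothetical standalone helper, and passing unmapped fields through unchanged is also an assumption.

```python
def sketch_map(records, mapping):
    """Rename keys per mapping (old -> new); unmapped keys pass through."""
    return [
        {mapping.get(key, key): value for key, value in record.items()}
        for record in records
    ]


rows = [{'first_name': 'Ann', 'age': 30}]
print(sketch_map(rows, {'first_name': 'name'}))  # [{'name': 'Ann', 'age': 30}]
```

The input records are not mutated; each output record is a new dict.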

apply_select(records, fields)

Keep only the requested fields in each record.
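
Field selection amounts to projecting each record onto the requested keys. The standalone sketch below uses a hypothetical sketch_select helper and assumes that a requested field missing from a record is simply omitted; the library may handle that case differently.

```python
def sketch_select(records, fields):
    """Return new records containing only the requested fields."""
    return [
        {field: record[field] for field in fields if field in record}
        for record in records
    ]


rows = [{'name': 'Ann', 'age': 30, 'email': 'ann@example.com'}, {'name': 'Bob'}]
print(sketch_select(rows, ['name', 'age']))
```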

apply_sort(records, field[, reverse])

Sort records by a field.
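
Sorting by a field maps onto Python's built-in sorted with a key function. In this standalone sketch, sketch_sort is a hypothetical helper, and placing records that lack the field at the end regardless of direction is an assumed policy, not documented behavior.

```python
def sketch_sort(records, field, reverse=False):
    """Sort by field; records without the field go last in original order."""
    present = [record for record in records if field in record]
    missing = [record for record in records if field not in record]
    return sorted(present, key=lambda record: record[field], reverse=reverse) + missing


rows = [{'name': 'Cy'}, {'name': 'Ann'}, {'id': 1}, {'name': 'Bob'}]
print(sketch_sort(rows, 'name'))
```

Because sorted is stable, records with equal field values keep their relative input order.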

transform(source[, operations])

Transform data using optional filter/map/select/sort/aggregate steps.
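
To show how the optional steps could compose, the following standalone sketch chains them in a fixed filter, map, select, sort, aggregate order. Everything here is an assumption for illustration: sketch_transform is a hypothetical helper, the step order and the return shapes are guesses, and only string keys are handled. It does not reproduce the library's behavior.

```python
import operator
from statistics import mean

# Hypothetical lookup tables for this sketch.
_OPS = {'eq': operator.eq, 'gt': operator.gt, 'gte': operator.ge,
        'lt': operator.lt, 'lte': operator.le}
_AGGS = {'sum': sum, 'avg': mean, 'min': min, 'max': max, 'count': len}


def sketch_transform(source, operations=None):
    """Apply whichever steps are present, in a fixed (assumed) order."""
    # Accept a single record or a list of records.
    records = [source] if isinstance(source, dict) else list(source)
    ops = operations or {}
    if 'filter' in ops:
        cond = ops['filter']
        compare = _OPS[cond['op']]
        records = [r for r in records
                   if cond['field'] in r and compare(r[cond['field']], cond['value'])]
    if 'map' in ops:
        mapping = ops['map']  # assumed direction: old name -> new name
        records = [{mapping.get(k, k): v for k, v in r.items()} for r in records]
    if 'select' in ops:
        records = [{f: r[f] for f in ops['select'] if f in r} for r in records]
    if 'sort' in ops:
        spec = ops['sort']
        # This sketch expects every remaining record to contain the sort field.
        records = sorted(records, key=lambda r: r[spec['field']],
                         reverse=spec.get('reverse', False))
    if 'aggregate' in ops:
        agg = ops['aggregate']
        values = [r[agg['field']] for r in records if agg['field'] in r]
        key = agg.get('alias') or f"{agg['func']}_{agg['field']}"
        has_input = bool(values) or agg['func'] == 'count'
        return {key: _AGGS[agg['func']](values) if has_input else None}
    return records


data = [
    {'first_name': 'Ann', 'age': 30},
    {'first_name': 'Bob', 'age': 16},
    {'first_name': 'Cy', 'age': 40},
]
steps = {
    'filter': {'field': 'age', 'op': 'gte', 'value': 18},
    'map': {'first_name': 'name'},
    'select': ['name', 'age'],
    'sort': {'field': 'name', 'reverse': True},
}
print(sketch_transform(data, steps))
```

With no operations, the sketch returns the input records unchanged as a list; adding an 'aggregate' step collapses the result to a single-entry dict.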