airtunnel.operators package

Submodules

airtunnel.operators.archival module

Airtunnel operators for archival tasks.

class airtunnel.operators.archival.DataAssetArchiveOperator(asset: airtunnel.data_asset.BaseDataAsset, *args, **kwargs)

Prepares a new archive for a data asset and returns the name of the archive folder.

class airtunnel.operators.archival.IngestArchiveOperator(asset: airtunnel.data_asset.BaseDataAsset, *args, **kwargs)

Simply moves files from ingest/landing to ingest/archive.

airtunnel.operators.ingestion module

Airtunnel operators for ingestion tasks.

class airtunnel.operators.ingestion.IngestOperator(asset: airtunnel.data_asset.BaseDataAsset, metadata_adapter: airtunnel.metadata.adapter.BaseMetaAdapter = None, *args, **kwargs)

Airtunnel’s ingestion operator. First inspects source files (size, creation & modification dates) and logs metadata using Airtunnel’s MetaAdapter. Then moves (“ingests”) these files to the staging/pickedup directory for the data asset at hand.

If the staging/pickedup directory of the data store is not empty (indicating a previous ingestion job has failed), this operator will also fail and report the problem, to avoid data loss.

The list of files to inspect & ingest is retrieved using Airflow XCOM data, which an airtunnel.sensors.ingestion.SourceFileIsReadySensor is expected to have provided beforehand.
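The pre-flight check and ingest flow described above can be sketched in plain Python. This is a minimal sketch, not Airtunnel’s actual implementation: the function name, the metadata dictionaries (standing in for the MetaAdapter calls), and the directory arguments are all assumptions.

```python
from pathlib import Path


def inspect_and_ingest(landing_dir: str, pickedup_dir: str, files: list) -> list:
    """Sketch of the ingest flow: refuse to run if a previous job left
    files behind, then collect file metadata and move files to pickedup."""
    pickedup = Path(pickedup_dir)
    pickedup.mkdir(parents=True, exist_ok=True)

    # Fail fast if staging/pickedup is not empty -- it indicates a
    # previous ingestion job failed, and overwriting could lose data.
    if any(pickedup.iterdir()):
        raise RuntimeError(f"{pickedup} is not empty; a previous ingestion may have failed")

    inspected = []
    for name in files:
        src = Path(landing_dir) / name
        stat = src.stat()
        # Record size and modification time (stand-in for the MetaAdapter log).
        inspected.append({"file": name, "size": stat.st_size, "mtime": stat.st_mtime})
        src.rename(pickedup / name)  # the actual "ingest" move
    return inspected
```

In the real operator, the `files` list would come from XCOM data pushed by the sensor, and the inspected metadata would be persisted through the configured MetaAdapter.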

airtunnel.operators.loading module

Airtunnel operators for loading tasks (i.e. moving data from staging/ready to ready in the data store).

class airtunnel.operators.loading.StagingToReadyOperator(asset: airtunnel.data_asset.BaseDataAsset, *args, **kwargs)

Airtunnel’s StagingToReadyOperator – moves staged files (from staging/ready) to ready for a data asset and writes load status metadata.
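The move itself amounts to swapping the asset’s ready directory for the freshly staged one. A minimal sketch under assumed directory names, and without the load-status metadata write the real operator performs:

```python
import shutil
from pathlib import Path


def promote_staging_to_ready(staging_ready: str, ready: str) -> None:
    """Replace the ready directory with the staged one in two steps.
    Not atomic across filesystems -- a real implementation would also
    record load-status metadata so a partial move can be detected."""
    ready_path = Path(ready)
    if ready_path.exists():
        shutil.rmtree(ready_path)  # drop the previous version of the asset
    shutil.move(staging_ready, ready)  # promote the staged data
```

The remove-then-move sequence is why recording load status matters: a failure between the two steps leaves the asset without a ready version, which the metadata makes detectable.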

airtunnel.operators.transformation module

Airtunnel operators for transformation tasks.

class airtunnel.operators.transformation.PandasTransformationOperator(asset: airtunnel.data_asset.PandasDataAsset, *args, **kwargs)

Bases: airflow.models.baseoperator.BaseOperator

Airtunnel’s transformation operator for PandasDataAssets, calling rebuild_for_store() on them.

execute(context)

Execute this operator using Airflow.

ui_color = '#ffff00'
class airtunnel.operators.transformation.PySparkTransformationOperator(asset: airtunnel.data_asset.PySparkDataAsset, *args, **kwargs)

Bases: airflow.models.baseoperator.BaseOperator

Airtunnel’s transformation operator for PySparkDataAssets, calling rebuild_for_store() on them.

execute(context)

Execute this operator using Airflow.

ui_color = '#ffff00'
class airtunnel.operators.transformation.SQLTransformationOperator(asset: airtunnel.data_asset.SQLDataAsset, parameters: dict = None, dynamic_parameters: Callable = None, *args, **kwargs)

Bases: airflow.models.baseoperator.BaseOperator

Airtunnel’s transformation operator for SQLDataAssets, calling rebuild_for_store() on them.

Can be customized with static and dynamic parameters at DAG definition/creation time, i.e. via the operator’s constructor.

execute(context)

Execute this operator using Airflow.

ui_color = '#ffff00'
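Static parameters are fixed when the DAG file is parsed, while dynamic parameters can be computed at execution time. A hedged sketch of how such a merge might look – the helper name and the callable’s signature are assumptions, not Airtunnel’s API:

```python
from typing import Callable, Optional


def resolve_sql_parameters(
    parameters: Optional[dict] = None,
    dynamic_parameters: Optional[Callable[[dict], dict]] = None,
    context: Optional[dict] = None,
) -> dict:
    """Merge DAG-definition-time (static) parameters with ones computed
    at execution time; dynamic values win on key collisions."""
    resolved = dict(parameters or {})
    if dynamic_parameters is not None:
        # The callable receives the task context, so it can derive values
        # such as the execution date at runtime.
        resolved.update(dynamic_parameters(context or {}))
    return resolved
```

Usage: passing `parameters={"schema": "prod"}` together with a `dynamic_parameters` callable that reads the execution date from the context yields one parameter dict for templating the asset’s SQL.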

Module contents

Package for Airtunnel’s custom operators.

class airtunnel.operators.Colours

Bases: object

Custom colours used for the Airtunnel operators.

archival = '#85d8ff'
ingestion = '#aeffae'
loading = '#ffb3b1'
transformation = '#ffff00'