airtunnel.operators package¶
Subpackages¶
Submodules¶
airtunnel.operators.archival module¶
Airtunnel operators for archival tasks.
-
airtunnel.operators.archival.
DataAssetArchiveOperator
(asset: airtunnel.data_asset.BaseDataAsset, *args, **kwargs)¶ Prepares a new archive for a data asset and returns the name of the folder.
-
airtunnel.operators.archival.
IngestArchiveOperator
(asset: airtunnel.data_asset.BaseDataAsset, *args, **kwargs)¶ Simply move from ingest/landing to ingest/archive
airtunnel.operators.ingestion module¶
Airtunnel operators for ingestion tasks.
-
airtunnel.operators.ingestion.
IngestOperator
(asset: airtunnel.data_asset.BaseDataAsset, metadata_adapter: airtunnel.metadata.adapter.BaseMetaAdapter = None, *args, **kwargs)¶ Airtunnel’s ingestion operator. First inspects source files (size, create & modification dates) and logs metadata using Airtunnel’s MetaAdapter. Then moves (“ingests”) these files to the staging/pickedup directory for the data asset at hand.
If the staging/pickedup directory of the data store is not empty (indicating a previous ingestion job has failed), this operator will also fail and report the problem, to avoid data loss.
The list of files to inspect & ingest is retrieved using Airflow XCOM data, which a
airtunnel.sensors.ingestion.SourceFileIsReadySensor
is expected to have provided beforehand.
airtunnel.operators.loading module¶
Airtunnel operators for loading (i.e. getting from staging/ready to ready in the data store) tasks.
-
airtunnel.operators.loading.
StagingToReadyOperator
(asset: airtunnel.data_asset.BaseDataAsset, *args, **kwargs)¶ Airtunnel’s StagingToReadyOperator – moves staged files (from staging/read) to ready for a data asset and write load status metadata.
airtunnel.operators.transformation module¶
Airtunnel operators for transformation tasks.
-
class
airtunnel.operators.transformation.
PandasTransformationOperator
(asset: airtunnel.data_asset.PandasDataAsset, *args, **kwargs)¶ Bases:
airflow.models.baseoperator.BaseOperator
Airtunnel’s transformation operator for PandasDataAssets, calling
rebuild_for_store()
on them.-
execute
(context)¶ Execute this operator using Airflow.
-
ui_color
= '#ffff00'¶
-
-
class
airtunnel.operators.transformation.
PySparkTransformationOperator
(asset: airtunnel.data_asset.PySparkDataAsset, *args, **kwargs)¶ Bases:
airflow.models.baseoperator.BaseOperator
Airtunnel’s transformation operator for PySparkDataAssets, calling
rebuild_for_store()
on them.-
execute
(context)¶ Execute this operator using Airflow.
-
ui_color
= '#ffff00'¶
-
-
class
airtunnel.operators.transformation.
SQLTransformationOperator
(asset: airtunnel.data_asset.SQLDataAsset, parameters: dict = None, dynamic_parameters: False = None, *args, **kwargs)¶ Bases:
airflow.models.baseoperator.BaseOperator
Airtunnel’s transformation operator for SQLDataAssets, calling
rebuild_for_store()
on them.Can be customized with static and dynamic parameters from the DAG definition/creation time, i.e. using the operator’s constructor.
-
execute
(context)¶ Execute this operator using Airflow.
-
ui_color
= '#ffff00'¶
-