Apache Airflow DAG

From GM-RKB

An Apache Airflow DAG is an automated workflow DAG used by Apache Airflow.



References

2021

  • "Do not perform data processing in DAG files."
    • QUOTE: ... Since DAGs are python-based, we will definitely be tempted to use pandas or similar stuff in DAG, but we should not. Airflow is an orchestrator, not an execution framework. All computation should be delegated to a specific target system. Follow the fire and track approach. Use the operator to start the task and the sensor to track the completion. Airflow is not designed for long-running tasks.
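The "fire and track" approach quoted above can be sketched in plain Python. This is only an illustration of the pattern, not Airflow code: `FakeTargetSystem`, `fire`, and `track` are hypothetical names, and the fake system simply reports completion after a fixed number of status checks. In a real DAG the `fire` role would be played by an operator and the `track` role by a sensor.

```python
import itertools


class FakeTargetSystem:
    """Hypothetical stand-in for an external engine that does the real work."""

    def __init__(self, polls_until_done: int = 3):
        self._polls_until_done = polls_until_done
        self._polls = {}                 # job_id -> number of status checks so far
        self._ids = itertools.count(1)   # simple job id generator

    def submit(self, query: str) -> int:
        """Start a job and return immediately with its id."""
        job_id = next(self._ids)
        self._polls[job_id] = 0
        return job_id

    def is_done(self, job_id: int) -> bool:
        """Report completion once the job has been checked enough times."""
        self._polls[job_id] += 1
        return self._polls[job_id] >= self._polls_until_done


def fire(system: FakeTargetSystem, query: str) -> int:
    """Operator role: hand the computation to the target system, do not run it here."""
    return system.submit(query)


def track(system: FakeTargetSystem, job_id: int, max_pokes: int = 10) -> int:
    """Sensor role: poll for completion and return the number of pokes needed."""
    for poke in range(1, max_pokes + 1):
        if system.is_done(job_id):
            return poke
        # A real sensor would wait for a poke interval here instead of looping at once.
    raise TimeoutError(f"job {job_id} did not finish within {max_pokes} pokes")


system = FakeTargetSystem(polls_until_done=3)
job_id = fire(system, "SELECT 1")   # the orchestrator only starts the job
pokes = track(system, job_id)       # and then watches for completion
```

The point of the split is that the orchestrating process never holds the data or the computation itself; it only starts the job and checks on it.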

2018

  • https://medium.com/@chandukavar/testing-in-airflow-part-1-dag-validation-tests-dag-definition-tests-and-unit-tests-2aa94970570c
    • QUOTE: ... In Airflow, a DAG– or a Directed Acyclic Graph – is a collection of all the tasks you want to run, organized in a way that reflects their relationships and dependencies. ...
    • DAG validation tests are common for all the DAGs in Airflow, hence we don’t need to write a separate test for each DAG. This test will check the correctness of each DAG. It will also check whether a graph contains cycle or not. Tests will fail even if we have a typo in any of the DAG. Moreover, if we want to enforce developers to add certain default arguments to each DAG, we can write a test around that as well. Here are few validation tests ...
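Two of the checks described in the quote, cycle detection and enforcing certain default arguments, can be sketched without Airflow as follows. This is a simplified stand-in, not the article's test code and not Airflow's own validation: the DAG is modelled as a plain dictionary from task name to downstream task names, and the helper names and the required keys (`owner`, `retries`) are assumptions chosen for illustration.

```python
from typing import Dict, Iterable, List


def has_cycle(deps: Dict[str, List[str]]) -> bool:
    """Return True if the task graph (task -> downstream tasks) contains a cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2   # unvisited, on the current path, fully explored
    color: Dict[str, int] = {}

    def visit(node: str) -> bool:
        color[node] = GRAY
        for nxt in deps.get(node, []):
            state = color.get(nxt, WHITE)
            if state == GRAY:      # back edge to a task on the current path
                return True
            if state == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and visit(n) for n in deps)


def missing_default_args(default_args: Dict[str, object],
                         required: Iterable[str]) -> List[str]:
    """Return the required keys (hypothetical policy) absent from default_args."""
    return sorted(set(required) - set(default_args))


# One generic test body can be applied to every DAG description.
valid_dag = {"extract": ["transform"], "transform": ["load"], "load": []}
broken_dag = {"extract": ["transform"], "transform": ["extract"]}

assert not has_cycle(valid_dag)
assert has_cycle(broken_dag)
assert missing_default_args({"owner": "data-team"}, ["owner", "retries"]) == ["retries"]
```

Because the checks take the DAG description as input, the same test can be looped over all DAGs instead of being rewritten per DAG, which is the property the quote highlights.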