When designing Airflow operators, it's important to keep in mind that they may be executed more than once. Each operator should be idempotent, i.e. have the ability to be applied multiple times without producing unintended consequences.

Here is a brief overview of some terms used when designing Airflow workflows:

- Each Task is created by instantiating an Operator class. A configured instance of an Operator becomes a Task, as in: my_task = MyOperator(...).
- When a DAG is started, Airflow creates a DAG Run entry in its database.
- When a Task is executed in the context of a particular DAG Run, then a Task Instance is created.
- AIRFLOW_HOME is the directory where you store your DAG definition files and Airflow plugins.

The Airflow documentation provides more information about these and other concepts.

Prerequisites

Airflow is written in Python, so I will assume you have it installed on your machine. I'm using Python 3 (because it's 2017, come on people!), but Airflow is supported on Python 2 as well. I will also assume that you have virtualenv installed.

Let's create a workspace directory for this tutorial, and inside it a Python 3 virtualenv directory:

$ cd /path/to/my/airflow/workspace

Now let's install Airflow 1.8:

(venv) $ pip install airflow==1.8.0

Now we'll need to create the AIRFLOW_HOME directory where your DAG definition files and Airflow plugins will be stored. Once the directory is created, set the AIRFLOW_HOME environment variable:

(venv) $ cd /path/to/my/airflow/workspace
(venv) $ export AIRFLOW_HOME=`pwd`/airflow_home

You should now be able to run Airflow commands. Let's try by issuing the following:

(venv) $ airflow version

If the airflow version command worked, then Airflow also created its default configuration file airflow.cfg in AIRFLOW_HOME:

airflow_home
└── airflow.cfg

Default configuration values stored in airflow.cfg will be fine for this tutorial, but in case you want to tweak any Airflow settings, this is the file to change. Take a look at the docs for more information about configuring Airflow.

Next step is to issue the following command, which will create and initialize the Airflow SQLite database:

(venv) $ airflow initdb

The database will be created in airflow.db by default.

from datetime import datetime
from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator
from airflow.operators.python_operator import PythonOperator

def print_hello():
    return 'Hello world!'

dag = DAG('hello_world', description='Simple tutorial DAG',
          schedule_interval='0 12 * * *',
          start_date=datetime(2017, 3, 20), catchup=False)

dummy_operator = DummyOperator(task_id='dummy_task', retries=3, dag=dag)

hello_operator = PythonOperator(task_id='hello_task', python_callable=print_hello, dag=dag)

dummy_operator >> hello_operator

This file creates a simple DAG with just two operators: the DummyOperator, which does nothing, and a PythonOperator which calls the print_hello function when its task is executed.

In order to run your DAG, open a second terminal and start the Airflow scheduler by issuing the following commands:

$ cd /path/to/my/airflow/workspace
$ export AIRFLOW_HOME=`pwd`/airflow_home
$ source venv/bin/activate
(venv) $ airflow scheduler

You can reload the graph view until both tasks reach the status Success. When they are done, you can click on the hello_task and then click View Log.
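The idempotency advice above is worth making concrete. Here is a minimal sketch, independent of Airflow itself, of what an idempotent task callable might look like: the hypothetical save_report function (not part of the tutorial's DAG) derives its output path from the execution date and overwrites it, so a retry or backfill of the same run produces exactly the same result instead of duplicating data.

```python
import os
import tempfile

def save_report(execution_date, base_dir):
    """Write the day's report to a path derived from the execution date."""
    path = os.path.join(base_dir, 'report_%s.txt' % execution_date)
    # 'w' mode truncates the file: running this twice for the same
    # execution date leaves exactly one identical file behind.
    with open(path, 'w') as f:
        f.write('report for %s\n' % execution_date)
    return path

base = tempfile.mkdtemp()
first = save_report('2017-03-20', base)
second = save_report('2017-03-20', base)  # simulates a retry of the same run
assert first == second
```

This matters in Airflow because a task with retries configured (like dummy_task above, which has retries=3) may legitimately run several times for a single DAG Run.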