Basic Airflow Troubleshooting in QDS

Description:

If you are configuring an Airflow cluster in QDS, here are some things to check while troubleshooting:

  1. Airflow installation - Airflow is installed inside a virtual environment located at “/usr/lib/virtualenv/python27”. Airflow requires Python 2.7, which is available only in this virtualenv on Qubole cluster AMIs. Activate the virtualenv before invoking the airflow CLI.

  2. Location of service logs - Logs of the Airflow services (e.g. scheduler, webserver, celery) are available at “/media/ephemeral0/logs/airflow”. These services are started during cluster bringup, so refer to these logs when troubleshooting problems with bringing up the cluster.

  3. Airflow Home - The environment variable $AIRFLOW_HOME is permanently set to “/usr/lib/airflow” for all machine users. The Airflow configuration file (airflow.cfg), the DAGs (“dags” folder), and the logs (“logs” folder) are inside the $AIRFLOW_HOME folder. Note that logs of the jobs triggered by Airflow are available at $AIRFLOW_HOME/logs.

  4. Restarting Airflow scheduler -
    1. Become root - sudo su
    2. Activate the virtualenv - source /usr/lib/virtualenv/python27/bin/activate
    3. Stop the scheduler process - /usr/lib/hustler/bin/airflow/monit.sh stop /usr/lib/airflow/scheduler.pid
       This may report an error if the scheduler process is not running.
    4. Start the scheduler process - /usr/lib/hustler/bin/airflow/monit.sh start /usr/lib/airflow/scheduler.pid
    Note that steps 3 and 4 will be updated in an upcoming release, which will replace them with a generic script that can start or stop any service.
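
Before working through the checks above by hand, it can help to confirm that the locations this article names actually exist on the node. The following is a minimal sketch, not a Qubole-provided tool: the function names report_path and airflow_preflight are hypothetical, and the default paths are simply copied from the steps above, so override the variables if your cluster differs.

```shell
#!/usr/bin/env bash
# Pre-flight check for the locations named in this article.
# Assumption: the defaults below are the paths quoted in the article;
# set the variables in the environment to point somewhere else.

VENV_ACTIVATE="${VENV_ACTIVATE:-/usr/lib/virtualenv/python27/bin/activate}"
AIRFLOW_HOME="${AIRFLOW_HOME:-/usr/lib/airflow}"
SERVICE_LOG_DIR="${SERVICE_LOG_DIR:-/media/ephemeral0/logs/airflow}"

# report_path LABEL PATH
# Prints one status line; returns non-zero when PATH does not exist.
report_path() {
  if [ -e "$2" ]; then
    echo "OK       $1: $2"
  else
    echo "MISSING  $1: $2"
    return 1
  fi
}

# airflow_preflight
# Checks every location; the return code is the number of missing paths.
airflow_preflight() {
  local missing=0
  report_path "virtualenv activate script" "$VENV_ACTIVATE"            || missing=$((missing + 1))
  report_path "service log directory"      "$SERVICE_LOG_DIR"          || missing=$((missing + 1))
  report_path "Airflow config"             "$AIRFLOW_HOME/airflow.cfg" || missing=$((missing + 1))
  report_path "DAGs folder"                "$AIRFLOW_HOME/dags"        || missing=$((missing + 1))
  report_path "job logs folder"            "$AIRFLOW_HOME/logs"        || missing=$((missing + 1))
  return "$missing"
}
```

To use it, source the file and run airflow_preflight. Any line marked MISSING points at the item in the list above to investigate first. The script only reads the filesystem and does not start or stop any service.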
 