How To: Use Spark Logs

Where are the Spark logs?

1. hdfs://master:9000/spark-history - This directory contains one file per Spark
   application. Each file holds that application's event log, which is the data
   needed to show the Spark UI for that application.
2. /media/ephemeral0/logs/spark/autoscaling.log - This file is present on all
   cluster nodes. It contains the autoscaling logs for all Spark programs whose
   driver ran on that node.
3. /media/ephemeral0/logs/spark/spark-yarn-org.apache.spark.deploy.history.HistoryServer-1-localhost.localdomain.log -
   This file is present only on the cluster master node. It contains the Spark history server logs.
4. Container logs - Open the jobtracker UI, which takes you to the Spark application's web UI. On the
   Executors tab, click the stdout and stderr links of each executor to see its container logs.
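
When the Spark UI is not available, an event log file from hdfs://master:9000/spark-history can still be inspected by hand. The sketch below assumes the file has already been copied to the local disk, is uncompressed, and uses Spark's usual layout of one JSON object per line with an "Event" field. The function name count_events and the sample lines are made up for illustration.

```python
import json
from collections import Counter


def count_events(lines):
    """Count Spark listener events by type in an iterable of event-log lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            # Skip lines that are not valid JSON, such as a partially written
            # last line of an application that is still running.
            continue
        if isinstance(record, dict):
            counts[record.get("Event", "<unknown>")] += 1
    return counts


# Hypothetical sample lines; a real event log carries many more fields.
sample = [
    '{"Event": "SparkListenerApplicationStart", "App Name": "demo"}',
    '{"Event": "SparkListenerJobStart", "Job ID": 0}',
    '{"Event": "SparkListenerJobEnd", "Job ID": 0}',
    '{"Event": "SparkListenerJobStart", "Job ID": 1}',
]
summary = count_events(sample)
for event, n in summary.most_common():
    print(event, n)
```

To run it against a real file, pass an open file object instead of the sample list, for example count_events(open(path)) where path is the local copy of the event log. A job that has a SparkListenerJobStart event but no matching SparkListenerJobEnd is a reasonable first thing to look for when an application appears stuck.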
   
How do I debug Spark autoscaling?

Check the following logs:

/media/ephemeral0/logs/spark/autoscaling.log - Spark-level autoscaling logs, present on each node
/media/ephemeral0/logs/yarn/autoscaling.log - YARN ResourceManager (RM) autoscaling logs, present on the master node
/media/ephemeral0/logs/yarn/scaling.log - RM autoscaling logs for UI consumption, present on the master node
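
Because two of these three files exist only on the master node, a quick first step is to check which of them are present on the node you are logged in to, and then search those for a term of interest. The following is a minimal sketch: the paths are the ones listed above, while the helper names, the search keyword, and the sample log lines are assumptions (this article does not describe the log line format).

```python
import os

# Paths taken from the list above.
AUTOSCALING_LOGS = [
    "/media/ephemeral0/logs/spark/autoscaling.log",  # every node
    "/media/ephemeral0/logs/yarn/autoscaling.log",   # master node only
    "/media/ephemeral0/logs/yarn/scaling.log",       # master node only
]


def present_logs(paths=AUTOSCALING_LOGS, exists=os.path.isfile):
    """Return the subset of paths that exist on this node."""
    return [p for p in paths if exists(p)]


def matching_lines(lines, keyword):
    """Return the lines that contain keyword, ignoring case."""
    needle = keyword.lower()
    return [line.rstrip("\n") for line in lines if needle in line.lower()]


if __name__ == "__main__":
    for path in present_logs():
        with open(path, errors="replace") as handle:
            # "executor" is only an example keyword.
            for line in matching_lines(handle, "executor"):
                print(path + ": " + line)
```

On a worker node this would normally report only the Spark-level file. If neither YARN file is found, you are probably not on the master node.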
 
How do I debug notebooks?

If your paragraph is hanging:
    Try clicking the CANCEL button.
    If that does not work, go to the Interpreters page and restart the relevant interpreter.
    If that also fails, contact help@qubole.com for assistance.
In general, look in /usr/lib/zeppelin/logs for the Zeppelin logs as well as the Spark logs.
The same logs are also backed up to S3 under s3://<Account default location>/logs/hadoop/1234/56789/ec2-*.us-west-1.compute.amazonaws.com.master/zeppelin/logs, where:
    1234 is the cluster ID.
    56789 is the cluster instance ID.
    ec2-*.us-west-1.compute.amazonaws.com is a sample public DNS name of the cluster's master node.
That location holds several files. Some are for the Zeppelin daemon itself, and others are for the remote Spark interpreter started by Zeppelin.
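
The S3 backup location can be assembled from its parts when you need to look up logs for a specific cluster. This is a small sketch that follows the path layout shown above. The function name, the bucket in the example, and the DNS name in the example are hypothetical; substitute your account's default location and your cluster's actual values.

```python
def zeppelin_log_prefix(default_location, cluster_id, cluster_instance_id, master_dns):
    """Build the S3 prefix under which the Zeppelin logs are backed up.

    default_location is the account default location, for example
    "s3://example-bucket/qubole" (a made-up value).
    """
    base = default_location.rstrip("/")
    return "{}/logs/hadoop/{}/{}/{}.master/zeppelin/logs".format(
        base, cluster_id, cluster_instance_id, master_dns
    )


# Hypothetical values that mirror the sample path in this article.
example = zeppelin_log_prefix(
    "s3://example-bucket/qubole",
    1234,
    56789,
    "ec2-203-0-113-10.us-west-1.compute.amazonaws.com",
)
print(example)
```

The resulting prefix can then be passed to whichever S3 client you normally use to list or download the files.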
