Error: FileNotFound & SocketTimeout for _temporary/ folder & files

Symptom

While running a job in Hive or Spark, a set of error messages is returned with the following components:

  • FileNotFound for a folder named _temporary/
  • java.net.SocketTimeoutException for a file from the _temporary/ folder

Cause

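This can be caused by mapreduce.use.directfileoutputcommitter being set to false. With that setting, job output is first written under a _temporary/ folder and only moved to its final location when the job commits, so the job depends on files in _temporary/ being found and read later.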

Action

Set the following values:

  • mapred.output.committer.class=org.apache.hadoop.mapred.DirectFileOutputCommitter
  • mapreduce.use.directfileoutputcommitter=true
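
How these values are applied depends on how the job is submitted. The following is a minimal Python sketch, not a definitive procedure, that renders the two settings above as Hive SET statements and as spark-submit flags. The helper names (hive_set_statements, spark_submit_flags) are hypothetical, and the Spark form assumes that properties prefixed with spark.hadoop. are forwarded to the Hadoop configuration; verify both forms against your cluster before relying on them.

```python
# The two properties listed in the Action section of this article.
SETTINGS = {
    "mapred.output.committer.class":
        "org.apache.hadoop.mapred.DirectFileOutputCommitter",
    "mapreduce.use.directfileoutputcommitter": "true",
}


def hive_set_statements(settings):
    """Render each property as a session-level Hive SET statement."""
    return [f"SET {key}={value};" for key, value in settings.items()]


def spark_submit_flags(settings):
    """Render each property as a spark-submit --conf flag.

    Assumption: Spark copies spark.hadoop.* properties into the
    Hadoop configuration used by the job.
    """
    flags = []
    for key, value in settings.items():
        flags += ["--conf", f"spark.hadoop.{key}={value}"]
    return flags


if __name__ == "__main__":
    # Statements to paste at the top of a Hive query.
    print("\n".join(hive_set_statements(SETTINGS)))
    # Flags to append to a spark-submit command line.
    print(" ".join(spark_submit_flags(SETTINGS)))
```

For Hive, the generated SET statements go before the query that writes the output. For Spark, the generated flags are added to the spark-submit command.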

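Keep in mind that if the job fails midway, subsequent reruns may require manual cleanup, because the direct committer writes output straight to the final location and a failed job can leave partial files there.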
