Error: FileNotFound & SocketTimeOut for _temporary folder & files


While running a job in Hive or Spark, a set of error messages is returned with the following components:

  • FileNotFound for a folder named _temporary/
  • SocketTimeOut for a file from the _temporary/ folder


This can happen when mapreduce.use.directfileoutputcommitter is set to false.


Set the following values:

  • mapred.output.committer.class=org.apache.hadoop.mapred.DirectFileOutputCommitter
  • mapreduce.use.directfileoutputcommitter=true
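
As a minimal sketch, assuming the job is a Hive query and the values are set per session rather than cluster-wide, the two properties above can be set with Hive's SET command before the statement that writes output:

```sql
-- Assumption: run at the start of the Hive session, before the query that writes output.
-- Property names and values are the ones listed above.
SET mapred.output.committer.class=org.apache.hadoop.mapred.DirectFileOutputCommitter;
SET mapreduce.use.directfileoutputcommitter=true;
```

For Spark, Hadoop properties are generally passed with the spark.hadoop. prefix (for example spark.hadoop.mapreduce.use.directfileoutputcommitter=true); whether a given Spark build honors these committer settings is an assumption to verify on your cluster.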

Keep in mind that if the job fails midway, partial output may be left behind, and subsequent reruns may require manual cleanup.
