While running a job in Hive or Spark, a set of error messages is returned with the following components:
- FileNotFound for a folder named _temporary/
- java.net.SocketTimeoutException for a file from the _temporary/ folder
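This can be caused by having mapreduce.use.directfileoutputcommitter = false. With that setting, job output is first staged under _temporary/ and moved into its final location when the job commits.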
To resolve this, set mapreduce.use.directfileoutputcommitter = true.
Keep in mind that if the job fails midway, subsequent reruns may require manual cleanup.
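As a rough sketch of how the property could be passed at submission time, the commands below flip the value named above to true. This assumes the platform honors mapreduce.use.directfileoutputcommitter. The script names job.hql and job.py are hypothetical placeholders, and the Spark line relies on the spark.hadoop. prefix, which forwards a property to the Hadoop configuration.

```shell
# Hive: override the property for a single run
# (job.hql is a hypothetical script name)
hive --hiveconf mapreduce.use.directfileoutputcommitter=true -f job.hql

# Spark: the spark.hadoop. prefix passes the property to the Hadoop configuration
# (job.py is a hypothetical application file)
spark-submit --conf spark.hadoop.mapreduce.use.directfileoutputcommitter=true job.py
```

Inside an existing Hive session, the same override can likely be applied with a SET statement before the query runs.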