This document provides tips for getting started with eventual consistency in Amazon S3 and for managing it with Qubole. Amazon S3 buckets in the US Standard region provide eventual consistency. Amazon S3 buckets in all other regions provide read-after-write consistency for PUTs of new objects and eventual consistency for overwrite PUTs and DELETEs.
- AWS S3 FAQs - http://aws.amazon.com/s3/faqs/
- Eventual Consistency - http://docs.aws.amazon.com/AmazonS3/latest/dev/Introduction.html#ConsistencyModel
- Regions - http://docs.aws.amazon.com/general/latest/gr/rande.html
- Best Practices - http://docs.aws.amazon.com/redshift/latest/dg/managing-data-consistency.html
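To illustrate what eventual consistency means for a client, here is a minimal Python sketch that polls until a newly written object becomes visible. It is not part of Qubole or the AWS SDK: the function name wait_until_visible, its parameters, and the exists callable are hypothetical, and the caller is assumed to supply its own existence check against the object store.

```python
import time
from typing import Callable


def wait_until_visible(
    exists: Callable[[], bool],
    attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll a caller-supplied existence check with exponential backoff.

    `exists` is a hypothetical callable that returns True once the object
    can be seen in the store. Returns True as soon as it does, or False
    if the object is still not visible after `attempts` checks.
    """
    for attempt in range(attempts):
        if exists():
            return True
        # Back off before the next check, but do not sleep after the last one.
        if attempt < attempts - 1:
            sleep(base_delay * (2 ** attempt))
    return False


if __name__ == "__main__":
    # Simulated store: the object only becomes visible on the third check.
    state = {"checks": 0}

    def simulated_exists() -> bool:
        state["checks"] += 1
        return state["checks"] >= 3

    print(wait_until_visible(simulated_exists, base_delay=0.01))
```

A check like this only helps with newly created objects. For overwrite PUTs and DELETEs, an existence check cannot tell whether the version being read is the latest one.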
The following options are available in Qubole for reducing the likelihood of running into eventual consistency issues.
- Add the following parameter to the Hive Bootstrap so that it applies to HiveSQL queries:
set qubole.s3.standard.endpoint=<S3 endpoint for your region, as listed in the articles above>;
- Add the following parameter to the Hadoop Cluster Override in the Cluster Settings:
qubole.s3.standard.endpoint=<S3 endpoint for your region, as listed in the articles above>