How To: Presto Tuning

Presto SQL

Presto is ANSI SQL Compliant therefore users and developers can take advantage of traditional SQL Syntax which has historically been executed against RDBMS (Relational Database Management Systems). When working with Presto there are a few considerations to keep in mind to ensure successful and optimized execution of queries.

Parallelism Tuning

If the Split Spots on the node are not properly configured users may see jobs lag due to resource conflict. When all of the available Split Slots in the cluster are occupied incoming jobs will be delayed until the resources become free. If the Split Size Allocation is not sufficient to store the dataset then the incoming jobs may fail if the cluster is not able to grow any further.

Number of Splits in Cluster

node-scheduler.max-splits-per-node * Number of Nodes

Initial Split Size Allocation for Dataset

hive.max-initial-splits * hive.max-initial-split-size

Maximum Split Size Allocation for Dataset

hive.max-initial-splits * hive.max-split-size

Presto Join Syntax

When resolving SQL Joins Presto will take the table on the right side of the Join Clause and hash this table in Memory on the column(s) that are used for the join. As a result when executing SQL in Presto it is important to make sure that the smallest table appears on the right side of the Join Clause. If the larger table is on the right side of the join clause the SQL will still execute however there is a higher likelihood of sluggishness during execution.

Presto Table Sizes

Given that Presto has been designed for Ad-Hoc SQL Queries it performs best when querying tables that are partitioned and the best performance for Presto is achieved when less than 1 TB is brought into the cluster at a time. While Presto can manage up to 100 TB of data in the cluster at a time it is a best practice to use less than 1 TB.

Presto File Formats

Presto performs best with columnar formats therefore to achieve the best performance in Presto the source data should be in a columnar format such as ORC or Parquet. Conversion of the data into these formats prior to execution will ensure the most optimal runtime of the queries. In this case the burden is placed on the ETL that is building the files that are used for the Ad-Hoc queries instead of putting the burden on the Ad-Hoc queries themselves.

Presto Execution Optimizations

During execution Presto can take advantage of Vectorization in addition to Rubix Caching to further improve query performance. These optimizations are sophisticated and the greatest improvements are observed when taking advantage of these optimizations. To ensure that these optimizations are leveraged during execution in addition to leveraging column formats such as ORC and Parquet the data should be partially or fully sorted. If the source data is partially or fully sorted then these optimizations will provide the best possible Presto performance.

Memory Tuning

Presto features three Memory Pools to manage the available resources and two of the Memory Pools are responsible for data processing. All queries are initially submitted to the General Memory Pool however if the memory limit is exceeded then queries will get blocked until other queries finish and release the memory. If the General Memory Pool limit is reached Presto will move the biggest query running at the moment to Reserved Pool.

maxHeap

The JVM container size.

Defaults to up to 70% of Instance Memory

System Memory

The overhead allocation.

Defaults to 40% of maxHeap

Reserved Memory

Reserved for Largest Job

query.max-memory-per-node

General Memory

First stop for all Jobs

MaxHeap- Reserved Memory

Query Memory

Max memory for the Job

set session query_max_memory

Query Memory Limit

Memory used across Nodes

query.max-memory

Avoiding Issues

Presto will delay jobs when there are not enough Split Slots to support the Dataset and jobs will Fail when there is not enough memory to process the query. If any of the below apply to the current environment then the configuration is not powerful enough and users can expect job lag and failure.

Reserved Memory * Number of Nodes < Peak Job Size

Use larger instance.

General Memory * Number of Nodes < Average Job Size * Concurrent Jobs

Use larger instance.

Reserved Memory * Number of Nodes < Query Memory

Adjust Setting.

Reserved Memory * Number of Nodes < Query Memory Limit

Adjust Setting.

Have more questions? Submit a request

Comments

Powered by Zendesk