Qubole Release Notes 24-Aug-2016

Release Version: 36.0.0

For details of what has changed in this version of QDS, see What is New and List of Changes and Bug Fixes in this Release.


What is New

Container Packing in Hadoop 2

Qubole has introduced container packing in Hadoop 2, which when enabled results in better downscaling. It makes the scheduler pack containers on a smaller subset of nodes instead of scattering them across all nodes of the cluster. This increases the probability that some nodes remain unused, and such unused nodes become eligible for downscaling. For more information, see Container Packing in Hadoop 2.
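The difference between packing and scattering can be illustrated with a toy placement model. The sketch below is not Qubole's scheduler; the function names, node names, and capacities are hypothetical, and it only shows why packing tends to leave whole nodes idle.

```python
from typing import Dict, List


def place_containers(node_capacity: Dict[str, int],
                     num_containers: int,
                     pack: bool) -> Dict[str, int]:
    """Toy placement model (hypothetical, not the QDS scheduler).

    Returns the number of containers placed on each node.
    """
    placed = {node: 0 for node in node_capacity}
    nodes = sorted(node_capacity)  # fixed order keeps tie-breaking deterministic
    for _ in range(num_containers):
        candidates = [n for n in nodes if placed[n] < node_capacity[n]]
        if not candidates:
            raise ValueError("no free capacity left in the cluster")
        if pack:
            # Packing: choose the busiest node that still has room.
            target = max(candidates, key=lambda n: placed[n])
        else:
            # Scattering: choose the least-loaded node.
            target = min(candidates, key=lambda n: placed[n])
        placed[target] += 1
    return placed


def idle_nodes(placed: Dict[str, int]) -> List[str]:
    """Nodes with no containers, that is, candidates for downscaling."""
    return sorted(n for n, count in placed.items() if count == 0)


capacity = {"n1": 4, "n2": 4, "n3": 4, "n4": 4}
packed = place_containers(capacity, 4, pack=True)
scattered = place_containers(capacity, 4, pack=False)
print(idle_nodes(packed))     # nodes left unused when packing
print(idle_nodes(scattered))  # nodes left unused when scattering
```

With four nodes of four slots each and four containers, packing in this model fills n1 and leaves n2, n3, and n4 idle, while scattering puts one container on every node and leaves none idle.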

New Features/Enhancements in Spark

The new features and enhancements in Spark are:
Spark 2.0.0 and 1.6.2 versions are available with this QDS release.
Spark autoscaling improvements are now available in Qubole Spark, including fine-grained downscaling, downscaling of cached executors after an idle timeout, and support for open source dynamic allocation configurations.
Spark external shuffle service is enabled by default on Spark clusters, but only for Spark versions 1.6.1 and 1.6.2.
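
For reference, the open source Spark properties that govern dynamic allocation and the external shuffle service can be assembled as spark-submit arguments. This is a minimal sketch: the property names come from open source Spark, but the helper functions and the values are illustrative assumptions, and these notes do not state which of them Qubole Spark honors or what defaults it uses.

```python
from typing import Dict, List


def dynamic_allocation_conf(min_executors: int,
                            max_executors: int,
                            idle_timeout_s: int,
                            cached_idle_timeout_s: int) -> Dict[str, str]:
    """Build open source Spark dynamic allocation settings (values are examples)."""
    return {
        # Dynamic allocation relies on the external shuffle service.
        "spark.shuffle.service.enabled": "true",
        "spark.dynamicAllocation.enabled": "true",
        "spark.dynamicAllocation.minExecutors": str(min_executors),
        "spark.dynamicAllocation.maxExecutors": str(max_executors),
        # Release executors that have been idle for this long.
        "spark.dynamicAllocation.executorIdleTimeout": f"{idle_timeout_s}s",
        # Release idle executors that hold cached data after this long.
        "spark.dynamicAllocation.cachedExecutorIdleTimeout": f"{cached_idle_timeout_s}s",
    }


def to_spark_submit_args(conf: Dict[str, str]) -> List[str]:
    """Render the settings as repeated --conf key=value arguments."""
    args: List[str] = []
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    return args


print(" ".join(to_spark_submit_args(dynamic_allocation_conf(2, 20, 60, 600))))
```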


UI Enhancements

These are the UI enhancements:

The name field for a command can be cleared out.
Instance Profile of a cluster can now be edited from the cluster configuration UI in the Control Panel.
Tags are now supported in Templates.

 

 

New Beta Features

 

New Interpreter Mode Configuration Option in the Spark Cluster UI Configuration

Qubole has now introduced a new Interpreter mode configuration option. It is aimed at allowing a fair distribution of cluster resources among different users. This feature is available for beta access. Contact help@qubole.com to enable this feature on an account.

 

Presto Ruby Client

 

The Presto Ruby client replaces the current approach of connecting to the master node through SSH, starting a Presto client JVM, and submitting the query to the Presto server.
With the Presto Ruby client, Qubole's webservers submit commands directly to the Presto server running on the master node through REST calls.


This feature is available for beta access. Contact help@qubole.com to enable this feature on an account.

The feature provides the following benefits:

  • Faster query submission (an improvement in overall time): you save the time it takes to connect through SSH and spin up the client JVM.
  • A faster feedback loop: you learn about invalid queries much sooner.
  • Higher query concurrency. Today, concurrency is also limited by the number of client JVMs that can run on the master node. The Presto Ruby client removes this limitation because the clients now run outside the master node and are distributed across Qubole's webservers.
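
The REST-based flow described above can be sketched against the open source Presto client protocol, in which a client POSTs the SQL text to /v1/statement and then follows each nextUri until the response has none. This is an illustration in Python, not Qubole's Ruby client; the host, port, user, and helper names are hypothetical.

```python
import json
import urllib.request
from typing import Any, Callable, Dict, List


def build_statement_request(server: str, sql: str, user: str,
                            catalog: str = "hive",
                            schema: str = "default") -> urllib.request.Request:
    """Build the initial POST that submits a query to a Presto server."""
    return urllib.request.Request(
        url=f"{server.rstrip('/')}/v1/statement",
        data=sql.encode("utf-8"),
        headers={
            "X-Presto-User": user,
            "X-Presto-Catalog": catalog,
            "X-Presto-Schema": schema,
        },
        method="POST",
    )


def http_fetch(request: urllib.request.Request) -> Dict[str, Any]:
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def run_query(server: str, sql: str, user: str,
              fetch: Callable[[urllib.request.Request], Dict[str, Any]] = http_fetch
              ) -> List[list]:
    """Submit a query and collect rows by following nextUri links."""
    rows: List[list] = []
    page = fetch(build_statement_request(server, sql, user))
    while True:
        if "error" in page:
            # A rejected query is reported in the response, with no SSH or JVM startup.
            raise RuntimeError(page["error"].get("message", "query failed"))
        rows.extend(page.get("data", []))
        next_uri = page.get("nextUri")
        if next_uri is None:
            return rows
        page = fetch(urllib.request.Request(
            next_uri, headers={"X-Presto-User": user}, method="GET"))


# Example call (placeholder host and port):
# rows = run_query("http://presto-master.example:8081", "SELECT 1", "alice")
```

Because each submission is a plain HTTP request, no client JVM has to start on the master node, which is the source of the latency and concurrency benefits listed above.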

 

List of Changes and Bug Fixes in this Release


AWS CLUSTER MANAGEMENT

Fix   ACM-446: Fixed a cluster start failure that occurred when a persistent security group with the same name exists in multiple VPCs.
Fix   ACM-482: Users are now notified about cluster failures caused by client-side AWS errors that can be easily resolved. Notifications are sent for the following errors: InsufficientFreeAddressesInSubnet, InstanceLimitExceeded, VPCIdNotSpecified, InsufficientInstanceCapacity, InvalidSubnetID.NotFound, PendingVerification, and Unsupported (instance type not available in the requested availability zone). Contact help@qubole.com to enable this feature on an account.
Change   ACM-312: Exposed API endpoints to SET/GET the SSH key pair at the account level and to GET the key pair at the cluster-instance level.

 


HADOOP-2

Fix   HADTWO-401: Moved non-error logs to INFO level in NativeS3FileSystem.

Fix   HADTWO-408: Reservation is now enabled only when the cluster is at its maximum number of nodes, so that autoscaling gets higher priority.
Fix   HADTWO-421: Added assume role support in the S3A filesystem. With this change, Hive queries run with the S3A filesystem on an IAM-Roles-based account.
Fix   HADTWO-470: Reduced the maximum number of retries that an ApplicationMaster makes and the IPC client timeout interval before containers on lost nodes are killed, which speeds up cleanup for lost nodes.
Fix   HADTWO-496: Increased the priority of mapper containers that are relaunched because of node loss, to avoid deadlock.
Change   HADTWO-234: Added the hdfs dfs -find subcommand, which is similar in functionality to the UNIX find command. Run hdfs dfs -help find on a cluster for details on how to use this subcommand.
Change   HADTWO-387: Added a plain text log link for running applications in the job history page.
Change   HADTWO-447: Added spot rebalancing support in Hadoop 2.
Change   HADTWO-498: Qubole has introduced container packing, which when enabled results in better downscaling.
Change   HADTWO-521: Autoscaling did not kick in until the job progressed to 1%.

 

HBASE

Fix   HBAS-177: Fixed a problem where users could not view all the statistics on the HBase web server UI.

 

HIVE 0.13

New   HIVE-1074: Hive logs are now exported periodically from the cluster master to the default location.

Fix   HIVE-1506: Fixed an issue that caused drop table commands on a managed table to succeed even when the actual data deletion failed.

 

HIVE 1.2

New   HIVE-1074: Hive logs are now exported periodically from the cluster master to the default location.
Change   HIVE-1509: Fixed an issue in Hive that caused inserting an empty array into a Parquet table to fail with ParquetEncodingException. Open source issue: HIVE-13632.

 

QDS

New   UI-3832: The Explore UI remembers the last expanded database schema and opens it by default the next time.

New   UI-3936: You can now configure the interpreter mode for Spark notebooks on the cluster configuration page. This feature is available for beta access. Contact help@qubole.com to enable this feature on an account.
Fix   SQOOP-32: Added support for data export to a Redshift data store on IAM-Role-based accounts.
Fix   UI-3453: Until now, the Usage page blocked users from performing other actions on the page. With this fix, users can navigate to the Usage and Report sections irrespective of whether data has loaded on all widgets.
Fix   UI-3591: mailto links are now saved correctly from the branding UI in the Control Panel.
Fix   UI-3822: The name field for a command can be cleared out.
Change   UI-3078: Instance Profile of a cluster can now be edited from the cluster configuration UI in the Control Panel.
Change   UI-3553: Tags are now supported in Templates.
Change   UI-3554: Added the Done parameter to the status drop-down list in the Scheduler v2 UI.


SPARK

New   SPAR-907: Spark version 2.0.0 is available with this QDS release.

Change   SPAR-1106: Added support for Spark version 1.6.2.

Change   SPAR-1173: Spark external shuffle service is enabled by default for Spark clusters.
Change   SPAR-1105: Spark autoscaling improvements are now available in Qubole Spark, including fine-grained downscaling, downscaling of cached executors after an idle timeout, and support for open source dynamic allocation configurations.

 

TEZ

Fix   QTEZ-56: The Tez component is now initialized before the node bootstrap in a Hadoop 2 cluster.

 

ZEPPELIN/NOTEBOOKS

Fix   ZEP-242: Import and export of notebooks are now available.

Fix   ZEP-408: With this fix, node bootstrap changes are no longer needed to enable SparkR in Zeppelin; it is enabled by default. This change also includes a sample SparkR notebook in the notebook library.

 

List of Hotfixes Since 11th August 2016

None
