Qubole Release Notes 09-June-2016

Release Version Number: 33.3.0


For details of what has changed in this version of QDS, see What’s New and List of Changes and Bug Fixes in this Release.

A Heads-up on the Formatted Results for Spark SQL Commands Feature Roll-out


Qubole will enable formatted results for Spark SQL commands run through the Analyze command/query composer on 16th June 2016. After this feature is enabled, the results of any Spark SQL command run through the Analyze command/query composer will include the schema and will be tab-separated.
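Scripts that consume Spark SQL results may need to adapt to the new format. The following is a minimal Python sketch of reading tab-separated output. It assumes, purely for illustration, that the schema arrives as a first line of column names; the exact layout is not specified in these notes, and parse_formatted_results and the sample text are hypothetical.

```python
import csv
import io


def parse_formatted_results(text):
    """Split tab-separated result text into (columns, records).

    Assumption: the first line holds the column names (the schema).
    """
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = list(reader)
    if not rows:
        return [], []
    return rows[0], rows[1:]


# Hypothetical sample output, not taken from a real QDS command.
sample = "id\tname\n1\talice\n2\tbob\n"
columns, records = parse_formatted_results(sample)
print(columns)
print(records)
```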


What’s New


New Signup/Login Page for Qubole on AWS


Qubole enabled a new signup/login page for Qubole on AWS on 27th May 2016. This page is mobile compatible, has better typography, and is more in tune with www.qubole.com. It also fixes a SAML sign-up bug.


HDFS-based Up-Scaling for Hadoop-1 Clusters


Hadoop-1 clusters in QDS can now scale up if HDFS is running low on space. (All clusters already take available space into account while downscaling.) HDFS capacity-based up-scaling is turned off by default and can be turned on by setting mapred.hustler.dfs.autoscale.enable = true in the Hadoop overrides for the cluster.
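As a rough illustration, the Python sketch below renders the setting as text that could be pasted into the Hadoop overrides field. It assumes overrides are entered as one key=value pair per line; format_hadoop_overrides is a hypothetical helper and does not call any QDS API.

```python
def format_hadoop_overrides(overrides):
    """Render a dict of Hadoop settings as newline-separated key=value lines.

    Assumption: the overrides field accepts one key=value pair per line.
    """
    lines = []
    for key, value in overrides.items():
        # Hadoop expects lowercase boolean literals ("true"/"false").
        if isinstance(value, bool):
            value = "true" if value else "false"
        lines.append(f"{key}={value}")
    return "\n".join(lines)


overrides_text = format_hadoop_overrides(
    {"mapred.hustler.dfs.autoscale.enable": True}
)
print(overrides_text)
```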


Spark External Shuffle Service


The Spark external shuffle service is an auxiliary service which runs as part of the YARN NodeManager on each slave node in a Spark cluster. Qubole provides Spark external shuffle service in the following Spark versions:

  • Spark 1.5.1
  • Spark 1.6.0
  • Spark 1.6.1

Executors offload shuffle data to the external shuffle service, allowing those executors to be downscaled. This will help in auto-scaling Spark clusters effectively.
With Qubole Spark, the external shuffle service is optional and Qubole-based Spark job-level auto-scaling works whether or not the external shuffle service is enabled. See Spark External Shuffle Service for more information. 


Support to Start and Assign a Cluster from First Class Notebooks


Qubole supports starting a cluster from a First Class Notebook (FCN) and assigning a cluster to a notebook if the notebook does not have an assigned cluster. See Viewing a Notebook for more information.


UI Support to Select Private Subnets for a Cluster in a VPC


The QDS UI supports specifying a private subnet ID for the corresponding VPC. The QDS UI also allows you to add a bastion host DNS. See EC2 Settings for more information. 


Scheduler v2 UI Keyboard Shortcuts


Keyboard shortcuts are now available in the Scheduler v2 UI. Pressing Ctrl + / shows a list of available keyboard shortcuts.


Download Table Results in CSV/TSV/Raw Format from a Notebook


Qubole Notebooks support downloading table results of paragraphs in a CSV, TSV, or raw format. See Using a Notebook for more information. 


List of Changes and Bug Fixes in this Release



New  ACM-222: API Support to start/stop Airflow Clusters

New  ACM-144: A previous release added the capability to heuristically choose the most economical AZ in a region while launching master and slave nodes in EC2 Classic and the default VPC. This change enables the same for any VPC: multiple subnets (in different AZs) can be specified, and an AZ is chosen using the same criteria. The option to specify multiple subnets in the cluster configuration is currently available only through the API and does not work for all-spot clusters.

Contact help@qubole.com to enable this feature for clusters in your account.

New  ACM-204: Adds a feature to parallelize the launch of master and slave nodes in all-spot clusters. This can reduce the time taken to launch all-spot clusters.

Contact help@qubole.com to enable this feature for clusters in your account.



New  HAD-435: Autoscaling based on HDFS space availability.

Change  mapred.refresh.min.cluster.size has been set to true by default. This is required to minimize the cluster size using the PUSH configuration.




Fix  HADTWO-432: Fixed the "Command Killed - Job keeps running" bug.


HIVE 1.2


Fix  HIVE-1312: Writing HiveDecimal to ORC can wrongly suppress the present stream. Reference: open source JIRA HIVE-13083.


HIVE 0.13


Fix  HIVE-1314: Removed redundant round-trip calls to the metastore for loading Hive functions. This should reduce latency as well as MySQL CPU usage for Hive queries.





Fix  QPIG-50: Results can be stored in S3 with a -schema flag.



New  PRES-548: You can use a Ruby client for Presto queries instead of a Java client. This reduces the latency by 3-4 seconds.

Contact help@qubole.com to enable this feature in your account.

Fix  PRES-540: Uses a single instance of the credential provider in IAM role-based setups to prevent overloading the metadata service.

Change  PRES-615: Added support for non-equi inner joins by converting them to a cross join with a filter node for the non-equi join condition.
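The rewrite described in PRES-615 can be illustrated outside Presto. The Python sketch below shows why an inner join on a non-equality condition gives the same rows as a cross join followed by a filter. The tables, the in_range condition, and the function name are hypothetical, and this is not Presto's implementation.

```python
from itertools import product

# Hypothetical tables: (order_id, amount) and (tier, low, high).
orders = [("o1", 5), ("o2", 20)]
tiers = [("low", 0, 10), ("high", 10, 100)]


def inner_join_non_equi(left, right, condition):
    """Evaluate an inner join as a cross join followed by a filter."""
    # product() enumerates every (left, right) pair, i.e. the cross join;
    # the condition then acts as the filter on the joined rows.
    return [(l, r) for l, r in product(left, right) if condition(l, r)]


def in_range(order, tier):
    # Non-equi join condition: tier.low <= order.amount < tier.high
    return tier[1] <= order[1] < tier[2]


matches = inner_join_non_equi(orders, tiers, in_range)
print(matches)
```

A cross join pairs every row on the left with every row on the right, so this approach can be expensive when both inputs are large.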





New  QBOL-4922: API Support for submitting Airflow commands via a QDS Shell command

New  QBOL-4924: API Support to perform CRUD operations on an Airflow cluster

New  UI-2645: You can start/configure a cluster for read-only notebooks in order to make them editable.

New  UI-3125: UI support for Private subnets.

New  UI-3506: Qubole now allows file upload to a non-existent S3 path

New  UI-3619: Added message in Explore UI to mention that sample rows are a subset of the entire dataset

Fix  QBOL-5256: Proper API response for new result end point

Fix  QBOL-5342: Reduced verbosity of logs due to Paramiko

Fix  UI-2754: Keyboard shortcuts are now available in Scheduler v2. Pressing Ctrl + / shows a list of available keyboard shortcuts.

Fix  UI-3631: Fixed a bug so that the cluster status is updated in the left panel of a First Class Notebook.

Fix  UI-3679: In a Notebook, the last cells/paragraphs at the bottom of the page were not visible. The issue has been fixed and the last cell/paragraph as well as the Remove button are now visible.

Fix  UI-3680: Fixed an issue where the scroll bar in the Analyze page was not working.

Change  QBOL-4824: Shows an error message if an unauthorized command is accessed via the Analyze page.

Change  QBOL-5193: Composite commands are now supported for IAM Roles in addition to IAM Keys

Change  UI-3651: Removed AWS credentials and default location for a Qubole-managed account.





New  SPAR-721: Spark External shuffle service. Executors offload shuffle data to the external shuffle service, allowing those executors to be downscaled. This will help in autoscaling Spark clusters effectively.




New  ZEP-291: Table results of paragraphs in notebooks can be downloaded (in TSV, CSV or raw formats)

Fix  ZEP-172: Notification regarding the status (Success/Failure) of restarting the interpreter is displayed on restart.


List of Hotfixes Since 26th May 2016





