Qubole Release Notes 29-Sep-2016

Release Version: 37.1.0

For details of what has changed in this version of QDS, see What is New and List of Changes and Bug Fixes in this Release.

What is New

UI Enhancements
The cluster label has been added to the All Commands Report JSON output.

New Beta Features

Bootstrap Spark Notebook and Persistent Spark Interpreter

With this QDS release, Qubole supports a bootstrap notebook configuration in a Spark interpreter (SI). A bootstrap notebook runs before any paragraph that uses the associated SI, in any notebook, is run.

To configure this, add the interpreter property zeppelin.interpreter.bootstrap.notebook with that notebook’s ID (a system-generated ID) as the property value. The ID is available in the Notebook Details on the Notebooks UI. This property is set to null by default in new notebooks.

An interpreter runs only when a command is run on the notebook. To avoid issues caused by the absence of a running interpreter, Qubole supports an SI that runs continuously without interruption. Qubole automatically restarts such an SI if the driver crashes, or if the interpreter is stopped manually or programmatically using sc.stop. Such an interpreter is called a persistent SI.
To configure this, add zeppelin.interpreter.persistent as an interpreter property with its value set to true. This property is set to false by default for new SIs.

The Spark bootstrap notebook and persistent SI features are not enabled by default. Contact help@qubole.com to enable these features for the Qubole account.

The bootstrap notebook and persistent SI properties are independent of each other. See the documentation for more information. 
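
As a minimal sketch of these two properties (the notebook ID 2345 is a hypothetical placeholder; use the system-generated ID shown in the Notebook Details), the interpreter settings would look like:

zeppelin.interpreter.bootstrap.notebook = 2345
zeppelin.interpreter.persistent = true

With these values, the notebook with ID 2345 runs before other paragraphs that use this SI, and the SI is restarted automatically if it stops.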

List of Changes and Bug Fixes in this Release

Airflow

New   QBOL-5706: Example DAGs are now available in the Airflow Webserver Dashboard. The examples include usage of Qubole and other operators shipped with Airflow.

HADOOP 2

Fix   HADTWO-568: With this fix, dfs.datanode.max.transfer.threads is auto-configured based on the instance type.
Change   HADTWO-616: Rubix is now available on Hadoop 2 clusters. Hive jobs on MR and Tez can use Rubix to cache data by setting:
set fs.s3n.impl=com.qubole.rubix.hadoop2.CachingHadoop2FileSystem;
set fs.s3.impl=com.qubole.rubix.hadoop2.CachingHadoop2FileSystem;
Change   HADTWO-618: Continuous Scheduling is now enabled by default in Hadoop 2.


PRESTO

Fix   PRES-721: Presto now throws an error if the sum of query.max-memory-per-node and resources.reserved-system-memory is greater than the heap memory. resources.reserved-system-memory, if not set explicitly, defaults to 40% of the heap memory. The heap memory is the configured Java -Xmx value; if it is not set, it defaults to 70% of the memory of the cluster's slave instances (see the illustrative settings after this list).
Fix   PRES-748: By default, the Rubix split size with Presto is now equal to Presto's hive.max-split-size configuration. It can be overridden by setting hadoop.cache.data.split.size.
Fix   PRES-761: Intermittent HIVE_CURSOR_ERROR failure in queries with Rubix enabled is now fixed.
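
As a worked example of the PRES-721 check (all numbers here are hypothetical): on a node where the Presto JVM heap (-Xmx) is 30GB, resources.reserved-system-memory defaults to 40% of the heap, that is 12GB, so query.max-memory-per-node can be at most 18GB. A config.properties sketch that passes the check:

query.max-memory-per-node=16GB
resources.reserved-system-memory=12GB

Here 16GB + 12GB = 28GB, which fits within the 30GB heap; raising query.max-memory-per-node to 20GB would instead trigger the new error.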


QDS

Fix   QBOL-5723: Fixed the error Error parsing the heterogeneous config, which appeared in logs when an empty value was passed for the heterogeneous configuration.
Fix   UI-3639: Fixed an issue in the Control Panel where the left menu scrolled off the page header.
Fix   UI-4142: Fixed an issue in the Manage Clusters tab of the Control Panel, where the checkbox state of Ganglia monitoring was not saved when creating or editing an HBase cluster.
Change   UI-4066: The helpdesk ticket priority option has been replaced with the ticket severity option in Help Center > Submit a ticket.
Change   UI-4377: The cluster label has been added to the All Commands Report JSON output.


SPARK

Fix   SPAR-1127: Spark autoscaling now has better executor failure tolerance, computed based on the maximum number of executors.
Fix   SPAR-1205: Fixed the Spark DAG visualization for Spark-1.5.1.
Fix   SPAR-1263: Fixed crash in saveAsNewAPIHadoopFile for non-S3 save operations.
Change   SPAR-1237: Spark autoscaling improvements such as fine-grained downscaling, downscaling of cached executors after an idle timeout, and support for open-source dynamic allocation configs are now available for Spark-2.0.0 (see the example configuration after this list).
Change   SPAR-1255: The Spark external shuffle service is now enabled by default for Spark 2.0.
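
The open-source dynamic allocation configs referenced in SPAR-1237 can be set in spark-defaults.conf; the values below are illustrative examples only, not Qubole defaults:

spark.dynamicAllocation.enabled true
spark.shuffle.service.enabled true
spark.dynamicAllocation.minExecutors 2
spark.dynamicAllocation.maxExecutors 50
spark.dynamicAllocation.executorIdleTimeout 60s
spark.dynamicAllocation.cachedExecutorIdleTimeout 600s

With cachedExecutorIdleTimeout set, executors holding cached data are also released after the idle timeout, matching the cached-executor downscaling described above.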


ZEPPELIN/NOTEBOOKS

New   ZEP-195: Qubole now supports a bootstrap notebook configuration and a persistent Spark interpreter (SI). See Bootstrap Spark Notebook and Persistent Spark Interpreter under New Beta Features above for configuration details. These features are not enabled by default; contact help@qubole.com to enable them for the Qubole account.

Fix    ZEP-510: GitHub linking now accepts files and folders with spaces and other characters that are invalid in URLs.


List of Hotfixes Since 21st September 2016


Fix   HADTWO-627: Fixed issues in FairScheduler caused by a dot in the username.
