Qubole Release Notes for QDS-on-AWS Version R50 23-Jan-2018

Release Version: 50.9.0

Qubole provides Qubole Data Service (QDS) on Amazon Web Services (AWS), Microsoft Azure, and Oracle Cloud Infrastructure (OCI).

For details of what has changed in this version of QDS-on-AWS, see:

  • What is New in QDS-on-AWS
  • New Alpha/Beta Features in QDS-on-AWS
  • List of Changes and Bug Fixes in QDS-on-AWS


What is New in QDS-on-AWS
 

New in AWS Cluster Management

These are the new enhancements in AWS clusters in this QDS release:

  • Spot Instance Requests will now have tags attached to them.
  • On-Demand nodes’ tagging and create operations are combined into one atomic operation now.
  • QDS supports defining default account-level cluster tags, which can be set on the Account Settings page of the Control Panel. Users can define user-level EC2 tags. This feature is not enabled by default on a QDS account. Create a ticket with Qubole Support to enable this feature on your account.
    Values for user-level tags are captured when a user logs in using SAML. The mapping between the attribute name (in the SAML response) and the tag name must be defined in the SAML provider admin page.

    If account-level default cluster tags and user-level EC2 tags exist, they are prepopulated when a cluster is created from the UI and are visible to the user creating the cluster. For more information, see the documentation.
  • The Idle Cluster Timeout setting at the account level and cluster level can now be configured in hours and minutes, taking advantage of AWS per-second billing.

    This feature is not enabled by default. Create a ticket with Qubole Support to enable this on the QDS account.

  • QDS supports AWS G3 and P3 instance families.

  • Users cannot run a command from the Analyze page on a cluster that they are not permitted to start. This feature is not enabled by default. Create a ticket with Qubole Support to enable this on the QDS account.
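
The combination of account-level default tags and SAML-derived user-level tags described above can be sketched as follows (a hypothetical helper; the function and field names are assumptions, not QDS internals):

```python
def resolve_cluster_tags(account_defaults, saml_attributes, tag_mapping):
    """Merge account-level default cluster tags with user-level tags
    captured from a SAML response.

    account_defaults: tag name -> value, from Account Settings.
    saml_attributes:  attribute name -> value, from the SAML response.
    tag_mapping:      SAML attribute name -> EC2 tag name, as defined
                      in the SAML provider admin page.
    """
    tags = dict(account_defaults)  # account-level defaults come first
    for attr, tag_name in tag_mapping.items():
        if attr in saml_attributes:
            tags[tag_name] = saml_attributes[attr]  # add user-level tags
    return tags
```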

   

 

New Parameters for Server-side Encryption in Hadoop DistCp

 

The following parameters can be set for server-side encryption in Hadoop DistCp along with the other parameters:

  • s3ServerSideEncryption: Enables encryption of data at the object level as S3 writes it to disk.
  • s3SSEAlgorithm: Specifies the encryption algorithm. If you do not specify it but s3ServerSideEncryption is enabled, the AES256 algorithm is used by default. Valid values are AES256, SSE-KMS, and SSE-C.
  • encryptionKey: If SSE-KMS or SSE-C is specified as the algorithm, use this parameter to specify the key with which the data is encrypted. If the algorithm is SSE-KMS, the key is not mandatory because AWS KMS is used. If the algorithm is SSE-C, you must specify the key or the job fails.

For more information, see this documentation.
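
The parameter rules above can be summarized as a small validation sketch (illustrative only; this is not QDS code, and the function name is an assumption):

```python
def validate_sse_params(s3_server_side_encryption, s3_sse_algorithm=None,
                        encryption_key=None):
    """Return the effective algorithm after applying the rules above."""
    if not s3_server_side_encryption:
        return None  # encryption disabled; other parameters are ignored
    algorithm = s3_sse_algorithm or "AES256"  # AES256 is the default
    if algorithm not in ("AES256", "SSE-KMS", "SSE-C"):
        raise ValueError("invalid s3SSEAlgorithm: %s" % algorithm)
    if algorithm == "SSE-C" and encryption_key is None:
        # SSE-C requires a customer-provided key, otherwise the job fails
        raise ValueError("encryptionKey is required for SSE-C")
    # For SSE-KMS the key is optional: AWS KMS is used when it is absent.
    return algorithm
```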

 

New Enhancements in Hive

These are the Hive enhancements in this release:

  • Metrics are added for:
    • The number of queries executed for each execution engine through HiveServer2 in Hive 2.1
    • The number of Hive operations waiting for compilation through HiveServer2 in Hive 2.1
    • The number of open/closed/abandoned sessions through HiveServer2 in Hive 2.1
    • The execution lifecycle times of queries fired through HiveServer2 in Hive 2.1

         These metrics are available in HS2 UI and Ganglia.

  • The QDS UI supports adding a personal Hive bootstrap in Control Panel > Hive Bootstrap. For more information, see the documentation.
  • Hive tables to which users do not have read access are no longer visible in the QDS UI, either in query results or in the Hive metastore view on the Analyze and Explore pages.
  • HiveServer2 metrics are now reported to the Datadog monitoring service. The metrics were already visible in Ganglia monitoring; from this release, they are also available in Datadog.

 

New Enhancements in Presto

These are the Presto enhancements in this release:

  • The Presto Datadog integration has been revamped, and the following metrics are now collected and reported to Datadog from Presto. For more information, see the documentation.
    • Failed Query count
    • Finished Query count
    • Running Query count
    • Average planning time across queries
    • Maximum GC time across all nodes in the cluster
    • Maximum GC count across all nodes in the cluster
    • Number of worker nodes seen by Presto
    • Service Check of Presto Service
    • Read request rate in cluster
    • Number of failures in requests from master to slave nodes
  • The query.max-memory default value has been changed to 100TB. To limit the maximum memory a query can use, override this with a lower value, either as a Presto override or by setting the query_max_memory session property.
  • Qubole Presto supports user overridden IAM roles. This feature is not enabled by default. Create a ticket with Qubole Support to enable this on the QDS account.

  • A query.max-execution-time configuration property and a query_max_execution_time session property have been added. These properties enforce a time limit on query execution. Unlike the existing query.max-run-time property, which enforces a time limit starting from query creation, these properties only consider the time spent in the query execution phase.

  • The Postgres jar version has been downgraded back to 9.3-1102-jdbc41 in 0.180 to maintain compatibility with Redshift.
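
The difference between the query.max-run-time and query.max-execution-time limits noted above can be sketched as follows (hypothetical helper; Presto enforces these limits internally, this just illustrates the accounting):

```python
def limits_exceeded(queued_secs, execution_secs, max_run_time, max_execution_time):
    """Show which limit a query trips.

    query.max-run-time counts from query creation (queueing plus execution);
    query.max-execution-time counts only the execution phase.
    """
    total = queued_secs + execution_secs
    return {
        "max_run_time": total > max_run_time,
        "max_execution_time": execution_secs > max_execution_time,
    }
```

A query queued for 50 seconds that executes for 20 seconds exceeds a 60-second run-time limit but stays within a 30-second execution-time limit.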

New Enhancements in Qubole Scheduler

These are the Qubole Scheduler enhancements in this release:

  • ACL permissions can now be set for a schedule (Read, Update, Delete, Clone, or Manage permission). The options a user sees depend on the permissions that are allowed or denied. The ACLs can be set through the Scheduler UI and the Object Policy API. For more information, see the UI documentation and API documentation.

Now, a user must have Scheduler and Command resource permissions (Control Panel > Manage Roles > Policy Actions) to see Schedules and Schedule Instances on the UI.

New Enhancements in Spark

These are the Qubole Spark enhancements in this release:

  • The default Spark version in the Spark cluster’s UI drop-down list has changed to 2.1-latest. Select the version carefully while creating new Spark clusters.
  • The 36-hour timeout for Spark Streaming jobs run from the Analyze page has been removed. This feature is available for alpha access.
  • Performance optimizations have been made in the INSERT INTO/OVERWRITE flow for Hive tables in Spark. Spark now writes files directly to the final destination instead of writing to a temporary staging directory, which improves performance.

  • There is a change affecting users who use QuboleDBTap to access data stores in Spark.

    The QuboleDBTap class and its companion object have been copied from com.qubole.QuboleDBTap to org.apache.spark.sql.qubole.QuboleDBTap for Spark 2.0.0 and later versions.

    com.qubole.QuboleDBTap will still be maintained for backward compatibility with all existing versions of Spark. However, Qubole strongly recommends migrating from com.qubole.QuboleDBTap to org.apache.spark.sql.qubole.QuboleDBTap, because support for com.qubole.QuboleDBTap will be removed starting with Spark 2.3.0. QuboleDBTap and its methods can then only be used by importing org.apache.spark.sql.qubole.QuboleDBTap.

 

New Enhancements in Notebooks

These are the Notebooks' enhancements in this release:

  • QDS provides Scheduler support on the Dashboards UI. Notebook Dashboards can be refreshed periodically by setting an interval and selecting the refresh-periodically option.
  • A new option to configure Notebook Dashboards has been added.
  • You can use the TAB key to autocomplete non-markdown paragraphs in a notebook.
  • QDS displays the opened Dashboard name in the browser tab.
  • Spark interpreter properties have been simplified by removing unnecessary default interpreter properties; this simplification is enabled by default and is reflected only in new Spark interpreters.
  • QDS maintains a notebook’s history when it is run from the Analyze and Scheduler UI.
  • A user can upload or download a file to/from an S3 location through the Analyze and Notebooks pages. This feature provides:
    • Permission to download and upload an S3 file, controlled through the granular permission model
    • The ability to hide the object store widget for certain users
  • QDS now supports auto-complete for SQL. Users can see suggestions for Hive keywords, functions and table names in the default schema. This feature is not enabled by default. Create a ticket with Qubole Support to enable this on the QDS account.
  • A user can rename an open notebook or a dashboard from its object name.

New Enhancements in Dashboards

  • Dashboards are generally available with this release. For more information, see the documentation.
  • A user can change the Dashboard theme.
  • A user can rename an open notebook or a dashboard from its UI header.
  • QDS provides Scheduler support on the Dashboards UI.

 

UI Enhancements

These are the UI enhancements in this release:

  • You can disable/enable the Zendesk ticket creation on an account basis.
  • The Getting Started Splash Screen has been added to the QDS UI.
  • The navigation to the main pages (the drop-down on the QDS UI main page) and the UI skin color of the top horizontal bar have changed in this release.
  • A new policy action show_token for a system/user role has been added to the Control Panel > Manage Roles which controls the visibility of the API Token column in the Control Panel > My Accounts page.
  • On the Control Panel > Account Settings page, the storage/compute settings are now combined under a single section called Access Settings, and the Access Mode has moved from Account Settings to Access Settings. This feature is not enabled by default. Create a ticket with Qubole Support to enable this on the QDS account.

 

 

New Alpha/Beta Features in QDS-on-AWS

Handling Spot Node Loss on Spark Clusters

Spot node loss is handled more gracefully on Spark versions 2.1.0, 2.1.1, and 2.2.0.

Instead of relying on the heartbeat to detect a node failure, Qubole proactively detects nodes undergoing Spot loss and stops scheduling tasks on the corresponding executors.

This feature is available for beta access. To get it enabled on your QDS account, create a ticket with Qubole Support.

Running Spark Streaming Job from the QDS UI and API

Running a Spark Streaming job from the Analyze UI or the Commands API for more than 36 hours is now supported. This feature is available for alpha access. To get it enabled on your QDS account, create a ticket with Qubole Support.

 

Qubole Introduces Deep Learning Clusters

Qubole introduces a Deep Learning cluster type.

Deep Learning clusters are mostly used to run Python code snippets using Deep Learning packages such as TensorFlow, MXNet, and Keras. Qubole supports distributed Deep Learning, which runs TensorFlow on Spark. In Deep Learning notebooks, Qubole uses the Spark interpreter group with the pyspark interpreter set as the default interpreter. Only GPU instances are supported on Deep Learning clusters. Currently, a Deep Learning cluster can start only in the us-east-1 AWS region, so configure the cluster only in that region.

This feature is available for beta access. To get it enabled on your QDS account, create a ticket with Qubole Support.

 

Qubole Supports Hadoop 2.8

Qubole supports Hadoop 2.8 on clusters with this release.

This feature is available for beta access. To get it enabled on your QDS account, create a ticket with Qubole Support.

List of Changes and Bug Fixes in QDS-on-AWS

AIRFLOW

Bug Fixes

Fix   AIR-56: The failing Hide/Show-paused DAGs button on the Airflow UI has been fixed.
Fix   EAM-541: The Airflow disk space issue is resolved. All the scheduler child process logs are moved to EBS, which will be cleared with a rotation policy of 2 days (configurable from airflow.cfg).


Enhancements

Change   AIR-57: Airflow 1.8.2 is the default version and it is reflected in the Airflow cluster UI.

AWS CLUSTER MANAGEMENT


New Features


New   ACM-1530: Spot Instance Requests will now have tags attached to them.
New   ACM-1592: On-Demand nodes’ tagging and create operations are combined into one atomic operation now.

New   ACM-1716: For an active heterogeneous cluster, the minimum and maximum number of nodes is shown on the UI.

New   ACM-1900: The Idle Cluster Timeout setting at the account level and cluster level can now be configured in hours and minutes, taking advantage of AWS per-second billing.

This feature is not enabled by default. Create a ticket with Qubole Support to enable this on the QDS account.
New   AD-325: Users can create read-only default cluster tags in the Control Panel > Account Settings page and SAML Superadmin page, which will be non-editable when a new cluster is created.


New    MW-1109: QDS supports defining default account-level cluster tags, which can be set on the Account Settings page of the Control Panel. Users can define user-level EC2 tags. This feature is not enabled by default on a QDS account. Create a ticket with Qubole Support to enable this feature on your account.
Values for user-level tags are captured when a user logs in using SAML. The mapping between the attribute name (in the SAML response) and the tag name must be defined in the SAML provider admin page.

If account-level default cluster tags and user-level EC2 tags exist, they are prepopulated when a cluster is created from the UI and are visible to the user creating the cluster.


Bug Fixes

Fix    ACM-1746: Tagging of volumes which used to fail in heterogeneous clusters works now.
Fix    ACM-1788: Fixed auto-scaling logs to display a warning only when needed.
Fix    ACM-1798: Deleted subnets are shown as an error condition on the UI and it will now be possible to update a cluster's configuration in such a situation.

Fix    ACM-1846: The issue in which a cluster with read permission denied was still visible to the user on the Clusters UI page has been resolved.
Fix   ACM-1849: Cluster usage report API will not be accessible for clusters without the read permission. This feature is not enabled by default. Create a ticket with Qubole Support to enable this on the QDS account.
Fix   ACM-1856: Enhancements have been done on the QDS platform to reduce the CPU utilization.

Fix    QBOL-6183: During the cleanup task, clusters that are inactive and due to be terminated are first identified. In the time between identifying and terminating a cluster, a new command may be submitted to it; previously, because the cluster was already labeled for termination, it was killed anyway.
With this fix, QDS checks whether any new session has been added to the cluster before terminating it; if so, the cluster is not terminated.
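
The re-check before termination can be sketched like this (an illustrative helper; the data structure and names are assumptions, not QDS internals):

```python
def terminate_if_still_idle(cluster, session_count_at_marking):
    """Re-check for new sessions just before terminating a cluster that
    was earlier marked inactive."""
    if len(cluster["sessions"]) > session_count_at_marking:
        return False  # a new session arrived after marking; keep the cluster
    cluster["terminated"] = True  # still idle; safe to terminate
    return True
```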

DEEP LEARNING

Enhancements

Change    DS-24: pyspark is the default interpreter for Deep Learning notebooks, as opposed to Scala for the Spark interpreter group.

Change   ZEP-1326: New cluster type Deep Learning has been added to the QDS UI.

HADOOP 2

Bug Fixes

Fix    HADTWO-358: This fix binds the Hadoop daemons, such as the ResourceManager, NameNode, Timeline Server, and Job History Server, to 0.0.0.0. Earlier, these daemons were bound to the master DNS, which caused a problem when a customer attached an Elastic IP address to the private IP address.

Fix    HADTWO-1016: The following parameters can be set for server-side encryption along with the other parameters:

  • s3ServerSideEncryption: Enables encryption of data at the object level as S3 writes it to disk.
  • s3SSEAlgorithm: Specifies the encryption algorithm. If you do not specify it but s3ServerSideEncryption is enabled, the AES256 algorithm is used by default. Valid values are AES256, SSE-KMS, and SSE-C.
  • encryptionKey: If SSE-KMS or SSE-C is specified as the algorithm, use this parameter to specify the key with which the data is encrypted. If the algorithm is SSE-KMS, the key is not mandatory because AWS KMS is used. If the algorithm is SSE-C, you must specify the key or the job fails.

Fix    HADTWO-1134: The issue in which S3AInputStream closed the HTTP connection instead of releasing it back to the connection pool has been fixed.

Fix    HADTWO-1166: While removing replicas from nodes when their number was greater than the maximum replication factor, a replica could be removed from an On-Demand node without checking whether it was the only On-Demand node containing the replica. This issue has been resolved.
Fix    HADTWO-1197: An intermittent failure in rendering Application UI log files has been resolved.

Enhancements

Change    HADTWO-1091: The S3A filesystem can now automatically detect the endpoint of an S3 bucket and users do not have to explicitly set endpoint in the configuration. This feature is not enabled by default. Create a ticket with Qubole Support to enable this on the QDS account.

After this feature is enabled, the S3A filesystem does not honor endpoint-specific properties such as fs.s3a.endpoint, qubole.s3.standard.endpoint, and fs.s3.awsBucketToRegionMapping.
Change    HADTWO-1189: QDS now supports setting configurable time-to-live (TTL) in JVMs (launched on all cluster types except Airflow and Presto clusters) for DNS lookups. Recommended TTL is 60 seconds. This feature is not enabled by default. Create a ticket with Qubole Support to enable this on the QDS account.
Change    HADTWO-1190: Fix to make autoscaling more robust in case of network failures when connecting to the NameNode.

Change    HADTWO-1214: Occasional stream closed errors when loading jars have been fixed.

 

HIVE

Bug Fixes

Fix    HIVE-2617: HiveServer2 (HS2) scripts have been changed to use Java 8 and G1GC in Hive 2.1. Create a ticket with Qubole Support to enable this on your account.

Fix    HIVE-2629: The issue in which the workflow was stuck for a long time after each query completed has been resolved.
Fix    HIVE-2666: The issue in which invalid windowing functions were not failing with an invalid function error has been resolved.

 

Enhancements

Change    AD-316: The QDS UI supports adding a personal Hive bootstrap in Control Panel > Hive Bootstrap.
Change    AN-447: Hive tables to which users do not have read access are no longer visible in the QDS UI, either in query results or in the Hive metastore view on the Analyze and Explore pages.

Change    HIVE-1931: An HS2 Thrift port can be configured using the hs2_thrift_port parameter on a Hadoop 2 (Hive) cluster. It is supported only through the API.
Change    HIVE-2605: Metrics are added for the number of queries executed for each execution engine through HiveServer2 in Hive 2.1. These metrics are available in HS2 UI and Ganglia.
Change    HIVE-2608: Metrics are added for the number of Hive operations waiting for compilation through HiveServer2 in Hive 2.1. These metrics are available in HS2 UI and Ganglia.
Change    HIVE-2615: Metrics are added for the number of open/closed/abandoned sessions in HS2 in Hive 2.1. These metrics are available in the HS2 UI and Ganglia.

Change    HIVE-2625: HiveServer2 is supported on small cluster instance types.
Change   HIVE-2630: Metrics are added for tracking the execution lifecycle times of queries fired through HiveServer2 in Hive 2.1. These metrics are available in the HS2 UI and Ganglia.

Change    HIVE-2676: HiveServer2 Metrics are now seen as part of the Datadog monitoring service.


PRESTO

New Features

New    PRES-893: Qubole Presto supports user overridden IAM roles. This feature is not enabled by default. Create a ticket with Qubole Support to enable this on the QDS account.

New    PRES-1103: The Presto Datadog integration has been revamped, and the following metrics are now collected and reported to Datadog from Presto:

  • Failed Query count
  • Finished Query count
  • Running Query count
  • Average planning time across queries
  • Maximum GC time across all nodes in the cluster
  • Maximum GC count across all nodes in the cluster
  • Number of worker nodes seen by Presto
  • Service Check of Presto Service
  • Read request rate in cluster
  • Number of failures in requests from master to slave nodes

 

 

Bug Fixes

Fix    PRES-1235: Fixed the predicate pushdown on integer columns in a Parquet format.
Fix    PRES-1302: Camel-cased field names in Parquet data are now supported.

Fix   PRES-1426: Presto 0.119 will not be available from this QDS release. It was deprecated and support was stopped earlier for this version. All existing clusters with Presto 0.119 will start with Presto 0.157 on a cluster restart.

Fix  PRES-1462: The Presto 0.180 UI now displays correct values for Rows/Sec, Bytes/Sec, and Parallelism instead of NaNs.

Fix  UI-4842: QDS notifies with warnings while overriding configuration through Presto overrides.

 

Enhancements


Change    PRES-735: The default value of query.max-memory has been changed to 100TB. If you want to limit the maximum memory a query can use, you can override this with a lower value, either by setting it as a Presto override or by setting the query_max_memory session property.

Change    PRES-1356: A query.max-execution-time configuration property and a query_max_execution_time session property have been added. These properties enforce a time limit on query execution. Unlike the existing query.max-run-time property, which enforces a time limit starting from query creation, these properties only consider the time spent in the query execution phase.
Change    PRES-1366: Presto version 0.180 now receives the AWS Spot Termination Notification, and the node that is about to be taken away is removed from the list of active nodes to prevent more tasks from being scheduled on it.
Change    PRES-1371: The Postgres jar version has been downgraded back to 9.3-1102-jdbc41 in 0.180 to maintain compatibility with Redshift.


QDS

New Features

New    EAM-339: You can disable/enable the Zendesk ticket creation on an account basis.
New    GROW-3: The Getting Started Splash Screen has been added to the QDS UI.

New    AD-99: A new policy action show_token for a system/user role has been added in the Control Panel > Manage Roles which controls the visibility of the API Token column in the Control Panel > My Accounts page.
New    AD-292: For a new user, who signs up on QDS, the default location is now editable by default.
 

Bug Fixes

Fix    AD-75: On the Control Panel > Account Settings page, the storage/compute settings are now combined under a single section called Access Settings, and the Access Mode has moved from Account Settings to Access Settings. This feature is not enabled by default. Create a ticket with Qubole Support to enable this on the QDS account.
Fix    EAM-449: The SAML Login issue that was faced by a few users has been resolved.
Fix    EAM-457: The remove PUT request has been deprecated and replaced with a DELETE request in the Command Templates API.
Example: PUT /api/v1.2/command_templates/<id>/remove is removed.

Fix    MW-1327: A role can be assigned to users through their usernames. User email IDs can be used to add/delete users to/from a group through the Group API.

Enhancements

Change   AD-285: These additional metrics will be used to calculate CPU usage in all command reports:

  • org.apache.tez.common.counters.DAGCounter.AM_CPU_MILLISECONDS
  • Presto Counters.CPU_TIME

Change   EAM-456: In the command history API, the per_page parameter has been modified. Its value is an integer specifying the number of commands to retrieve per page, with a maximum value of 100. You can retrieve the next 100 commands based on the last command ID for a given QDS account.
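
Paging by last command ID can be sketched as follows (a hypothetical client-side helper; the real API applies this logic server-side):

```python
def next_page(commands, last_command_id=None, per_page=100):
    """Return up to per_page commands older than last_command_id.

    commands: list of dicts with an integer 'id', sorted newest-first,
    as command history is typically returned.
    """
    per_page = min(per_page, 100)  # the API caps per_page at 100
    if last_command_id is not None:
        commands = [c for c in commands if c["id"] < last_command_id]
    return commands[:per_page]
```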
Change   MW-1168: On the Analyze UI, you can compose a query or specify a query path for Redshift type of query.
Change   UI-5383: The navigation to the main pages (the drop-down on the QDS UI main page) and the UI skin color of the top horizontal bar have changed in this release.

 

SCHEDULER

New Features

New    EAM-533: ACL permissions can now be set for a schedule (Read, Update, Delete, Clone, or Manage permission). The options a user sees depend on the permissions that are allowed or denied. The ACLs can be set through the Scheduler UI and the Object Policy API.

Enhancements

Change    EAM-286: A Preview Query API has been added as a new Scheduler API call. After entering the parameters for a command, you can preview it by using the endpoint https://<QDS env>/api/v1.2/scheduler/preview_macro, where <QDS env> is a QDS-on-AWS endpoint. For more information, see Supported Qubole Endpoints.
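
Constructing the preview request might look like this (only the endpoint path comes from these notes; the example host and payload fields are assumptions):

```python
def build_preview_request(qds_env, command_text, macros=None):
    """Build the URL and payload for the scheduler preview_macro endpoint.

    qds_env: the QDS-on-AWS environment host, e.g. "api.qubole.com"
    (a hypothetical example host).
    """
    url = "https://{}/api/v1.2/scheduler/preview_macro".format(qds_env)
    payload = {"command": command_text, "macros": macros or []}
    return url, payload
```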
Change    EAM-376: A user can set a command timeout while creating a schedule using the API or UI. It must be less than 36 hours.

Change    EAM-466: Now, a user must have Scheduler and Command resource permissions (Roles > Policy Actions) to see Schedules and Schedule Instances on the UI.
Change    SCHED-142: The issue in which the scheduler notifications used to have 10-12 minutes delivery time has been resolved. Now, the notification should be delivered within 5 minutes.

SPARK

New Features

New    SPAR-2081: Performance optimizations have been made in the INSERT INTO/OVERWRITE flow for Hive tables in Spark. Spark now writes files directly to the final destination instead of writing to a temporary staging directory, which improves performance.

 

Backported Fixes/Changes from Open Source

Fix    SPAR-2177: Backported from open source JIRA SPARK-16628.
Before Hive 2.0, the ORC file schema could have invalid column names such as _col1 and _col2. This is a well-known limitation, and there are several Apache Spark issues with spark.sql.hive.convertMetastoreOrc=true. This fix ignores the ORC file schema and uses the Spark schema instead.

Bug Fixes

Fix    SPAR-1487: SparkSQL used to throw errors and bail out if a Hive table’s location was not the same as the Hive staging directory automatically chosen by QDS. This has been fixed: QDS now automatically switches to a staging directory in the same object store bucket as the table’s location.

Fix    SPAR-1640: Spark Hive Authorization: The issue with trying to drop a database/table that already exists has been resolved.
Fix    SPAR-1656: Spark Hive Authorization: With this fix, Spark will now honor authorization while creating and selecting from views. A user will be allowed to create views only from the source tables that the user has select privileges on. Similarly, a user will be allowed to select from only those views that he/she has select privileges on.
Fix    SPAR-1658: Spark Hive Authorization: With this fix, Spark will now honor authorization while altering views. A user will be able to alter a view only if he/she has ownership privileges on that view.
Fix    SPAR-1719: Customers who have added hive.security.authorization.enabled=true in their Spark cluster’s Hadoop override configurations will not face any issues while running Spark commands now.

Fix    SPAR-1930: Fixed the issue where a Spark command failed when a semicolon is present in query comments.

Fix    SPAR-1996: The spark-avro version in Qubole Spark 2.2.0 has been updated to 4.0.0.

Fix    SPAR-2011: The SparkEventLog listener writes events generated by Spark to an event log file, which is used later by the Spark History Server. These writes are synchronous and are made to a file on HDFS. When too many events are generated by a Spark application, or HDFS becomes slow, events start to get dropped because the event queue fills up. With this fix, events are written to a local file, which can sustain a much higher event rate, and are synced to HDFS asynchronously.
Fix    SPAR-2070: The issue in which the Spark History Server was failing to list running Spark applications has been resolved.

Fix    SPAR-2078: Proper formatting is applied to the Spark History Server response when the requested application does not exist.

Fix    SPAR-2132: Fixed the problem with Spark's optimized S3 listing when it was dealing with empty folders.

 

 

Enhancements

Change    SPAR-1934: When a new Spark cluster is created in the QDS UI, the Spark version chosen by default used to be 1.6.1; it is now 2.1-latest.
Change    SPAR-2001 and SPAR-2154: Spot node loss is handled more gracefully on Spark versions 2.1.0, 2.1.1, and 2.2.0.

Instead of relying on the heartbeat to detect a node failure, Qubole proactively detects nodes undergoing Spot loss and stops scheduling tasks on the corresponding executors.

This feature is available for beta access. Create a ticket with Qubole Support to enable this on your account.
Change    SPAR-1796: The 36-hour timeout limitation when running streaming applications from the Analyze command editor has been removed.

Change    SPAR-2027: Spark metrics can be seen on the Datadog and Ganglia Monitoring service.

Change    SPAR-2152: There is a change affecting users who use QuboleDBTap to access data stores in Spark.

The QuboleDBTap class and its companion object have been copied from com.qubole.QuboleDBTap to org.apache.spark.sql.qubole.QuboleDBTap for Spark 2.0.0 and later versions.

com.qubole.QuboleDBTap will still be maintained for backward compatibility with all existing versions of Spark. However, Qubole strongly recommends migrating from com.qubole.QuboleDBTap to org.apache.spark.sql.qubole.QuboleDBTap, because support for com.qubole.QuboleDBTap will be removed starting with Spark 2.3.0. QuboleDBTap and its methods can then only be used by importing org.apache.spark.sql.qubole.QuboleDBTap.

STREAMX

Enhancements

Change    SX-60: Support has been added to establish a secure connection between StreamX and Kafka clusters. A StreamX cluster can now connect to your SSL-enabled Kafka clusters.

TEZ

Bug Fixes

Fix    QTEZ-211: In Tez mode, a dynamic partitioning query with UNION ALL failed to insert into a partitioned table at moveTask with an Invalid partition key and values error, due to an additional subdirectory getting created at the temp location. This issue has been fixed by ensuring that the additional subdirectory is not created.

 

ZEPPELIN/NOTEBOOKS

New Features

New    MW-696 and ZEP-1488: QDS provides Scheduler support on the Dashboards UI. Notebook Dashboards can be refreshed periodically by setting an interval and selecting the refresh-periodically option.

New    ZEP-911: QDS maintains a notebook’s history when it is run from the Analyze and Scheduler UI.
New    ZEP-1434 and ZEP-1297: A user can change the Dashboard theme.
New    ZEP-1562: QDS now supports auto-complete for SQL. Users can see suggestions for Hive keywords, functions and table names in the default schema. This feature is not enabled by default. Create a ticket with Qubole Support to enable this on the QDS account.
New    ZEP-1608: A new option to configure Notebook Dashboards has been added.

 

Bug Fixes

Fix    ZEP-39: Detailed progress of Spark paragraphs run in notebooks is shown when zeppelin.enable_detailed_spark_progress is set to true. This feature is enabled by default.

Fix    ZEP-581: Fixed the issue in which comments and code formatting were not highlighted for paragraphs in pyspark.
Fix    ZEP-601: Fixed the issue in which pyspark syntax highlighting was not working (it appeared to use Scala instead).

Fix    ZEP-929: You can now use z.showplot(plotObj) to directly plot matplotlib graphs instead of defining the show function explicitly.
Fix    ZEP-979: The issue in which the Spark cluster displayed an error, Pyspark is not responding when pyspark was run for the first time after cluster start, has been resolved.

Fix    ZEP-1237: A user can upload or download a file to/from an S3 location through the Analyze and Notebooks pages. This feature provides:
  • Permission to download and upload an S3 file, controlled through the granular permission model
  • The ability to hide the object store widget for certain users

Fix    ZEP-1242: A user can delete paragraphs in a notebook until just one paragraph remains. When there is just one paragraph in the notebook, the Remove button is unavailable.
Fix    ZEP-1464: Fixed the corruption of interpreter JSON due to forceful restarts of the Zeppelin server at the time of initialization.

Fix    ZEP-1524: Fixed exceptions in Zeppelin logs caused by Zeppelin being unable to handle events from the remote interpreter when a Spark application is started through Zeppelin. Although the issue had no functional impact, the logs were flooded with exceptions in this scenario.
Fix    ZEP-1626: The issue in which, in an edge case, a user's notebook contents could revert to the contents present on the older cluster after the notebook was moved to a different cluster, has been resolved.

Fix    ZEP-1643: Fixed the issue where a SQL table was not rendering properly when a column had newline character in it. This fix is not enabled by default. Create a ticket with Qubole Support to enable this on the QDS account.
Fix    ZEP-1715: Invalid strings in a paragraph are now highlighted in a more readable color.
Fix    ZEP-1718: The issue in which results downloaded in CSV and TSV formats differed from the notebook graph data has been fixed.
Fix    ZEP-1895: Fixed a failure that occurred when a notebook containing an empty paragraph was run from the Notebooks UI or through the Notebook command with pyspark selected as the default interpreter.

 

Enhancements

Change    ZEP-405: You can use the TAB key to autocomplete non-markdown paragraphs in a notebook.

Change    ZEP-1181: A user can rename an open notebook or dashboard by editing its name.
Change    ZEP-1297: Improved Dashboard mode visuals.
Change    ZEP-1406: A new package environment gets created automatically when a new Spark cluster is created.
Change    ZEP-1569: The user can now see the list of all pre-installed and user packages from the new Environments UI.
Change    ZEP-1607: QDS displays the opened Dashboard name in the browser tab.

Change    ZEP-1658: While scheduling a dashboard, the user can select only the available frequency values, which are based on the maximum number of scheduler instances allowed per day.
Change    ZEP-1670: Cloning a Spark cluster clones the default package environment attached to the parent cluster.
Change    ZEP-1681: The tooltip about viewing other interpreters on the Interpreter page has been enhanced.
Change    ZEP-1686: If the Spark context is still coming up when a user cancels a paragraph in a notebook, the user is notified with the message: Spark Context coming up. Can't cancel now. This enhancement is not enabled by default. Create a ticket with Qubole Support to enable this on the QDS account.
Change    ZEP-1700: A user can view notebook/dashboard information by hovering over the info icon on the side panel.
Change    ZEP-1714: The Notebook UI’s right-sidebar is resizable now.
Change    ZEP-1756: Visual enhancements have been done on Notebooks and Dashboards UI.
Change    ZEP-1768: Creating a Spark cluster creates a default environment. The parameters for the environment can be specified under Advanced Configuration in the cluster’s UI page.
Change    ZEP-1772: Simplified Spark interpreter properties by removing unnecessary default interpreter properties. This feature is reflected only on new Spark interpreters and is enabled by default.
Change    ZEP-1823: Dashboards are generally available with this release.


List of Hotfixes in QDS-on-AWS Since 24th October 2017


Fix    AD-263: The Same as Default Compute sub-option is selected by default on a cluster's EC2 Settings UI.

Fix    AN-572: Fixed an issue that appeared while filtering cluster labels. They are now matched exactly.

Fix    EAM-628: The issue in which a canceled Spark job kept running (in one case for 34 hours, until the cluster was brought down) has been resolved.

Fix    AN-560: The JDBC driver charset error has been resolved.

Fix    HADTWO-839: Upgraded the AWS SDK version used by the S3A file system to 1.10.77 from 1.10.6.
Fix    HADTWO-1088: Retry object listing in the s3a file system if it fails due to an XML parsing error.
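The retry pattern described in HADTWO-1088 can be pictured as follows. This is a hypothetical sketch; the function and parameter names are illustrative, not Qubole's actual implementation.

```python
def list_with_retries(list_objects, max_retries=10):
    """Retry an object listing up to max_retries times (hypothetical sketch).

    'list_objects' is any callable that performs the S3 listing and may
    raise, e.g., on an XML parsing error from a truncated response.
    """
    last_error = None
    for _ in range(max_retries):
        try:
            return list_objects()
        except Exception as err:
            last_error = err  # remember the failure and retry
    raise last_error  # all retries exhausted
```

Transient listing failures succeed on a later attempt; only after all retries are exhausted does the caller see the error.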

New   HADTWO-1112: As part of QDS per-second scaling enhancements, a node cool-down period has been added for downscaling in YARN-based clusters. This feature is available in beta. Create a ticket with Qubole Support to enable it on your account.

Fix    HADTWO-1152: In YARN autoscaling, HDFS decommissioning used to start on all nodes in the YARN Graceful Shutdown (GS) state. As a result, more nodes than required were sometimes downscaled.
With this fix, the number of nodes that can be decommissioned from HDFS is calculated first, and HDFS decommissioning starts only on that many nodes in the YARN GS state.
Fix    HADTWO-1153: This fix corrects the calculation of the number of nodes that can be decommissioned from HDFS while downscaling in YARN autoscaling.
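As a rough illustration of the calculation HADTWO-1152 and HADTWO-1153 describe, the number of removable nodes is bounded by how many nodes HDFS still needs to hold its replicated data. All names and the simple capacity model below are assumptions for illustration, not Qubole's actual logic.

```python
import math

def max_decommissionable(total_nodes, gs_nodes, hdfs_bytes_used,
                         bytes_per_node, replication=3):
    """Hypothetical sketch: how many Graceful-Shutdown (GS) nodes can be
    decommissioned from HDFS without losing room for replicated blocks."""
    # Nodes HDFS still needs to store all block replicas.
    nodes_needed = math.ceil(hdfs_bytes_used * replication / bytes_per_node)
    # Never remove more nodes than the cluster can spare...
    removable = max(0, total_nodes - nodes_needed)
    # ...and only nodes already in the YARN GS state are candidates.
    return min(removable, gs_nodes)
```

Before the fix, decommissioning started on every GS node, which could exceed the `removable` bound above.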

Change    HADTWO-1144: Ported fix for MAPREDUCE-6154.
Fix    HADTWO-1186: The issue in which a Spark job failed because of NodeManager starting twice, has been resolved.
Change    HADTWO-1212: Ported HADOOP-13164 to avoid object listing during output stream close in the s3a file system. In addition, the maximum number of retries in object listing is now configurable; the default is 10.

Fix    PRES-1300: The Parquet predicate pushdown logic has been fixed to honor the data type defined in the Hive schema.

Fix    PRES-1462: The Presto 0.180 UI now shows correct values for Rows/Sec, Bytes/Sec, and Parallelism, instead of NaNs.
Fix    PRES-1471: Changed the v1/node API of the Presto master to skip reporting exceptions that cannot be serialized.
Fix    PRES-1407: The Parquet performance deterioration between Presto versions 0.157 and 0.180 has been resolved.
Fix    SPAR-1927: The Spark UI URL issue that led to a 500 error has been resolved.
Change    SPAR-2038: Node downscaling depends on the Spark shuffle service not holding any shuffle data for active Spark applications. This check was broken, so nodes remained alive even when no shuffle data was actually present, staying up longer than necessary. The problem affected only 2.x.y versions of Spark and has been fixed in all affected versions.
New   ZEP-1017: Scrolling behavior has been added for paragraph output.
Fix    ZEP-1328: Lost interpreter and customizations have been restored.
Fix    ZEP-1465: Fixed a corner case in which canceling a paragraph in a notebook could cause the subsequent run of that paragraph to be aborted automatically.
Change    ZEP-1685: The interpreter.json that is downloaded from the object store is validated during a cluster start to catch incomplete file downloads.

 

List of Hotfixes in QDS-on-AWS After 23rd January 2018

Release Version: 50.82.0

Change    PRES-1596: QDS supports Presto 0.193 as the open-beta version on Presto clusters.

Release Version: 50.81.0

Fix   AN-874: The issue in which tables were not displayed on the Analyze > Tables page when Hive authorization was enabled has been resolved.

Release Version: 50.73.0

Fix   INFRA-778: The issue in which live logs were not displayed for running queries has been resolved.

Release Version: 50.67.0

Fix   INFRA-550: The issue in which a completed Presto query still showed running as its status has been resolved.
Fix   INFRA-558: The issue in which a Spark Command run on the Analyze UI failed to fetch column names has been resolved.

Release Version: 50.66.0

Fix   INFRA-323: The issue in which a Tez job took longer than expected to execute has been resolved.

Release Version: 50.60.0

Fix    EAM-1004: The issue in which a command/query was getting abruptly terminated has been resolved.

Release Version: 50.45.0

Change    EAM-911: The Ruby SAML version has been upgraded to 1.7.2.

Release Version: 50.39.0

New   AD-655: Custom EC2 tags will be visible in the cluster usage report. 

Release Version: 50.38.0

Change    EAM-857: Qubole Scheduler now enforces a rerun limit per scheduled job; it applies to all scheduled jobs and defaults to 20. When reruns of a scheduled job exceed the limit, you get this error message:

A maximum of 20 reruns are allowed for scheduled job in your account: #<account_number>

You can increase the rerun limit for the QDS account by creating a ticket with Qubole Support. The new value applies to all jobs in the given QDS account.
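The rerun limit check can be pictured roughly as below. All names are hypothetical; this is an illustrative sketch, not Qubole's implementation.

```python
DEFAULT_RERUN_LIMIT = 20  # default; can be raised per account via Qubole Support

def record_rerun(job_id, rerun_counts, account_number,
                 limit=DEFAULT_RERUN_LIMIT):
    """Hypothetical sketch of the per-job rerun limit described above."""
    if rerun_counts.get(job_id, 0) >= limit:
        raise RuntimeError(
            "A maximum of %d reruns are allowed for scheduled job "
            "in your account: #%s" % (limit, account_number))
    rerun_counts[job_id] = rerun_counts.get(job_id, 0) + 1
    return rerun_counts[job_id]
```

The limit is checked per scheduled job, while the configured value itself applies account-wide.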

Release Version: 50.29.0

Fix    HADTWO-1341: If a client running outside the master failed on its first attempt to contact a daemon running on the master, all later retries also failed because the client kept using an incorrect address resolution. This issue has been fixed by no longer always performing the resolution on the client side.

Release Version: 50.27.0

Change    RUB-40: The RubiX checkbox on the Hadoop 2 (Hive) cluster UI has been removed. 

Release Version: 50.22.0

Fix    EAM-567: The issue in which disabling an account was unsuccessful, has been resolved. 

Fix    HADTWO-1302: The issue in which the ApplicationMaster tried to launch containers on downscaled nodes has been resolved.

The issue occurred when two cluster instances came up with the same private IP address.

Release Version: 50.14.7

Fix    PRES-1572: The issue in which Presto queries spent a long time in the planning stage on a Presto 0.180 cluster and were not submitted to the cluster has been resolved.
Fix    PRES-1573: The errors in fetching AWS S3 credentials have been resolved. 

Release Version: 50.12.0

Fix    AN-755: Fixed an issue where users assigned the public Hive role could not see tables in the Explore and Analyze UI.

 

 

 

 
