Qubole Release Notes for Upcoming QDS-on-AWS Version R50

Qubole provides Qubole Data Service (QDS) on Amazon Web Services (AWS), Microsoft Azure, and Oracle Cloud Infrastructure (OCI).

For details of what has changed in this version of QDS-on-AWS, see:

  • What is New in QDS-on-AWS
  • New Alpha/Beta Features in QDS-on-AWS
  • List of Changes and Bug Fixes in QDS-on-AWS
Note: This is a proposed set of release notes and may change based on testing or other developments.

 

What is New in QDS-on-AWS

New in AWS Cluster Management

These are the new enhancements in AWS clusters in this QDS release:

  • Spot Instance Requests now have tags attached to them.
  • The tagging and create operations for On-Demand nodes are now combined into a single atomic operation.
  • As an option, QDS allows a user to use a persistent security group and apply that SG to all clusters. This prevents Qubole from creating an SG for each cluster. This feature can be used only with On-Demand nodes.
    This feature is not enabled by default. Create a ticket with Qubole Support to enable this on your account.
  • QDS supports defining default account-level cluster tags, which can be set on the Account Settings page of the Control Panel. Users can define user-level EC2 tags. This feature is not enabled by default on a QDS account. Create a ticket with Qubole Support to enable this feature on your account.
    Values for user-level tags are captured when a user logs in using SAML. The mapping between the attribute name (in the SAML response) and the tag name must be defined in the SAML provider admin page.


    If account-level default cluster tags and user-level EC2 tags exist, they are pre-populated when a cluster is created from the UI.
  • The Idle Cluster Timeout setting at the account level and cluster level can be configured in hours and minutes. This supports AWS per-second billing.

    This feature is not enabled by default. Create a ticket with Qubole Support to enable this on the QDS account.

  • QDS supports AWS G3 and P3 instance families.

  • Users are not allowed to run commands from the Analyze page on a cluster that they do not have permission to start. This feature is not enabled by default. Create a ticket with Qubole Support to enable this on the QDS account.

   

 

New Parameters for Server-side Encryption in Hadoop DistCp

 

The following parameters can be set for server-side encryption in Hadoop DistCp along with the other parameters:

  • s3ServerSideEncryption: Enables encryption of data at the object level as S3 writes it to disk.
  • s3SSEAlgorithm: Specifies the encryption algorithm. If it is not specified but s3ServerSideEncryption is enabled, the AES256 algorithm is used by default. Valid values are AES256, SSE-KMS, and SSE-C.
  • encryptionKey: Specifies the key used to encrypt the data when the algorithm is SSE-KMS or SSE-C. With SSE-KMS, the key is optional because AWS KMS is used when no key is specified. With SSE-C, the key is mandatory; if it is not specified, the job fails.
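As an illustration, these parameters might be passed as Hadoop -D properties on a DistCp invocation. This is a sketch only: the exact flag syntax, key placeholder, and paths below are assumptions, not taken from these notes.

```shell
# Hypothetical invocation -- parameter names come from the list above;
# passing them as -D properties and the source/destination paths are assumptions.
hadoop distcp \
  -Ds3ServerSideEncryption=true \
  -Ds3SSEAlgorithm=SSE-KMS \
  -DencryptionKey=<your-kms-key-id> \
  hdfs:///user/hive/warehouse/sales s3://example-bucket/backups/sales/
```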

 

New Enhancements in Hive

These are the Hive enhancements in this release:

  • Metrics are added for:
    • The number of queries executed for each execution engine through HiveServer2 in Hive 2.1
    • The number of Hive operations waiting for compilation through HiveServer2 in Hive 2.1
    • The number of open/closed/abandoned sessions through HiveServer2 in Hive 2.1
    • The query execution lifecycle times for queries fired through HiveServer2 in Hive 2.1

         These metrics are available in the HS2 UI and Ganglia.

  • The QDS UI supports adding a personal Hive bootstrap in Control Panel > Hive Bootstrap.
  • Hive tables to which users do not have read access are no longer visible in the QDS UI, whether in query results or in the Hive metastore view on the Analyze and Explore pages.
  • HiveServer2 metrics are now available in the Datadog monitoring service. They remain visible in Ganglia monitoring and, from this release, are also reported to Datadog.

 

New Enhancements in Presto

These are the Presto enhancements in this release:

  • The Presto Datadog integration has been revamped. The following metrics are now collected and reported to Datadog from Presto:
    • Failed Query count
    • Finished Query count
    • Running Query count
    • Average planning time across queries
    • Maximum GC time across all nodes in the cluster
    • Maximum GC count across all nodes in the cluster
    • Number of worker nodes seen by Presto
    • Service Check of Presto Service
    • Read request rate in cluster
    • Number of failures in requests from master to slave nodes
  • The query.max-memory default value has been changed to 100TB. If you want to limit the maximum memory a query can use, you can override this to a lower value, either by setting it as a Presto override or by setting the query_max_memory session property.
  • Qubole Presto supports user overridden IAM roles. This feature is not enabled by default. Create a ticket with Qubole Support to enable this on the QDS account.

  • A query.max-execution-time configuration property and a query_max_execution_time session property have been added. These properties enforce a time limit on query execution time. Unlike the existing query.max-run-time property, which enforces a time limit starting from query creation, these properties only consider the time spent in the query execution phase.

  • The Postgres jar version has been downgraded back to 9.3-1102-jdbc41 in 0.180 to maintain compatibility with Redshift.
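For example, the memory and execution-time limits described above can be lowered per session. This sketch uses the standard presto-cli --session flag; the values and the query itself are illustrative assumptions.

```shell
# Hypothetical presto-cli call -- the session property names come from the
# notes above; the limit values and the table queried are assumptions.
presto-cli \
  --session query_max_memory=50GB \
  --session query_max_execution_time=30m \
  --execute "SELECT COUNT(*) FROM orders"
```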

New Enhancements in Qubole Scheduler

These are the Qubole Scheduler enhancements in this release:

  • ACL permissions can now be set for a schedule (Read, Update, Delete, Clone, or Manage). Users see only the options that their permissions allow. The ACLs can be set through the Scheduler UI and the Object Policy API.

A user must now have Scheduler and Command resource permissions (Control Panel > Manage Roles > Policy Actions) to see Schedules and Schedule Instances on the UI.

New Enhancements in Spark

These are the Qubole Spark enhancements in this release:

  • The default Spark version in the Spark cluster UI drop-down list has changed to 2.1-latest. Select the version carefully when creating new Spark clusters.
  • The 36-hour timeout for Spark Streaming jobs run from the Analyze page has been removed. This support is available as an alpha feature.
  • Performance optimizations in the INSERT INTO/OVERWRITE flow for Hive tables in Spark: Spark now writes files directly to the final destination instead of writing to a temporary staging directory first, which improves performance.

  • There is a change for users who use QuboleDBTap to access data stores in Spark.

    The QuboleDBTap class and its companion object have been copied from com.qubole.QuboleDBTap to org.apache.spark.sql.qubole.QuboleDBTap for Spark 2.0.0 and later versions.

    com.qubole.QuboleDBTap will still be maintained for backward compatibility with all existing versions of Spark. However, Qubole strongly recommends migrating from com.qubole.QuboleDBTap to org.apache.spark.sql.qubole.QuboleDBTap, as support for com.qubole.QuboleDBTap will be removed starting with Spark 2.3.0. QuboleDBTap and its methods can then be used only by importing org.apache.spark.sql.qubole.QuboleDBTap.

 

New Enhancements in Notebooks

These are the Notebooks' enhancements in this release:

  • QDS provides Scheduler support on the Dashboards UI. Dashboards can be refreshed automatically by setting an interval and selecting the periodic-refresh option.
  • A new option to configure Notebook Dashboards has been added.
  • You can use the TAB key to autocomplete non-markdown paragraphs in a notebook.
  • QDS displays the opened Dashboard name in the browser tab.
  • Simplified Spark interpreter properties, with unnecessary default properties removed, are enabled by default. This change applies only to new Spark interpreters.
  • QDS maintains a notebook’s history when it is run from the Analyze and Scheduler UI.
  • A user can upload or download a file to or from an S3 location through the Analyze and Notebooks pages. This feature includes:
    • Permission to download and upload S3 files, controlled through the granular permission model
    • The ability to hide the object store widget for certain users
  • QDS now supports autocomplete for SQL. Users can see suggestions for Hive keywords, functions and table names in the default schema. This feature is not enabled by default. Create a ticket with Qubole Support to enable this on the QDS account.
  • A user can rename an open notebook or a dashboard from its object name.

New Enhancements in Dashboards

  • Dashboards are generally available with this release.
  • A user can change the Dashboard theme.
  • A user can rename an open notebook or a dashboard from its UI header.
  • QDS provides Scheduler support on the Dashboards UI.

 

UI Enhancements

These are the UI enhancements in this release:

  • You can disable/enable the Zendesk ticket creation on an account basis.
  • The Getting Started Splash Screen has been added on the QDS UI.
  • The main-page navigation drop-down of the QDS UI, along with the UI skin color of the top horizontal bar, has changed in this release.
  • A new policy action, show_token, has been added for a system/user role in Control Panel > Manage Roles; it controls the visibility of the API Token column on the Control Panel > My Accounts page.
  • On the Control Panel > Account Settings page, the storage/compute and settings sections are combined under a single section called Access Settings, and Access Mode has moved from Account Settings to Access Settings. This feature is not enabled by default. Create a ticket with Qubole Support to enable this on the QDS account.

 

 

New Alpha/Beta Features in QDS-on-AWS

Handling Spot Node Loss on Spark Clusters

Spot Node loss is handled more gracefully on Spark versions 2.1.0, 2.1.1, and 2.2.0.

Instead of relying on the heartbeat to detect a node failure, Qubole proactively detects nodes undergoing Spot loss and stops scheduling tasks on the corresponding executors.

This feature is available in beta. To get it enabled on your QDS account, create a ticket with Qubole Support.

Running Spark Streaming Job from the QDS UI and API

Running a Spark Streaming job from the Analyze UI or Commands API for more than 36 hours is now supported. This feature is available in alpha. To get it enabled on your QDS account, create a ticket with Qubole Support.

 

Qubole Introduces Deep Learning Clusters

Qubole introduces a Deep Learning cluster type.

Deep Learning clusters are mostly used to run Python code snippets using Deep Learning packages such as TensorFlow, MXNet, and Keras. Qubole supports distributed Deep Learning, which runs TensorFlow on Spark. In Deep Learning notebooks, Qubole uses the Spark interpreter group with the pyspark interpreter set as the default interpreter. Only GPU instances are supported on Deep Learning clusters.

This feature is available in beta. To get it enabled on your QDS account, create a ticket with Qubole Support.

 

Qubole Supports Hadoop 2.8

Qubole supports Hadoop 2.8 on clusters with this release.

This feature is available in beta. To get it enabled on your QDS account, create a ticket with Qubole Support.

List of Changes and Bug Fixes in QDS-on-AWS

AWS CLUSTER MANAGEMENT


New Features


New   ACM-1530: Spot Instance Requests now have tags attached to them.
New   ACM-1592: The tagging and create operations for On-Demand nodes are now combined into a single atomic operation.
New   ACM-1612: As an option, QDS allows a user to use a persistent security group and apply that SG to all clusters. This prevents Qubole from creating an SG for each cluster.
This feature can be used only with On-Demand nodes.

This feature is not enabled by default. Create a ticket with Qubole Support to enable this on your account.

New   ACM-1716: For an active heterogeneous cluster, the minimum and maximum number of nodes is shown on the UI.

New   ACM-1900: The Idle Cluster Timeout setting at the account level and cluster level can be configured in hours and minutes. It supports AWS per-second billing. 

This feature is not enabled by default. Create a ticket with Qubole Support to enable this on the QDS account.
New   AD-325: Users can create read-only default cluster tags in the Control Panel > Account Settings page and the SAML Superadmin page; these tags are non-editable when a new cluster is created.


New    MW-1109: QDS supports defining default account-level cluster tags, which can be set on the Account Settings page of the Control Panel. Users can define user-level EC2 tags. This feature is not enabled by default on a QDS account. Create a ticket with Qubole Support to enable this feature on your account.
Values for user-level tags are captured when a user logs in using SAML. The mapping between the attribute name (in the SAML response) and the tag name must be defined in the SAML provider admin page.

If account-level default cluster tags and user-level EC2 tags exist, they are pre-populated when a cluster is created from the UI.


Bug Fixes

Fix    ACM-1746: Tagging of volumes, which used to fail in heterogeneous clusters, now works.
Fix    ACM-1788: Fixed auto-scaling logs to display a warning only when needed.
Fix    ACM-1798: Deleted subnets are shown as an error condition on the UI and it will now be possible to update a cluster's configuration in such a situation.

Fix    ACM-1846: The issue in which a cluster with read permission denied was still visible to the user on the Clusters page has been resolved.
Fix   ACM-1849: The cluster usage report API is no longer accessible for clusters without read permission. This feature is not enabled by default. Create a ticket with Qubole Support to enable this on the QDS account.
Fix   ACM-1856: Enhancements have been done on the QDS platform to reduce the CPU utilization.

Fix    QBOL-6183: During the cleanup task, clusters that are inactive are identified for termination. In the window between identification and termination, a new command may be submitted to such a cluster; previously, the cluster was killed anyway because it had already been marked for termination.
With this fix, QDS checks whether any new session has been added to the cluster before terminating it; if one has, the cluster is not terminated.

DEEP LEARNING

Enhancements

Change    DS-24: pyspark is the default interpreter for Deep Learning notebooks, as opposed to scala for the Spark interpreter group.

Change   ZEP-1326: New cluster type Deep Learning has been added on the QDS UI.

HADOOP 2

Bug Fixes

Fix    HADTWO-358: This fix binds the Hadoop daemons, such as the ResourceManager, NameNode, Timeline Server, and Job History Server, to 0.0.0.0. Earlier, these daemons were bound to the master DNS name, which caused a problem when a customer attached an Elastic IP address to the private IP address.

Fix    HADTWO-1016: The following parameters can be set for server-side encryption along with the other parameters:

  • s3ServerSideEncryption: Enables encryption of data at the object level as S3 writes it to disk.
  • s3SSEAlgorithm: Specifies the encryption algorithm. If it is not specified but s3ServerSideEncryption is enabled, the AES256 algorithm is used by default. Valid values are AES256, SSE-KMS, and SSE-C.
  • encryptionKey: Specifies the key used to encrypt the data when the algorithm is SSE-KMS or SSE-C. With SSE-KMS, the key is optional because AWS KMS is used when no key is specified. With SSE-C, the key is mandatory; if it is not specified, the job fails.

Fix    HADTWO-1134: The issue in which S3AInputStream closed the HTTP connection instead of releasing it back to the connection pool has been fixed.

Fix    HADTWO-1166: While removing replicas from nodes when their number exceeded the maximum replication factor, a replica could be removed from an On-Demand node without checking whether it was the only On-Demand node containing that replica. This issue has been resolved.
Fix    HADTWO-1197: An intermittent failure in rendering Application UI log files has been resolved.

Enhancements

Change    HADTWO-1091: The S3A filesystem can now automatically detect the endpoint of an S3 bucket and users do not have to explicitly set endpoint in the configuration. This feature is not enabled by default. Create a ticket with Qubole Support to enable this on the QDS account.

After this feature is enabled, the S3A filesystem does not honor endpoint-specific properties such as fs.s3a.endpoint, qubole.s3.standard.endpoint, and fs.s3.awsBucketToRegionMapping.
Change    HADTWO-1189: QDS now supports setting configurable time-to-live (TTL) in JVMs (launched on all cluster types except Airflow and Presto clusters) for DNS lookups. Recommended TTL is 60 seconds. This feature is not enabled by default. Create a ticket with Qubole Support to enable this on the QDS account.
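The notes do not state which JVM setting QDS uses for this TTL. As a sketch, the standard JVM DNS-cache knobs that such a feature typically maps to look like the following; treating either of them as the one QDS configures is an assumption.

```shell
# Standard JVM DNS-cache TTL knobs (which one QDS actually sets is an
# assumption); 60 seconds matches the recommended TTL in the notes.
export HADOOP_OPTS="$HADOOP_OPTS -Dsun.net.inetaddr.ttl=60"
# Alternatively, in the JRE's java.security file:
#   networkaddress.cache.ttl=60
```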
Change    HADTWO-1190: Fix to make autoscaling more robust in case of network failures when connecting to the NameNode.

Change    HADTWO-1214: Occasional stream closed errors when loading jars have been fixed.

 

HIVE

Bug Fixes

Fix    HIVE-2617: HiveServer2 (HS2) script changes to use Java 8 and G1 GC in Hive 2.1. Create a ticket with Qubole Support to enable this on your account.

Fix    HIVE-2629: The issue in which the workflow was stuck for a long time after each query completed has been resolved.
Fix    HIVE-2666: The issue in which windowing functions failed with an invalid function error has been resolved.

 

Enhancements

Change    AD-316: The QDS UI supports adding a personal Hive bootstrap in Control Panel > Hive Bootstrap.
Change    AN-447: Hive tables to which users do not have read access are no longer visible in the QDS UI, whether in query results or in the Hive metastore view on the Analyze and Explore pages.

Change    HIVE-1931: The HS2 Thrift port can be configured using the hs2_thrift_port parameter on a Hadoop 2 (Hive) cluster. It is supported only through the API.
Change    HIVE-2605: Metrics are added for the number of queries executed for each execution engine through HiveServer2 in Hive 2.1. These metrics are available in HS2 UI and Ganglia.
Change    HIVE-2608: Metrics are added for the number of Hive operations waiting for compilation through HiveServer2 in Hive 2.1. These metrics are available in HS2 UI and Ganglia.
Change    HIVE-2615: Metrics are added for the number of open/closed/abandoned sessions in HS2 in Hive 2.1. These metrics are available in HS2 UI and Ganglia.

Change    HIVE-2625: HiveServer2 is supported on small cluster instance types.
Change   HIVE-2630: Metrics are added for tracking the execution lifecycle times of queries fired through HiveServer2 in Hive 2.1. These metrics are available in the HS2 UI and Ganglia.

Change    HIVE-2676: HiveServer2 Metrics are now seen as part of the Datadog monitoring service.


PRESTO

New Features

New    PRES-893: Qubole Presto supports user overridden IAM roles. This feature is not enabled by default. Create a ticket with Qubole Support to enable this on the QDS account.

New    PRES-1103: The Presto Datadog integration has been revamped. The following metrics are now collected and reported to Datadog from Presto:

  • Failed Query count
  • Finished Query count
  • Running Query count
  • Average planning time across queries
  • Maximum GC time across all nodes in the cluster
  • Maximum GC count across all nodes in the cluster
  • Number of worker nodes seen by Presto
  • Service Check of Presto Service
  • Read request rate in cluster
  • Number of failures in requests from master to slave nodes

 

 

Bug Fixes

Fix    PRES-1235: Fixed the predicate pushdown on integer columns in a Parquet format.
Fix    PRES-1302: Support camel-cased field names in the Parquet data.

Fix   PRES-1426: Presto 0.119 will not be available from this QDS release. It was deprecated and support was stopped earlier for this version. All existing clusters with Presto 0.119 will start with Presto 0.157 on a cluster restart.

Fix  PRES-1462: The Presto 0.180 UI now displays correct values for Rows/Sec, Bytes/Sec, and Parallelism instead of NaN.

Fix  UI-4842: QDS now displays warnings when configuration is overridden through Presto overrides.

 

Enhancements


Change    PRES-735: The default value of query.max-memory has been changed to 100TB. If you want to limit the maximum memory a query can use, you can override this to a lower value, either by setting it as a Presto override or by setting the query_max_memory session property.

Change    PRES-1356: A query.max-execution-time configuration property and a query_max_execution_time session property have been added. These properties enforce a time limit on query execution time. Unlike the existing query.max-run-time property, which enforces a time limit starting from query creation, these properties only consider the time spent in the query execution phase.
Change    PRES-1366: The AWS Spot termination notification is received by Presto 0.180, and the node that is going to be taken away is removed from the list of active nodes to prevent more tasks from being scheduled on it.
Change    PRES-1371: The Postgres jar version has been downgraded back to 9.3-1102-jdbc41 in 0.180 to maintain compatibility with Redshift.


QDS

New Features

New    EAM-339: You can disable/enable the Zendesk ticket creation on an account basis.
New    GROW-3: The Getting Started Splash Screen has been added on the QDS UI.

New    AD-99: A new policy action show_token for a system/user role has been added in the Control Panel > Manage Roles which controls the visibility of the API Token column in the Control Panel > My Accounts page.
New    AD-292: For a new user, who signs up on QDS, the default location is now editable by default.
 

Bug Fixes

Fix    AD-75: On the Control Panel > Account Settings page, the storage/compute and settings sections are combined under a single section called Access Settings, and Access Mode has moved from Account Settings to Access Settings. This feature is not enabled by default. Create a ticket with Qubole Support to enable this on the QDS account.
Fix    EAM-449: The SAML Login issue that was faced by a few users has been resolved.
Fix    EAM-457: The remove PUT request in the Command Templates API has been deprecated and replaced with a DELETE request.
Example: PUT /api/v1.2/command_templates/<id>/remove has been removed.

Fix    MW-1327: A role can be assigned to users through user names. User email IDs can be used to add or delete users to or from a group through the Group API.

Enhancements

Change   AD-285: These additional metrics are now used to calculate CPU usage in all command reports:

  • org.apache.tez.common.counters.DAGCounter.AM_CPU_MILLISECONDS
  • Presto Counters.CPU_TIME

Change   EAM-456: In the command history API, the per_page parameter has been modified. Its value is an integer specifying the number of commands to retrieve per page, with a maximum of 100. To page further, retrieve the next batch of commands based on the last command ID for the given QDS account.
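A hedged sketch of fetching a page of command history with this parameter. The base URL, X-AUTH-TOKEN header, and token variable are assumptions drawn from typical QDS API usage, not from these notes.

```shell
# Hypothetical call -- per_page comes from the note above; the endpoint
# URL and the X-AUTH-TOKEN auth header are assumptions.
curl -s \
  -H "X-AUTH-TOKEN: $QDS_API_TOKEN" \
  -H "Accept: application/json" \
  "https://api.qubole.com/api/v1.2/commands?per_page=100"
```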
Change   MW-1168: On the Analyze UI, you can compose a query or specify a query path for Redshift type of query.
Change   UI-5383: The main-page navigation drop-down of the QDS UI, along with the UI skin color of the top horizontal bar, has changed in this release.

 

SCHEDULER

New Features

New    EAM-533: ACL permissions can now be set for a schedule (Read, Update, Delete, Clone, or Manage). Users see only the options that their permissions allow. The ACLs can be set through the Scheduler UI and the Object Policy API.


Enhancements

Change    EAM-286: A public Preview Query API has been added as a new Scheduler API call. After entering the parameters for a command, you can preview it by using the endpoint https://<QDS env>/api/v1.2/scheduler/preview_macro, where <QDS env> is a QDS-on-AWS endpoint. For more information, see Supported Qubole Endpoints.
Change    EAM-376: A user can set a command timeout while creating a schedule through the API or UI. It must be less than 36 hours.

Change    EAM-466: A user must now have Scheduler and Command resource permissions (Roles > Policy Actions) to see Schedules and Schedule Instances on the UI.
Change    SCHED-142: The issue in which scheduler notifications took 10-12 minutes to be delivered has been resolved. Notifications should now be delivered within 5 minutes.

SPARK

New Features

New    SPAR-2081: Performance optimizations in the INSERT INTO/OVERWRITE flow for Hive tables in Spark: Spark now writes files directly to the final destination instead of writing to a temporary staging directory first, which improves performance.

 

Backported Fixes/Changes from Open Source

Fix    SPAR-2177: Backported from the open-source JIRA SPARK-16628.
Before Hive 2.0, the ORC file schema has invalid column names such as _col1 and _col2. This is a well-known limitation, and there are several Apache Spark issues with spark.sql.hive.convertMetastoreOrc=true. This fix ignores the ORC file schema and uses the Spark schema instead.

Bug Fixes

Fix    SPAR-1487: SparkSQL used to throw errors and bail out if a Hive table's location was not in the same object store bucket as the Hive staging directory automatically chosen by QDS. This has been fixed: QDS now automatically switches to a staging directory in the same object store bucket as the table's location.

Fix    SPAR-1640: Spark Hive Authorization: The issue with trying to drop a database/table that already exists has been resolved.
Fix    SPAR-1656: Spark Hive Authorization: With this fix, Spark will now honour authorization while creating and selecting from views. A user will be allowed to create views only from the source tables that the user has select privileges on. Similarly, a user will be allowed to select from only those views that he/she has select privileges on.
Fix    SPAR-1658: Spark Hive Authorization: With this fix, Spark will now honour authorization while altering views. A user will be able to alter a view only if he/she has ownership privileges on that view.
Fix    SPAR-1719: Customers who have added hive.security.authorization.enabled=true in their Spark cluster’s Hadoop override configurations will not face any issues while running Spark commands now.

Fix    SPAR-1930: Fixed the issue where a Spark command failed when a semicolon was present in query comments.

Fix    SPAR-1996: The spark-avro version in Qubole Spark 2.2.0 has been updated to 4.0.0.

Fix    SPAR-2011: The SparkEventLog listener writes events generated by Spark to an event log file, which is later used by the Spark History Server. These writes are synchronous and go to a file located on HDFS. When too many events are generated by a Spark application, or HDFS becomes slow, events start to get dropped because the event queue fills up. With this fix, events are written to a local file, which can sustain a much higher event rate, and are synced to HDFS asynchronously.
Fix    SPAR-2070: The issue in which the Spark History Server was failing to list running Spark applications has been resolved.

Fix    SPAR-2078: The Spark History Server response is now properly formatted when the requested application does not exist.

Fix    SPAR-2132: Fixed the problem with Spark's optimised S3 listing when it was dealing with empty folders.

 

 

Enhancements

Change    SPAR-1934: When a new Spark cluster is created in the QDS UI, the Spark version chosen by default used to be 1.6.1; it has now been changed to 2.1-latest.
Change    SPAR-2001 and SPAR-2154: Spot Node loss is handled more gracefully on Spark versions 2.1.0, 2.1.1, and 2.2.0.

Instead of relying on the heartbeat to detect a node failure, Qubole proactively detects nodes undergoing Spot loss and stops scheduling tasks on the corresponding executors.

This feature is available for a beta access. Create a ticket with Qubole Support to enable this on your account.
Change    SPAR-1796: The 36-hour timeout limitation has been removed when running streaming applications from the Analyze command editor.

Change    SPAR-2027: Spark metrics can be seen on the Datadog and Ganglia Monitoring service.

Change    SPAR-2152: There is a change for users who use QuboleDBTap to access data stores in Spark.

The QuboleDBTap class and its companion object have been copied from com.qubole.QuboleDBTap to org.apache.spark.sql.qubole.QuboleDBTap for Spark 2.0.0 and later versions.

com.qubole.QuboleDBTap will still be maintained for backward compatibility with all existing versions of Spark. However, Qubole strongly recommends migrating from com.qubole.QuboleDBTap to org.apache.spark.sql.qubole.QuboleDBTap, as support for com.qubole.QuboleDBTap will be removed starting with Spark 2.3.0. QuboleDBTap and its methods can then be used only by importing org.apache.spark.sql.qubole.QuboleDBTap.

STREAMX

Enhancements

Change    SX-60: Support has been added for establishing a secure connection between StreamX and Kafka clusters. A StreamX cluster can now connect to your SSL-enabled Kafka clusters.

TEZ

Bug Fixes

Fix    QTEZ-211: In Tez mode, a dynamic partitioning query with UNION ALL failed to insert into a partitioned table at moveTask with an Invalid partition key and values error, because an additional subdirectory was created at the temp location. This has been fixed to ensure that the additional subdirectory is not created.

 

ZEPPELIN/NOTEBOOKS

New Features

New    MW-696 and ZEP-1488: QDS provides Scheduler support on the Dashboards UI. Dashboards can be refreshed automatically by setting an interval and selecting the periodic-refresh option.

New    ZEP-911: QDS maintains a notebook’s history when it is run from the Analyze and Scheduler UI.
New    ZEP-1434 and ZEP-1297: A user can change the Dashboard theme.
New    ZEP-1562: QDS now supports autocomplete for SQL. Users can see suggestions for Hive keywords, functions and table names in the default schema. This feature is not enabled by default. Create a ticket with Qubole Support to enable this on the QDS account.
New    ZEP-1608: A new option to configure Notebook Dashboards has been added.

 

Bug Fixes

Fix    ZEP-39: Detailed progress of Spark paragraphs run in notebooks is shown when zeppelin.enable_detailed_spark_progress is set to true. This feature is enabled by default.

Fix    ZEP-581: Fixed the issue in which comment highlighting and code formatting did not work for pyspark paragraphs.
Fix    ZEP-601: Fixed the issue in which pyspark syntax highlighting was not working (Scala highlighting was used instead).

Fix    ZEP-929: You can now use z.showplot(plotObj) to directly plot matplotlib graphs instead of defining the show function explicitly.
Fix    ZEP-979: The issue in which the Spark cluster displayed an error, Pyspark is not responding when pyspark was run for the first time after cluster start, has been resolved.

Fix    ZEP-1237: A user can upload or download a file to or from an S3 location through the Analyze and Notebooks pages. This feature includes:
  • Permission to download and upload S3 files, controlled through the granular permission model
  • The ability to hide the object store widget for certain users

Fix    ZEP-1242: A user can delete paragraphs in a notebook until only one paragraph remains. When there is just one paragraph left in the notebook, the Remove button is unavailable.
Fix    ZEP-1464: Fixed the corruption of interpreter JSON due to forceful restarts of the Zeppelin server at the time of initialization.

Fix    ZEP-1524: Fixed exceptions in Zeppelin logs caused by Zeppelin being unable to handle events from the remote interpreter when a Spark app is started through Zeppelin. The issue had no functional impact, but the logs were flooded with exceptions in this scenario.
Fix    ZEP-1626: The issue in which, after a notebook was moved to a different cluster, a user could in an edge case see the notebook contents reverted to the contents present on the older cluster, has been resolved.

Fix    ZEP-1643: Fixed the issue where a SQL table was not rendered properly when a column contained a newline character. This fix is not enabled by default. Create a ticket with Qubole Support to enable this on the QDS account.
Fix    ZEP-1715: Invalid strings in a paragraph are now highlighted in a more readable color.
Fix    ZEP-1718: The issue in which downloaded results in CSV and TSV format were different when compared with Notebook graph data has been fixed.
Fix    ZEP-1895: Fixed the failure that occurred when a notebook with an empty paragraph was run from the Notebooks UI or through the Notebook command while pyspark was the default interpreter.

 

Enhancements

Change    ZEP-405: You can use the TAB key to autocomplete non-markdown paragraphs in a notebook.

Change    ZEP-1181: A user can rename an open notebook or a dashboard from its object name.
Change    ZEP-1297: Improved Dashboard mode visuals.
Change    ZEP-1406: A new package environment gets created automatically when a new Spark cluster is created.
Change    ZEP-1569: The user can now see the list of all pre-installed and user packages from the new Environments UI.
Change    ZEP-1607: QDS displays the opened Dashboard name in the browser tab.

Change    ZEP-1658: While scheduling a dashboard, the user can select only the available frequency values, which are based on the maximum allowed scheduler instances per day.
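The constraint behind ZEP-1658 can be pictured with a small sketch (hypothetical limit values; not Qubole's actual implementation): a refresh interval is offered only if running at that interval all day stays within the per-day scheduler-instance cap.

```python
def allowed_intervals(candidate_minutes, max_instances_per_day):
    """Keep only intervals that fit within the daily scheduler-instance cap.

    Illustrative only: candidate_minutes and the cap are made-up values.
    """
    minutes_per_day = 24 * 60
    return [m for m in candidate_minutes
            if minutes_per_day // m <= max_instances_per_day]

# With a cap of 48 instances/day, only intervals of 30 minutes or more qualify:
print(allowed_intervals([5, 15, 30, 60, 240], 48))  # [30, 60, 240]
```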
Change    ZEP-1670: Cloning a Spark cluster clones the default package environment attached to the parent cluster.
Change    ZEP-1681: The tooltip about viewing other interpreters on the Interpreter page has been enhanced.
Change    ZEP-1686: Users are notified with the message Spark Context coming up. Can't cancel now. when they cancel a paragraph in a notebook while the Spark context is still coming up. This enhancement is not enabled by default. Create a ticket with Qubole Support to enable this on the QDS account.
Change    ZEP-1700: A user can view a notebook/dashboard information by hovering over the info icon on the side panel.
Change    ZEP-1714: The Notebook UI’s right-sidebar is resizable now.
Change    ZEP-1756: Visual enhancements have been made to the Notebooks and Dashboards UI.
Change    ZEP-1768: Creating a Spark cluster creates a default environment. The parameters for the environment can be specified under Advanced Configuration in the cluster’s UI page.
Change    ZEP-1772: Simplified Spark interpreter properties by removing unnecessary default interpreter properties. This feature is reflected only on new Spark interpreters and is enabled by default.
Change    ZEP-1823: Dashboards are generally available with this release.


List of Hotfixes in QDS-on-AWS Since 24th October 2017


Fix    AD-263: The Same as Default Compute sub option is selected by default on a Cluster’s EC2 Settings UI.

Fix    AN-572: Fixed an issue that appeared while filtering cluster labels. They are now matched exactly.

Fix    EAM-628: The issue in which a cancelled Spark job kept running (for as long as 34 hours, until the cluster was brought down) has been resolved.

Fix    AN-560: The JDBC driver charset error has been resolved.

Fix    HADTWO-839: Upgraded the AWS SDK version used by the S3A file system to 1.10.77 from 1.10.6.
Fix    HADTWO-1088: Object listing in the s3a file system is now retried if it fails due to an XML parsing error.
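The retry behavior described in HADTWO-1088 can be sketched as follows (hypothetical names; a minimal illustration, not the s3a implementation): a listing call is re-issued when its XML response fails to parse, up to a retry limit.

```python
import xml.etree.ElementTree as ET

def list_objects_with_retry(fetch_listing, max_retries=10):
    """Retry an S3 listing call when the XML response fails to parse.

    fetch_listing is any callable returning raw XML text; it is a
    hypothetical stand-in for the s3a listing request.
    """
    last_err = None
    for _ in range(max_retries):
        try:
            return ET.fromstring(fetch_listing())
        except ET.ParseError as err:  # truncated or garbled response
            last_err = err
    raise last_err

# Example: the first response is truncated, the second parses cleanly.
responses = iter(["<ListBucketResult><Contents>",
                  "<ListBucketResult></ListBucketResult>"])
root = list_objects_with_retry(lambda: next(responses))
print(root.tag)  # ListBucketResult
```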

New   HADTWO-1112: As part of the QDS per-second scaling enhancements, a node cooldown period has been added for downscaling in YARN-based clusters. This feature is available for beta access. Create a ticket with Qubole Support to enable it on your account.

Fix    HADTWO-1152: In YARN autoscaling, HDFS decommissioning used to start on all nodes in the YARN Graceful Shutdown (GS) state. As a result, more nodes than required were sometimes downscaled.
With this fix, the number of nodes that can be decommissioned from HDFS is calculated first, and HDFS decommissioning is started only on that many nodes in YARN GS.
Fix    HADTWO-1153: This fix corrects the calculation of the number of nodes that can be decommissioned from HDFS while downscaling in YARN autoscaling.
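As an illustration of the calculation mentioned in HADTWO-1152/1153 (an invented formula for intuition only, not Qubole's actual logic): the number of GS nodes that may also be decommissioned from HDFS is bounded by how many nodes must remain to keep HDFS data available.

```python
def max_decommissionable(total_nodes, gs_nodes, min_hdfs_nodes):
    """How many Graceful-Shutdown nodes can also be decommissioned from HDFS.

    Illustrative rule: never decommission so many nodes that fewer than
    min_hdfs_nodes remain to hold HDFS data.
    """
    removable = max(total_nodes - min_hdfs_nodes, 0)
    return min(gs_nodes, removable)

# A 10-node cluster with 6 nodes in YARN GS, where HDFS needs at least 7 nodes:
print(max_decommissionable(10, 6, 7))  # 3 -> only 3 of the 6 GS nodes
```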

Change    HADTWO-1144: Ported fix for MAPREDUCE-6154.
Fix    HADTWO-1186: The issue in which a Spark job failed because of NodeManager starting twice, has been resolved.
Change    HADTWO-1212: Ported HADOOP-13164 to avoid object listing during output stream close in the s3a file system. In addition, maximum retries in object listing are made configurable. By default the maximum retries are 10.

Fix    PRES-1300: The Parquet predicate-pushdown logic has been fixed to honour the data type defined in the Hive schema.

Fix    PRES-1462: The Presto 0.180 UI now shows correct values for Rows/Sec, Bytes/Sec, and Parallelism, instead of NaN.
Fix    PRES-1471: Changed the v1/node API of the Presto master to skip reporting exceptions that cannot be serialized.
Fix    PRES-1407: The Parquet performance deterioration between Presto versions 0.157 and 0.180 has been resolved.
Fix    SPAR-1927: The Spark UI URL issue that led to a 500 error has been resolved.
Change    SPAR-2038: Node downscaling depends on the Spark shuffle service reporting no shuffle data for active Spark applications. This check was broken, so nodes remained alive even when no shuffle data was actually present, staying up longer than necessary. The problem affected only the 2.x.y versions of Spark and has been fixed in all affected Spark versions.
New   ZEP-1017: The scroll behavior for the paragraph output has been added.
Fix    ZEP-1328: Lost interpreter settings and customizations have been restored.
Fix    ZEP-1465: Fixed a corner case in which cancelling a paragraph in a notebook could cause the subsequent run of that paragraph to be aborted automatically.
Change    ZEP-1685: The interpreter.json that is downloaded from the object store is validated during a cluster start to catch incomplete file downloads.

 

 

 
