The Mid Week News - 14/06/2017 edit
Right, let’s try and do this a bit more regularly (although it’s a bit late today!), especially as it seems to have been a busy news week…
Technology updates (details are on the relevant technology pages):
- Hortonworks Data Flow has seen a 3.0 release, with the biggest changes being the introduction of two new products - Streaming Analytics Manager and Schema Registry - and a technical preview of SAM Stream Insights which bundles Druid and Apache Superset. We’ll talk more about this on Friday!
- As part of its 2.6 release, Hortonworks Data Platform has deprecated a bunch of technologies that will be removed in HDP 3.0, including Falcon, Flume, Mahout, Slider and Hue, and is moving Accumulo, Kafka and Storm out of HDP into other Hortonworks products. I’ll try and capture my thoughts on Friday.
- Apache Falcon now appears to be inactive, probably related to its deprecation in HDP
- Apache Slider now also appears to be inactive, with a plan to fold support for long running services into YARN
- Apache NiFi continues its breakneck release schedule with a 1.3 release
- Apache Solr has seen a bump to 6.6
- Alluxio has seen a 1.5 release, although details seem to be thin on the ground at the moment
- Hortonworks Data Cloud for AWS has skipped 1.15 and gone straight to 1.16
- Cloudbreak, Hortonworks’ Hadoop in the cloud orchestration tool, has jumped to 1.14
- ZeppelinHub (the Apache Zeppelin managed service) has changed its name to Zepl
- Livy has been donated to the Apache Software Foundation
Technology news:
- Hortonworks and IBM have announced a partnership agreement, whereby IBM will distribute HDP as its official Hadoop product, and Hortonworks will resell IBM’s Data Science Experience (DSX) and BigSQL. Hortonworks now also certify HDP to run on IBM Spectrum Scale. Good summary from ZDNet here. Come back on Friday for some of my random thoughts.
- Hortonworks have announced a new flex support subscription for HDP that covers the usage of HDP on-premises, on IaaS, when deployed using Cloudbreak, or when used as HDCloud on AWS.
- Cloudera have a summary of how to tune the memory usage of Apache Solr
- On the subject of Solr, see this article for information on Solrmeter (a tool for testing Solr performance under heavy load)
- An update from Yahoo on Accordion, a change to Apache HBase that improves performance by doing more work in memory.
- Databricks have announced Databricks Serverless, a fully managed Databricks (built on Apache Spark) service that manages its own (virtual) infrastructure