The Mid Week News 15/08/2018
So we’re back, and given it’s been three weeks, there’s a bumper load of news this week…
Technology updates (details are on the relevant technology pages):
- Apache Arrow has hit 0.10
- Apache Beam is up to 2.6
- Apache Drill is up to 1.14
- Apache Flink is now 1.6
- Apache Hive is up to 3.1
- Apache Kafka has hit 2.0
- Apache Knox is up to 1.1
- Confluent Open Source and Enterprise have hit 5.0
- Greenplum has hit 5.10
- StreamSets Data Collector has hit 3.4
- Cloudera Altus Analytical DB is now Cloudera Altus Data Warehouse
Other technology news:
- From Datanami - storage, and how the future of storage is software - link
- From Hortonworks - GPU support in Hadoop 3.1 - link
- From ZDNet - advances in self-tuning data structures - link
- Cloudera have a three part blog on how to roll your own schema management tool for Kafka using a Kafka topic to store your schemas - part1; part2; part3
- From Datanami - a view on parallel graph databases (which we list under our Analytical Databases category) - link
- Elastic have issued a security warning for Elastic 6.3.0 and 6.3.1 which “May Disable Security for Trial Licenses” - link
- Hortonworks’ financial results are out - Datanami; The Register
- GridGain have a two part blog on how to use Apache Ignite to accelerate Spark - part1; part2
- DLab has been submitted to the Apache Incubator - “a platform for creating self-service, exploratory data science environments in the cloud using best-of-breed data science tools” - proposal
- An in depth piece from Elastic on designing and sizing Elasticsearch clusters, with a specific focus on Elastic Cloud - link
- This is always fun - thoughts from Microsoft on running a 50 thousand node Hadoop cluster - link
- Amazon Redshift Spectrum now supports querying nested data structures - link
- Apache Kafka has a couple of security announcements - CVE-2018-1288 and CVE-2017-12610