The Mid Week News 18/09/2019

Right - news time again.

Remember, you can get daily news updates from our Twitter feed (@OnDataEng)…

Technology updates (details are on the relevant technology pages):

  • Apache Calcite 1.21 is out - you might never have heard of it, but it’s probably being used by many of the data tools you use on a daily basis for query parsing and optimization - link

Other technology news:

  • Cloudera have announced the release of Cloudera Streams Management - bundling their Kafka management console and replication tool - link
  • Interested in Apache Samza? Hear about LinkedIn’s journey with it - link
  • From the ever-reliable The Morning Paper - Procella, YouTube’s unified OLTP/OLAP (HTAP) database - link
  • InfluxDB Cloud (their time series database as a service on AWS, Azure and GCP) has hit 2.0 and has now gone serverless - link
  • Google Cloud Dataproc now supports (in alpha) running your Spark jobs on Kubernetes (GKE) rather than YARN VMs - link
  • More on Google Cloud Dataproc Spark on GKE from ZDNet - link
  • Qubole Data Service is now available on Google Cloud Platform if you’re looking for a cloud-agnostic Hadoop-as-a-service offering (that runs on Google Cloud) - link
  • Interested in replicating data between Kafka clusters? Cloudera have a post on MirrorMaker 2, which is based on Kafka Connect - link