The Mid Week News - 30/08/2017

Right, time for this week’s news…

Technology updates (details are on the relevant technology pages):

  • We’ve only just covered it, but Beam has already seen a point release to 2.1

Technology news:

  • A topical follow-up to our look at Spark Structured Streaming last week - Confluent have just announced KSQL, which provides the ability to create continuously updated tables and new streams from data in Kafka using SQL
  • And on the subject of KSQL, an interview with Confluent on KSQL from ZDNet
  • Part 2 from Cloudera on the role Analytical Search capabilities play in big data analytics
  • LinkedIn have open sourced Cruise Control, their technology for monitoring, balancing and managing their Apache Kafka clusters
  • An update from The Register on Basho - it looks like Bet365 are planning to buy up Basho’s assets (including Riak), and open source all of their products
  • A follow-up from Confluent on how to get started with Kafka Connect
  • A big dump from Databricks of useful links and information relating to Spark Structured Streaming
  • Hue is gaining the ability to do ad-hoc imports to HDFS from databases via Sqoop in its next release (4.1), building on the existing functionality for importing flat files
  • A two-part series from Hortonworks (part 1 and part 2) on doing Hive table updates, including how to do type 1, 2 and 3 slowly changing dimensions in Hive
  • A presentation from Gwen Shapira at Confluent (via InfoQ) on schema management and the role of schema management tools such as the Confluent Schema Registry (bundled with Confluent Open Source) and Hortonworks Schema Registry (/technologies/schema-registry/)
  • Thoughts from the MapR blog on how Businesses Can Cultivate a Data-Driven Culture. Normally I’d avoid these articles like the plague, but this one seems to keep it simple and the advice is sensible.