The Week That Was - 21/04/2017

We’ve been a little bit all over the shop this week, but let’s try and summarise what we’ve looked at.

We started late, having taken Monday off for Easter, with a look at Cloudera on Tuesday, closing out our review of them and their technologies. We then took another break on Wednesday to catch up on everything that’s changed in the technologies we’ve looked at to date, returning on Thursday with the start of our journey into MapR, the final Hadoop distribution we’re going to look at in detail. We’ve started with their open source components, looking at the MapR Ecosystem Pack and Apache Drill.


The Mid Week News - 19/04/2017

Right, I’ve been slack in getting this out there, which means we’ve built up a nasty backlog, but it’s time to talk about what’s changed since we originally wrote some of our technology summaries.


The Week That Was - 14/04/2017

It’s the Easter holidays here in the UK, so no technology summary today, but let’s recap the last week before we forget everything we looked at.

This week, we’ve been looking at Cloudera’s closed source products: Cloudera Manager, their tool for creating and managing CDH Hadoop clusters; Cloudera Navigator, a set of products for data management, data encryption and helping migrate SQL workloads to Hadoop; and Cloudera Director, for running CDH Hadoop in the cloud.


The Week That Was - 31/03/2017

No technology summary today for various reasons, one of which is that I’m taking a break next week and we’ve probably got to a pretty good place to pause. We’ve finished looking at the new open source technologies in the Cloudera stack this week, with their proprietary closed source technologies to come, but let’s save those for a fresh week.

So what exactly have we looked at this week? We started by looking at Apache Sentry, Cloudera’s competitor to Apache Ranger, and Cloudera Search, Cloudera’s competitor to Hortonworks’ HDP Search. We then looked at Apache Kudu, a structured data store that supports both updates and deletes by primary key as well as efficient analytical table scans, and RecordService, a new technology, still in beta, that provides an API for tools (such as Spark and MapReduce) to access structured data in Hadoop with fine-grained access control.


The Week That Was - 24/03/2017

So this week we started our journey into the Cloudera technology stack. I covered the final Hortonworks bits on Monday, but what have we looked at since then?

We started off by looking at Cloudera’s Hadoop distribution, CDH, and the technologies it bundles. We’ve covered a lot of these already (Hadoop being Hadoop, there’s plenty of overlap between the various distributions), but there’s still plenty of new stuff here to keep us busy for a couple of weeks.

We then moved on to look at Llama, a small piece of open source technology created to support Impala running over YARN; Apache Whirr, a now-retired Apache open source project for deploying a number of technologies onto cloud platforms; and Apache Impala, Cloudera’s SQL-on-Hadoop engine for low-latency interactive queries.
