Cloudera Altus
Platform for accessing individual CDH capabilities as services. Currently supports the deployment and management of CDH clusters on cloud infrastructure (Altus Director, previously Cloudera Director), the execution of Spark, MapReduce or Hive (over Spark or MapReduce) jobs (Altus Data Engineering), and the dynamic provisioning of Impala clusters (Altus Data Warehouse), with a stated future plan for R- and Python-based machine learning workloads (Altus Data Science) and an HBase-based operational database service.

Runs on Amazon Web Services or Microsoft Azure over external data in Amazon S3 or Azure Data Lake Storage, with a stated plan to expand support to other cloud service providers (specifically the Google Cloud Platform) in the future. Includes Altus SDX, which allows metadata (e.g. Hive table definitions) to be automatically persisted across transient workloads and referenced via a namespace. Supports a web-based UI, a (Python) CLI and a Java SDK, with full user authentication and role-based access management, and integration with AWS and Azure security. Launched in May 2017, with a per-node, per-hour pricing model.

Technology Information
Other Names: Altus
Type: Commercial
Last Updated: September 2018

Sub-projects
Cloudera Altus > Cloudera Altus Director
Solution for deploying and managing Cloudera CDH Hadoop clusters on cloud infrastructure, based on automatically provisioned infrastructure with Hadoop provisioned on top via Cloudera Manager. Includes out-of-the-box support for Amazon Web Services, Microsoft Azure and Google Cloud Platform, with support for vSphere available from VMware, and a Service Provider Interface (SPI) for adding support for new providers. The server component must be manually deployed via an RPM. Supports the ability to scale clusters up and down, clone clusters, run post-deployment scripts, and create Kerberized and highly available clusters. Manageable through a web UI, a REST API (with Python and Java APIs) and a CLI. Released as Cloudera Director at 1.0 in October 2014 as part of Cloudera Enterprise 5.2, and renamed to Cloudera Altus Director in September 2018 as part of CDH 6. Free to download and use, with commercial support available as part of a Cloudera Enterprise subscription.

Cloudera Altus > Cloudera Altus Data Engineering
Managed service for the execution of Spark, MapReduce or Hive (over MapReduce or Spark) jobs using managed CDH clusters on AWS and Azure cloud infrastructure over data in Amazon S3 or Azure Data Lake Storage (ADLS). Jobs run on clusters within a defined AWS or Azure environment. Clusters can be transient (created and terminated on demand) or persistent, and each cluster supports one service type (Hive, Spark or MapReduce) with a fixed node count. Jobs can be queued individually or in batch for execution against an existing cluster or a dynamically created cluster. Spark and MapReduce jobs are specified by uploading a JAR to S3, and Hive jobs via a Hive script (either uploaded directly or uploaded to S3). The queue can be set to either halt or continue on job failure.
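The halt-or-continue queue behaviour described above can be illustrated with a small, self-contained Python model. This is only a sketch of the semantics, not the Altus CLI or SDK: the names Job, OnFailure, QueueResult and run_queue are hypothetical, and a job is modelled as a plain callable that raises an exception on failure.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List


class OnFailure(Enum):
    """Hypothetical policy names; not taken from the Altus API."""
    HALT = "halt"          # stop the queue at the first failed job
    CONTINUE = "continue"  # record the failure and run the remaining jobs


@dataclass
class Job:
    name: str
    run: Callable[[], None]  # raises an exception to signal failure


@dataclass
class QueueResult:
    succeeded: List[str] = field(default_factory=list)
    failed: List[str] = field(default_factory=list)
    skipped: List[str] = field(default_factory=list)


def run_queue(jobs: List[Job], on_failure: OnFailure) -> QueueResult:
    """Run jobs in order, applying the failure policy to the rest of the queue."""
    result = QueueResult()
    halted = False
    for job in jobs:
        if halted:
            # A previous job failed under the HALT policy: do not run this one.
            result.skipped.append(job.name)
            continue
        try:
            job.run()
            result.succeeded.append(job.name)
        except Exception:
            result.failed.append(job.name)
            if on_failure is OnFailure.HALT:
                halted = True
    return result


def succeed() -> None:
    """Stand-in for a job that completes normally."""


def fail() -> None:
    """Stand-in for a job that fails."""
    raise RuntimeError("job failed")


# Three illustrative jobs; the second one fails.
example_jobs = [Job("extract", succeed), Job("transform", fail), Job("load", succeed)]
halt_result = run_queue(example_jobs, OnFailure.HALT)
continue_result = run_queue(example_jobs, OnFailure.CONTINUE)
```

Under the halt policy the job after the failure is skipped. Under the continue policy it still runs, and the failure is only recorded.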
Supports access to clusters via SSH, read-only access to Cloudera Manager, a SOCKS proxy to cluster web UIs (including the CM admin console, YARN history server and Spark history server), and access to server and workload logs (including the ability to write these to S3 for access after clusters have been terminated). All nodes managed by Altus are tagged with the cluster name and node role (master, worker or Cloudera Manager), and bootstrap scripts can be specified for execution on nodes after cluster startup.

Cloudera Altus > Cloudera Altus Data Warehouse
Impala as a managed service, supporting the dynamic provisioning of Impala clusters on AWS and Azure cloud infrastructure over data in Amazon S3 or Azure Data Lake Storage (ADLS). Clusters consist of a coordinator node and multiple worker nodes, with the node count fixed on creation. Supports JDBC and ODBC access to data, along with access to clusters via SSH, read-only access to a Cloudera Manager instance and a SOCKS proxy for access to the Impala web UIs. Previously known as Cloudera Altus Analytical DB.

Related Technologies
Uses: Cloudera CDH, Amazon Web Services, Microsoft Azure

History
2017-05-24 Initial GA release (Data Engineering)
2017-06-22 Addition of workload analytics
2017-09-27 Support for Azure added
2017-11-28 Beta support for Analytical DB added
2018-03-06 Support for SDX added
2018-05-22 Data Engineering GA and Analytical DB beta on Azure
2018-07-24 Cloudera Altus SDX Beta
2018-08-02 Analytical DB renamed to Data Warehouse
2018-09-12 Cloudera Director added as Cloudera Altus Director

Links
Blog Posts