Apache Oozie

Technology for managing workflows of jobs on Hadoop clusters. Primary concepts include workflows (a sequence of jobs modelled as a directed acyclic graph), coordinators (which schedule the execution of workflows based on time or the presence of data) and bundles (collections of coordinators), with all configuration specified in XML. Supports a range of technologies, including MapReduce, Pig, Hive, Sqoop, Spark, Java executables and shell scripts. Includes a server component, a metadata database for holding definitions and state (with support for a range of database technologies), a command line interface and a read-only web interface for viewing the status of jobs. Also supports the parameterisation of workflows, the modelling of datasets (and the use of these to manage dependencies between workflows within coordinators), automatic retry and failure handling, and the ability to send job status notifications via HTTP or JMS. Open sourced by Yahoo in June 2010. Donated to the Apache Software Foundation in July 2011, graduating from the Apache Incubator in August 2012. Commercial support is available as part of most Hadoop distributions.
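
To make the workflow concept concrete, the following is a minimal sketch in Python that generates a one-action workflow definition and reads back its transitions. It is illustrative only: the element names follow the general shape of an Oozie workflow as I understand it, while the schema versions (workflow 0.5, shell action 0.2), the node names (run-script, fail, end), the parameter names (jobTracker, nameNode) and both helper functions are assumptions introduced here and should be checked against the Oozie documentation for the version in use.

```python
import xml.etree.ElementTree as ET

# Assumed schema versions; check the ones supported by your Oozie release.
WF_NS = "uri:oozie:workflow:0.5"
SHELL_NS = "uri:oozie:shell-action:0.2"


def build_workflow(name, command):
    """Return workflow XML with one shell action between start and end."""
    wf = ET.Element("workflow-app", {"name": name, "xmlns": WF_NS})
    ET.SubElement(wf, "start", {"to": "run-script"})

    action = ET.SubElement(wf, "action", {"name": "run-script"})
    shell = ET.SubElement(action, "shell", {"xmlns": SHELL_NS})
    # ${...} placeholders are workflow parameters supplied at submission time.
    ET.SubElement(shell, "job-tracker").text = "${jobTracker}"
    ET.SubElement(shell, "name-node").text = "${nameNode}"
    ET.SubElement(shell, "exec").text = command
    ET.SubElement(action, "ok", {"to": "end"})      # success transition
    ET.SubElement(action, "error", {"to": "fail"})  # failure transition

    kill = ET.SubElement(wf, "kill", {"name": "fail"})
    ET.SubElement(kill, "message").text = "Script failed"
    ET.SubElement(wf, "end", {"name": "end"})
    return ET.tostring(wf, encoding="unicode")


def local_name(element):
    """Strip the '{namespace}' prefix that ElementTree adds to parsed tags."""
    return element.tag.split("}")[-1]


def transitions(xml_text):
    """Map each node name to the nodes it can hand control to."""
    edges = {}
    for node in ET.fromstring(xml_text):
        kind = local_name(node)
        if kind == "start":
            edges["start"] = [node.get("to")]
        elif kind == "action":
            edges[node.get("name")] = [
                child.get("to")
                for child in node
                if local_name(child) in ("ok", "error")
            ]
        else:  # kill and end nodes are terminal
            edges[node.get("name")] = []
    return edges


if __name__ == "__main__":
    xml_text = build_workflow("demo-wf", "echo")
    print(xml_text)
    print(transitions(xml_text))
```

The transition map shows the directed acyclic graph in miniature: control flows from start to the action, and from there to either end or the kill node, with no path leading back.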

Technology Information

Other Names: Oozie
Vendors: The Apache Software Foundation
Type: Commercial Open Source
Last Updated: April 2018 - 5.0

Related Technologies

Is packaged by: Apache Bigtop, Hortonworks Data Platform, Cloudera CDH, MapR Expansion Pack, Amazon EMR

Release History

version | release date | release links | release comment
5.0 | 2018-04-18 | summary; release notes |

News

  • Oozie release announcements only appear to be available via the Apache announcements mailing list

Blog Posts