Apache Oozie edit
Technology for managing workflows of jobs on Hadoop clusters. Primary concepts include workflows (a sequence of jobs modelled as a directed acyclic graph), coordinators (schedule the execution of workflows based on the time or the presence of data) and bundles (collections of coordinators), with all configuration specified in XML. Supports a range of technologies, including MapReduce, Pig, Hive, Sqoop, Spark, Java executables and shell scripts. Includes a server component, a metadata database for holding definitions and state (with support for a range of database technologies), a command line interface and a read only web interface for viewing the status of jobs. Also supports the parameterisation of workflows, the modelling of datasets (and the use of these to manage dependencies between workflows within coordinators), automatic retry and failure handling, and the ability to send job status notifications via HTTP or JMS. Open sourced by Yahoo in June 2010. Donated to the Apache Foundation in July 2011, graduating in August 2012. Commercial support available as part of most Hadoop distributions Technology Information
Other Names Oozie Vendors The Apache Software Foundation Type Commercial Open Source Last Updated April 2018 - 5.0 Related Technologies
Is packaged by Apache Bigtop, Hortonworks Data Platform, Cloudera CDH, MapR Expansion Pack, Amazon EMR Release History
version release date release links release comment 5.0 2018-04-18 summary; release notes Links
News
Blog Posts