Apache Impala

An MPP query engine that supports the execution of SQL queries over data in HDFS, HBase, Kudu and S3, based on tables defined in the Hive Metastore. Focuses on analytical (OLAP) use cases, and more specifically on low-latency interactive queries (rather than long-running batch queries), with some support for batch inserts of data. Supports DDL statements for updating the Hive Metastore, uses (broadly) the same SQL syntax as Hive (including UDFs and a range of aggregate and analytical functions), as well as the same JDBC / ODBC drivers, and is therefore compatible with any Hive query tool (such as Beeline). Supports querying data in Parquet, Text, Avro, RCFile and SequenceFile formats, with the ability to write Parquet and Text data. Supports Kerberos and LDAP authentication, and integrates with Apache Sentry for authorisation. Includes a shell (Impala Shell) that supports some shell-only commands for tuning performance and diagnosing problems. Created by Cloudera, started in May 2011 and first announced in October 2012, with a 1.0 GA release in May 2013. Donated to the Apache Software Foundation in December 2015, graduating in November 2017, and is still under active development.
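
As an illustration of the DDL, query and shell features described above, the following is a minimal Python sketch that only builds the statements and the corresponding Impala Shell command lines; it does not connect to a cluster. The table name `web_logs`, its columns, the HDFS location and the host `impalad-host:21000` are hypothetical placeholders, and the helper functions are invented for this example. The `-i` (impalad address) and `-q` (query) options are standard `impala-shell` flags.

```python
import shlex


def create_parquet_table_sql(table, columns, location):
    """Build a CREATE EXTERNAL TABLE statement over existing Parquet files."""
    cols = ", ".join(f"{name} {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} ({cols}) "
        f"STORED AS PARQUET LOCATION '{location}'"
    )


def impala_shell_command(impalad, query):
    """Build an argv list that runs a single query through impala-shell."""
    return ["impala-shell", "-i", impalad, "-q", query]


# Hypothetical table definition registered in the Hive Metastore via DDL.
ddl = create_parquet_table_sql(
    "web_logs",
    [("ts", "TIMESTAMP"), ("url", "STRING"), ("size_bytes", "BIGINT")],
    "hdfs:///data/web_logs",
)

# A typical interactive analytical query over that table.
query = (
    "SELECT url, SUM(size_bytes) AS total_bytes FROM web_logs "
    "GROUP BY url ORDER BY total_bytes DESC LIMIT 10"
)

# Print shell-quoted command lines rather than executing them.
for statement in (ddl, query):
    print(shlex.join(impala_shell_command("impalad-host:21000", statement)))
```

Because Impala shares the Hive Metastore, a table created this way would also be visible to Hive tools; the same statements could equally be submitted over JDBC / ODBC instead of through the shell.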

Technology Information

Other Names: Impala
Vendors: The Apache Software Foundation
Type: Commercial Open Source
Last Updated: April 2019 - v3.2

Related Technologies

Is packaged by: Cloudera CDH, MapR Expansion Pack, Cloudera Altus Data Warehouse

Release History

version  release date  release links  release comment
2.9      2017-06-17    changelog
2.10     2017-09-15    changelog
2.11     2018-01-18    changelog
2.12     2018-05-01    changelog
3.0      2018-05-09    changelog
3.1      2018-12-06    changelog
3.2      2019-03-29    changelog

Blog Posts