Apache Arrow
In-memory data structure specification for building columnar data systems. Provides a standard interchange format that lets processes on a node share data without the overhead of moving or transforming it, permits O(1) random access, and can represent both flat relational structures and complex, hierarchically nested data. Data is organised in a columnar memory layout, which is cache efficient for analytical workloads (all the data relevant to a column operation is stored together) and lets execution engines take advantage of modern CPU SIMD (Single Instruction, Multiple Data) instructions, which operate on multiple data values with a single instruction. Supports Java, C, C++, JavaScript, Python, Go, Ruby and Rust. Seeded from the Apache Drill project and promoted directly to a top-level Apache project in February 2016, followed by an initial 0.1 release in October 2016. Used in a range of other projects including Drill, Spark, Impala, Kudu and Pandas. Has not yet reached a v1.0 milestone, but is under active development, with contributors from a number of other Apache and non-Apache data projects.
http://git.apache.org/arrow.git/ - source code
Technology Information
Other Names: Arrow
Vendors: The Apache Software Foundation
Type: Commercial Open Source
Last Updated: July 2019 - v0.14
Release History
version  release date  release links             release comment
0.8      2017-12-18    blog post; release notes
0.9      2018-03-21    blog post; release notes
0.10     2018-08-07    blog post; release notes
0.11     2018-10-09    blog post; release notes
0.12     2019-01-21    blog post; release notes
0.13     2019-04-02    blog post; release notes
0.14     2019-07-02    blog post
Links
News