StreamSets Data Collector
General purpose technology for moving data between systems, including the ingestion of batch and streaming data into an analytical platform. Pipelines are configured in a graphical user interface and consist of a single origin, one or more processor stages and then one or more destinations, with support for a wide range of source/destination technologies and processor transformations. Supports a wide range of data formats, executors (tasks that can be triggered by pipeline events, e.g. to send e-mails or run a shell script), handling of erroneous records, CDC CRUD records, previewing of data within the editor UI, real-time reporting and alerting on a range of execution and data quality metrics, and dynamic handling of changes to schemas and to the semantic meaning of data, and provides a full Python SDK. Can run in standalone mode (as a single process, single or multi-threaded), as a Spark Streaming or MapReduce job on a cluster, or in an ultralight agent (StreamSets Data Collector Edge). Java based, Open Source under the Apache 2.0 licence, hosted on GitHub, with development led by StreamSets, who also provide commercial support and a number of commercial add-ons, including Control Hub (cloud service for developing and managing pipelines), Dataflow Performance Manager (for managing data metrics) and Data Protector (for managing sensitive data). Started in October 2014, with a v1.0 release in September 2015.
Technology Information
Vendors: StreamSets
Type: Commercial Open Source
Last Updated: August 2019 - v3.10
Release History
version | release date | release links | release comment
3.0 | 2017-12-15 | See 3.0 notes on documentation and release page; blog post |
3.1 | 2017-03-30 | See 3.1 notes on documentation and release page |
3.2 | 2018-05-11 | See 3.2 notes on documentation and release page |
3.3 | 2018-05-24 | See 3.3 notes on documentation and release page |
3.4 | 2018-08-10 | See 3.4 notes on documentation and release page; blog post |
3.5 | 2018-10-01 | See 3.5 notes on documentation and release page; blog post |
3.6 | 2018-11-26 | See 3.6 notes on documentation and release page |
3.7 | 2019-01-08 | See 3.7 notes on documentation and release page |
3.8 | 2019-03-14 | See 3.8 notes on documentation and release page |
3.9 | 2019-06-06 | See 3.9 notes on documentation and release page; blog post |
3.10 | 2019-08-01 | See 3.10 notes on documentation and release page; blog post |
Links
News