Batch is SaaS tooling for event-driven systems.

Batch provides the foundational components needed to build and maintain complex distributed systems that rely on messaging.

We offer:

  • Messaging system introspection (we support nearly all popular messaging systems)

  • Automatic schema discovery (if using JSON)

  • In-depth protobuf support

  • Automatic event indexing, enabling granular search

  • Automatic archiving of events into a "data-science-optimal" Parquet format

  • Granular replay functionality

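
The "automatic schema discovery" feature above can be illustrated with a small sketch. This is not Batch's actual generator (that logic is internal); it is a minimal, hypothetical example of the core idea — walking a decoded JSON event and recording a type for each field:

```python
import json

# Hypothetical sketch of JSON schema inference: walk an event and record a
# simple type name for each field. Batch's real schema generators are internal;
# this only illustrates the general technique.

def infer_schema(value):
    """Map a decoded JSON value to a simple schema description."""
    if isinstance(value, bool):          # check bool before int: bool subclasses int
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "number"
    if isinstance(value, str):
        return "string"
    if value is None:
        return "null"
    if isinstance(value, list):
        # Describe a list by its first element's schema (a simplification).
        return {"array": infer_schema(value[0]) if value else "unknown"}
    if isinstance(value, dict):
        return {key: infer_schema(val) for key, val in value.items()}
    raise TypeError(f"unsupported JSON type: {type(value)!r}")

event = json.loads('{"user_id": 42, "action": "login", "tags": ["web"]}')
print(infer_schema(event))
# → {'user_id': 'integer', 'action': 'string', 'tags': {'array': 'string'}}
```

A production inferrer would also merge schemas across many events and widen types on conflict; the sketch above handles only a single event.
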
Take a look at the Use Cases section to get a better idea of what's possible with the Batch platform.

Batch is an excellent addition to your infrastructure if you make use of the actor model.


The following components make up the Batch platform:

  1. Event collectors

    1. Services that receive your event data via our gRPC or HTTP APIs.

  2. Message bus relayers

    1. Services/utilities that collect data from your message bus and relay it to our event collectors (via gRPC).

  3. Schema inference

    1. Schema generators inspect your event data and update our internal Parquet and Athena schemas to facilitate long-term event storage.

  4. Storage

    1. We store your events forever, both in our search cache and in S3 in Parquet format.

    2. The data in the search cache powers search; the data in S3 is used for replays.

    3. You can use the parsed S3 data however you wish.

  5. Search

    1. All event data is indexed and can be searched using Lucene syntax.

  6. Replay

    1. Once you find the data you are looking for, you can replay it to a destination of your choice: Kafka, RabbitMQ, SQS, or an HTTP endpoint.
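
The Search and Replay components above combine into the platform's core workflow: find matching events, then deliver them to a destination. The following stdlib-only Python sketch uses illustrative stand-ins — a predicate in place of a real Lucene-syntax query, and a callable in place of a Kafka/RabbitMQ/SQS/HTTP destination:

```python
# Illustrative sketch of the search-then-replay workflow. All names here are
# stand-ins: Batch's real search runs Lucene-syntax queries against an index,
# and real replay targets include Kafka, RabbitMQ, SQS, and HTTP endpoints.

archive = [  # stand-in for events archived in S3
    {"id": 1, "action": "login", "user": "ada"},
    {"id": 2, "action": "purchase", "user": "ada"},
    {"id": 3, "action": "login", "user": "grace"},
]

def search(events, predicate):
    """Return events matching a predicate (a proxy for a Lucene query)."""
    return [event for event in events if predicate(event)]

def replay(events, destination):
    """Deliver each matching event to a destination callable
    (a proxy for publishing to a bus or POSTing to an HTTP endpoint)."""
    for event in events:
        destination(event)

received = []
matches = search(archive, lambda e: e["action"] == "login")
replay(matches, received.append)
print([e["id"] for e in received])
# → [1, 3]
```

The separation mirrors the platform's design: search narrows the archive down to exactly the events you care about, and replay is a dumb pipe that pushes those events wherever you point it.
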

For additional insight into how Batch works, check out the Architecture doc.