Plumber Relay

This page describes the many ways you can get data into the Batch platform.

Plumber is our open source project for working with various messaging systems.

Besides reading and writing messages, it can relay data to Batch. The relay uses the gRPC API under the hood.

Relaying data with plumber is the most reliable and performant way to get data into Batch, because plumber batches events, which can increase your total throughput.

You can launch plumber relays in multiple ways:

- Running plumber in single-relay mode via the CLI
  - Best for quick one-offs
- Running plumber as a Docker container
  - Best for ephemeral workloads
- Running plumber in server mode
  - Best for production

The following examples show how to run plumber in single-relay mode.

For production deployments, we suggest running plumber in server mode.

plumber relay kafka \
  --address "your-kafka-address.com:9092" \
  --token YOUR-COLLECTION-TOKEN-HERE \
  --topics orders \
  --tls-skip-verify

In this example, all messages from the Kafka topic orders will be automatically sent to the collection associated with the specified token.

docker run --name plumber-rabbit -p 8080:8080 \
    -e PLUMBER_RELAY_TYPE=rabbit \
    -e PLUMBER_RELAY_TOKEN=YOUR-BATCHSH-TOKEN-HERE \
    -e PLUMBER_RELAY_RABBIT_EXCHANGE=my_exchange \
    -e PLUMBER_RELAY_RABBIT_QUEUE=my_queue \
    -e PLUMBER_RELAY_RABBIT_ROUTING_KEY=some.routing.key \
    -e PLUMBER_RELAY_RABBIT_QUEUE_EXCLUSIVE=false \
    -e PLUMBER_RELAY_RABBIT_QUEUE_DURABLE=true \
    batchcorp/plumber \
    rabbit

In this example, all messages sent to my_exchange that match the routing key some.routing.key will be routed to my_queue.

At that point, plumber will pick up the messages and send them to Batch using the specified token.

The full set of environment variables for configuring plumber's relay mode is documented in ENV.md.
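
The variables from the Docker example above can also be kept out of the command line. The following is a minimal sketch, not taken from the plumber docs: it writes the same variables to a file and passes that file with Docker's --env-file flag. The file name plumber-rabbit.env and the helper start_rabbit_relay are hypothetical; the image name and the trailing rabbit argument are copied from the example above.

```shell
# Collect the relay settings from the Docker example in one env file.
# The token value is a placeholder; replace it with your own token.
cat > plumber-rabbit.env <<'EOF'
PLUMBER_RELAY_TYPE=rabbit
PLUMBER_RELAY_TOKEN=YOUR-BATCHSH-TOKEN-HERE
PLUMBER_RELAY_RABBIT_EXCHANGE=my_exchange
PLUMBER_RELAY_RABBIT_QUEUE=my_queue
PLUMBER_RELAY_RABBIT_ROUTING_KEY=some.routing.key
PLUMBER_RELAY_RABBIT_QUEUE_EXCLUSIVE=false
PLUMBER_RELAY_RABBIT_QUEUE_DURABLE=true
EOF

# The file holds the token, so make it readable by the owner only.
chmod 600 plumber-rabbit.env

# Hypothetical helper: start the container with the settings from the file.
start_rabbit_relay() {
    docker run --name plumber-rabbit -p 8080:8080 \
        --env-file plumber-rabbit.env \
        batchcorp/plumber \
        rabbit
}
```

Calling start_rabbit_relay then starts the same container as the example above, with the token stored in a file rather than in your shell history.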

Example of running plumber via Kubernetes

apiVersion: apps/v1
kind: Deployment
metadata:
  name: plumber-deployment
spec:
  selector:
    matchLabels:
      app: plumber
  replicas: 1
  template:
    metadata:
      labels:
        app: plumber
    spec:
      containers:
        - name: plumber
          image: batchcorp/plumber:latest
          command: ["/plumber-linux", "relay", "kafka"]
          args: ["--stats-enable"]
          ports:
            - containerPort: 9191
          env:
            - name: PLUMBER_RELAY_TOKEN
              value: "--- COLLECTION TOKEN HERE ---"
            - name: PLUMBER_RELAY_KAFKA_ADDRESS
              value: "kafka.server.com:9092"
            - name: PLUMBER_RELAY_KAFKA_TOPIC
              value: "new-orders"
            - name: PLUMBER_RELAY_KAFKA_GROUP_ID
              value: "plumber"
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "500m"
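
One possible way to apply the manifest above is sketched below. It assumes the manifest is saved as plumber-deployment.yaml (a hypothetical file name) and that kubectl already points at the target cluster. The helper name deploy_plumber is also an assumption; the Deployment name comes from the manifest.

```shell
# Hypothetical helper: apply the manifest and check that the relay came up.
# $1 is the path to the manifest file.
deploy_plumber() {
    # Create or update the Deployment from the manifest.
    kubectl apply -f "$1" &&
    # Block until the rollout finishes or fails.
    kubectl rollout status deployment/plumber-deployment &&
    # Show the most recent log lines from the relay.
    kubectl logs deployment/plumber-deployment --tail=20
}
```

Run it as deploy_plumber plumber-deployment.yaml. The steps are chained so that a failed apply or rollout stops the helper before it tries to read logs.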

More examples of relaying from various systems can be found in EXAMPLES.md.

When should you use this API?

Plumber is the easiest way to relay throughput-heavy workloads and should be used by anyone who wants to get up and running quickly.

Throughput

plumber uses gRPC under the hood to communicate with Batch's collectors.

You should be able to comfortably reach 25K-50K messages/sec on a single plumber instance. To reach higher levels, you should run plumber in cluster server mode and launch 2+ replicas of plumber.

When relaying from backends such as Kafka or NATS, make sure every replica uses the same consumer group, so that the replicas share the messages instead of each relaying its own copy.
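
As a rough sizing aid, the replica count can be estimated from the per-instance figure above. This is a back-of-the-envelope sketch: the target of 120000 messages/sec is a hypothetical workload, and 25000 is the conservative end of the 25K-50K range, so measure your own throughput before relying on the result.

```shell
# Estimate how many plumber replicas a target throughput needs.
# target is a hypothetical workload figure in messages/sec.
target=120000
# per_instance is the conservative end of the 25K-50K messages/sec range.
per_instance=25000

# Ceiling division: round up so the replicas together cover the target.
replicas=$(( (target + per_instance - 1) / per_instance ))

echo "replicas needed: $replicas"
```

If you run plumber as a Kubernetes Deployment like the one shown earlier, the result could be applied with kubectl scale deployment/plumber-deployment --replicas=N, keeping the same group ID on every replica.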

