Stream Processor

The stream processor is one of the key functional components in Epoch Collectors. The following is a brief description of each key component:

  • Collector: Collects network packets, infrastructure metrics, and custom metrics. Ships the collected packets and metrics to the stream processor.
  • Stream Processor: Receives data from collectors. Processes the data into compact time series metrics, which are then shipped to the AOC.
  • AOC: Indexes and stores time series. Provides a user interface and APIs for querying and alerting.

The collectors are installed on the hosts to be monitored. The stream processor can be installed alongside the collector, alongside the AOC, or as a standalone installation. This guide describes the standalone installation of the stream processor.

Standalone Stream Processor

One or more stream processors can be run as standalone components to process network packets and metrics at scale, outside the hosts being monitored. A standalone stream processor registers itself with the AOC. The AOC then automatically configures the collectors to send data to the stream processors installed within their subnet space.

The automatic subnet match can be overridden by providing the exact address of the stream processor to which the collector should connect. See the EPOCH_SP_HOST_OVERRIDE parameter in the collector configuration.
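
To make the idea of a subnet match concrete, the sketch below checks whether a collector address falls inside a stream processor's subnet using plain CIDR arithmetic. It is only an illustration: how the AOC actually performs the match is not described in this guide, and the function names and addresses are hypothetical.

```shell
#!/bin/sh
# Illustrative CIDR check only; the AOC's actual matching logic is not documented here.

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Succeed (exit 0) when address $1 lies inside CIDR block $2, e.g. 10.0.1.0/24.
in_subnet() (
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    bits=${2#*/}
    # Build the netmask from the prefix length; a /0 prefix matches every address.
    if [ "$bits" -eq 0 ]; then
        mask=0
    else
        mask=$(( (4294967295 << (32 - $bits)) & 4294967295 ))
    fi
    [ $(( $addr & $mask )) -eq $(( $net & $mask )) ]
)

# Hypothetical collector address and stream processor subnet.
in_subnet 10.0.1.25 10.0.1.0/24 && echo "same subnet"
```

A collector outside every stream processor subnet would be a candidate for the EPOCH_SP_HOST_OVERRIDE parameter described above.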

[Figures: Stream Processor Collector Architecture; Different Traffic Collection Modes]

Prerequisites

Resource Requirements

A machine with sufficient resources is required to run a standalone stream processor.

Resource   Recommended        Minimum
vCPUs      4 (or more)        2
Memory     8 GiB (or more)    4 GiB
Disk       16 GiB (or more)   8 GiB
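
A candidate host can be compared against these figures before installation. The sketch below is one possible way to do so, assuming a Linux host with `nproc`, `/proc/meminfo`, and `df` available, and assuming the root filesystem is the disk that matters. The function names are hypothetical; only the thresholds come from the table above.

```shell
#!/bin/sh
# Thresholds come from the resource table; everything else is an illustrative assumption.

# Print "recommended", "minimum", or "insufficient" for a measured value.
# Usage: classify VALUE MINIMUM RECOMMENDED
classify() {
    if [ "$1" -ge "$3" ]; then
        echo recommended
    elif [ "$1" -ge "$2" ]; then
        echo minimum
    else
        echo insufficient
    fi
}

# Measure this host (Linux only) and report each resource against the table.
check_host() {
    cpus=$(nproc)
    # MemTotal is reported in KiB and sits slightly below installed RAM,
    # so round to the nearest GiB instead of truncating.
    mem_gib=$(awk '/^MemTotal:/ { printf "%d", ($2 + 524288) / 1048576 }' /proc/meminfo)
    # Total size of the root filesystem, rounded to the nearest GiB.
    disk_gib=$(df -Pk / | awk 'NR == 2 { printf "%d", ($2 + 524288) / 1048576 }')
    echo "vCPUs:  $cpus ($(classify "$cpus" 2 4))"
    echo "Memory: $mem_gib GiB ($(classify "$mem_gib" 4 8))"
    echo "Disk:   $disk_gib GiB ($(classify "$disk_gib" 8 16))"
}
```

Calling `check_host` on the target machine prints one line per resource. Any line marked "insufficient" means the host falls below the minimum listed above.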

Ports and Firewall Rules

Inbound

The inbound ports listed below must be open. The "Your Private Subnet" source refers to the subnet where you are installing the collectors.

Port   Protocol   Source                Description             Required by default
2005   TCP        Your Private Subnet   RPCAP control channel   yes
3005   TCP        Your Private Subnet   RPCAP data channel      yes
3005   UDP        Your Private Subnet   RPCAP data channel      no
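
Once the stream processor is running, the required TCP ports can be probed from a host in the collector subnet. The following is a minimal sketch, assuming a netcat build that supports the `-z` and `-w` options. The function name and example address are hypothetical, and the optional UDP port is skipped because a UDP probe cannot reliably tell an open port from a filtered one.

```shell
#!/bin/sh
# Required inbound TCP ports, taken from the table above.
REQUIRED_TCP_PORTS="2005 3005"

# Print one "host:port open|unreachable" line per required port.
# Usage: check_sp_ports STREAM_PROCESSOR_ADDRESS
check_sp_ports() {
    for port in $REQUIRED_TCP_PORTS; do
        # -z: connect without sending data; -w 3: give up after 3 seconds.
        if nc -z -w 3 "$1" "$port" >/dev/null 2>&1; then
            echo "$1:$port open"
        else
            echo "$1:$port unreachable"
        fi
    done
}
```

For example, `check_sp_ports 10.0.1.5` run on a collector host reports both ports for a stream processor at that (hypothetical) private address. An "unreachable" result can mean either a firewall rule is blocking the port or the stream processor is not yet listening.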

Outbound

The following ports need to be accessible on the AOC from the standalone stream processor.

Port   Protocol   Source   Description                  Required by default
443    HTTPS      Public   Metrics and events channel   yes

Configuration

The AOC address needs to be configured on the standalone stream processor as EPOCH_AOC_HOST. For the full set of configuration parameters, refer to the collector configuration.

Installation

The standalone stream processor is part of the collectors package. To run it, set the EPOCH_ROLE variable to sp in the collector configuration. In this mode, only the stream processor runs; none of the collector processes, such as the traffic-collector or epoch-dd-agent, are started.

Docker

Run the command below, providing the address of your AOC installation and your organization ID.

docker run -td \
       --name=epoch_sp \
       --net=host \
       --ulimit core=0 \
       -e DEPLOY_ENV="docker" \
       -e EPOCH_ROLE=sp \
       -e EPOCH_AOC_HOST=${your_epoch_host} \
       -e EPOCH_ORGANIZATION_ID=${organizationId} \
       gcr.io/nutanix-epoch/collectors:latest

Debian and RHEL

Install the collectors package with the parameter EPOCH_ROLE set to sp. Follow the package installation instructions for Debian or for RHEL.

Collector Configuration

To make the collectors use a standalone stream processor, start them with the EPOCH_ROLE parameter set to collector. The address of the standalone stream processor can be provided in the EPOCH_SP_HOST_OVERRIDE parameter. Prefer the private IP address of the stream processor, so that high-bandwidth traffic does not cross the public network.

To enable automatic subnet-based load balancing between collectors and stream processors, do not provide the EPOCH_SP_HOST_OVERRIDE parameter to the collectors. The local IP address of the stream processor host machine is used for load balancing.

Note: The collectors still need to communicate with the AOC to send infrastructure metrics and receive auto-updates. Specify the AOC address via EPOCH_AOC_HOST as usual.
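
The two collector modes differ only in whether EPOCH_SP_HOST_OVERRIDE is passed. The sketch below shows this with a hypothetical helper that prints the relevant `-e` flags for a Docker-based collector. The helper name, host name, and address are illustrative, and a real collector will likely need further options from the collector installation instructions.

```shell
#!/bin/sh
# Hypothetical helper: prints the -e flags discussed in this section.
# Usage: collector_env_flags AOC_HOST [STREAM_PROCESSOR_HOST]
collector_env_flags() {
    flags="-e EPOCH_ROLE=collector -e EPOCH_AOC_HOST=$1"
    # Pass the override only when a stream processor address is given;
    # without it, the stream processor is chosen by the automatic subnet match.
    if [ -n "${2:-}" ]; then
        flags="$flags -e EPOCH_SP_HOST_OVERRIDE=$2"
    fi
    echo "$flags"
}

# Pinned to one stream processor (private address preferred):
collector_env_flags aoc.example.com 10.0.1.5
# Automatic subnet-based selection:
collector_env_flags aoc.example.com
```

The printed flags would be added to the collector's `docker run` command line. For package installations, the same parameters go into the collector configuration instead.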

Bandwidth Considerations

When collectors run on network-intensive instances, additional settings are recommended. By adjusting settings such as the sampling rate and compression, you can strike a balance between fidelity, network overhead, and local CPU overhead.

  • Enable static sampling. Even with sampling enabled, the AOC should still report error rates, latency, and throughput trends with sufficient accuracy. The sampling parameter is a percentage of total traffic (1-100), so a sampling rate of 50 samples 50% of the flows. By default, sampling is turned off.

  • Run the collectors in local stream processing mode. In this mode, the network flows are processed on the same host, so the network overhead is smaller at the cost of higher CPU overhead.
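
As a rough aid for choosing a sampling rate, the arithmetic behind the percentage can be sketched as follows. The function name and the flow count are hypothetical, and the result is only an estimate of how many flows would be sampled, not a measurement of actual bandwidth.

```shell
#!/bin/sh
# Estimate how many flows are sampled at a given sampling rate.
# Usage: sampled_flows TOTAL_FLOWS SAMPLING_RATE   (rate is a percentage, 1-100)
sampled_flows() {
    if [ "$2" -lt 1 ] || [ "$2" -gt 100 ]; then
        echo "sampling rate must be between 1 and 100" >&2
        return 1
    fi
    echo $(( $1 * $2 / 100 ))
}

# 20000 flows at a sampling rate of 50:
sampled_flows 20000 50   # prints 10000
```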