All Apps and Add-ons

Splunk Add-on for Amazon Web Services: Why are events from CloudWatch Log inputs not streaming into a real-time search

cfactaylor
Explorer

I am running the Splunk Add-On for AWS, now at version 4.0.0 as of tonight. I'm mostly interested in CloudWatch Logs events. I understand that each input has a polling interval. I've set my interval to 60 seconds for a sample log group. When I run a search with that log group as the source for a 5-minute window, the initial results come from the indexed events, usually 30-45 seconds old. No new events stream in to my search. After 5 minutes it is totally empty. If I refresh it, it shows a set of events that should have qualified for the real-time search, and they age out again.

I can run real-time searches against other sources, so I don't see it being an issue of insufficient permissions for my role. I can't find any documentation that indicates these sources wouldn't be visible to real-time searches. Am I doing something wrong, or is this a limitation of the add-on's design?

1 Solution

cfactaylor
Explorer

We now use the Splunk HTTP Event Collector (HEC). Even a single-node HEC configuration can ingest more log detail than the rate-limited Splunk Add-On for AWS. Events coming in via the HEC are also visible in real-time queries. Ultimately we configured an auto-scaling group of HEC nodes in our AWS account. Our event pipeline starts with a CloudWatch Logs subscription that sends each log group of interest to a single Kinesis Stream. From there an AWS Lambda (in Python) consumes the Kinesis stream and passes events to the HEC.

We've had to manually bump up our Kinesis Stream shard count for our single stream to handle some volume spikes. We will probably switch over to Kinesis Firehose eventually, which will handle autoscaling.

View solution in original post

0 Karma

cfactaylor
Explorer

We now use the Splunk HTTP Event Collector (HEC). Even a single-node HEC configuration can ingest more log detail than the rate-limited Splunk Add-On for AWS. Events coming in via the HEC are also visible in real-time queries. Ultimately we configured an auto-scaling group of HEC nodes in our AWS account. Our event pipeline starts with a CloudWatch Logs subscription that sends each log group of interest to a single Kinesis Stream. From there an AWS Lambda (in Python) consumes the Kinesis stream and passes events to the HEC.

We've had to manually bump up our Kinesis Stream shard count for our single stream to handle some volume spikes. We will probably switch over to Kinesis Firehose eventually, which will handle autoscaling.

0 Karma

williamholder
Explorer

I've noticed the same thing, and I can't find any information pointing me to a reason for it.
I did manually play with some of the time related fields in the plugin config and nothing seemed to affect the delay.

I'd be curious to find out why this is happening and how to fix it (if it can be fixed) otherwise this plugin may be useless to us.

0 Karma

cfactaylor
Explorer

We never got an answer to the "why" question. We also uncovered another limitation: this version of the Add-On uses a method for ingesting CloudWatch Logs that is constrained by a hard rate limit by AWS. AWS limits each account to 10 CWL log detail requests per second through the CWL API, each of which can return no more than 1MB of data. That is a 10MB/s aggregate limit on CWL events through the add-on. Both of these limits were significant to us, and we ultimately abandoned the Add-On.

0 Karma
Get Updates on the Splunk Community!

Splunk Custom Visualizations App End of Life

The Splunk Custom Visualizations apps End of Life for SimpleXML will reach end of support on Dec 21, 2024, ...

Introducing Splunk Enterprise 9.2

WATCH HERE! Watch this Tech Talk to learn about the latest features and enhancements shipped in the new Splunk ...

Adoption of RUM and APM at Splunk

    Unleash the power of Splunk Observability   Watch Now In this can't miss Tech Talk! The Splunk Growth ...