Sudden spike in indexed data. How to narrow down

Branden
Builder

The amount of data I index daily is pretty consistent for the most part. I suppose it's gradually increasing, but no big deal.

But once in a while there is a day where there is a 100% increase in the amount of data indexed.

I suspect it's one log/one forwarder that's doing it. And I have a hunch as to what log it is, but I can't demonstrate that in Splunk.

This is supposed to be easy with Splunk, but I can't figure this out. What is the best way to track which forwarder is the source of the spike? How can I narrow down the log that is causing this?

Thanks!

bwooden
Splunk Employee

One way to identify this is to look at the metrics collected by Splunk. From the search app, you can click on menu option "Views" and select "Advanced Charting".

In the Formatting options choose Chart Type=line and Multi-series mode=Combined using the appropriate drop down boxes. Finally, search for

index=_internal metrics kb series!=_* "group=per_host_thruput" earliest=-7d | eval mb = kb / 1024 | timechart fixedrange=t span=1d sum(mb) by series

bwooden
Splunk Employee

If you change "group=per_host_thruput" to "group=per_source_thruput", you may be able to discern that from the indexer if the volume is high enough, but it will be showing all sources. It may be easier to go to the forwarder in question and run this search from the CLI:

index=_internal metrics kb series!=_* "group=per_source_thruput" earliest=-7d | eval mb = kb / 1024 | stats sum(mb) by series | sort -sum(mb)
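
If the per-source metrics are not detailed enough, another option is to measure the indexed events themselves for the one suspect host. The search below is only a sketch under stated assumptions: HOSTNAME_PLACEHOLDER is a hypothetical value to replace with the host found in the per-host search, and the length of _raw is used as a rough stand-in for indexed volume, so the totals will not match license usage exactly. It can also be slow, because it reads every event from that host over the past week.

```
index=* host="HOSTNAME_PLACEHOLDER" earliest=-7d@d latest=@d
| eval mb = len(_raw) / 1024 / 1024
| timechart span=1d sum(mb) by source
```

A source whose daily total jumps on the spike day is the likely culprit. Swapping the last line for "stats sum(mb) by source | sort - sum(mb)" gives a ranked list instead of a chart.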

Branden
Builder

Thank you for the response. I was able to narrow down the host that it's coming from (wasn't the host I was expecting!) but I can't tell WHAT on the host is causing it. I am wondering if one of the log files has spun out of control or something. Is there a way to get that information?
Thanks!

Brian_Osburn
Builder

You can take a look at one of the pre-built reports. You can access it by going to the Search App, then selecting "Status" > "Index Activity" > "Index Volume".

Then, using the dropdowns, you can select "Source" for the time frame.

Hope this helps, Brian

Branden
Builder

Thank you for the response. That procedure tells me how much was indexed in GB, but it doesn't give me much more granularity than that. I'll play with it though and see what I can come up with. Thanks!
