Deployment Architecture

Indexer indexing too much

jessieb_83
Path Finder

I've been fighting this for a week and just spinning in circles.

I'm building a new distributed environment in a lab to prep for live deployment. Everything is RHEL 8 running Splunk 9.2.

2 indexers, 3 search heads, a cluster manager, a deployment manager, and 2 forwarders. Everything is "working"; I just need to tune it now.

The indexers are ingesting about 700,000 events per hour, and roughly 90% of that is coming from audit.log; most of it is auditd recording the indexers' own activity as they move data in and out of buckets. We have a requirement to monitor audit.log at large, but no requirement to index the audit events generated by Splunk's own bucket housekeeping.

I've been looking at different approaches to this, but I would imagine I'm not the first person to encounter this.

Would it be better to tune audit.rules from the Linux side? Blacklist some keywords in the indexers' inputs.conf? Tune through props.conf?

Would really appreciate some advice on this one. Thanks!

 


PickleRick
SplunkTrust

At first you had me a bit confused, since Splunk has its own internal audit logs...

But since you're talking about output from auditd, there are indeed two paths you can take:

1) Limit the source by writing audit rules so that only relevant events are logged (this can also have the nice side effect of slightly lowering load on your audited host and reducing storage needs); a rules sketch follows this list.

2) Filter the data on the receiving end with props/transforms. This is a viable option if you're collecting the audit logs somewhere else as well and only want to limit what is indexed in Splunk, or if you cannot write audit rules precisely enough; see the props/transforms sketch below.
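
For option 1, here's a minimal sketch of an exclusion rule, assuming a default install path (/opt/splunk) and that the noise is syscall events triggered by splunkd itself; adjust the path and fields to match whatever your broad rules are actually catching:

    # /etc/audit/rules.d/10-splunk-exclude.rules (hypothetical file name)
    # Suppress syscall events generated by splunkd's own bucket housekeeping.
    # "never" rules are evaluated in order, so keep this ahead of the broad
    # watch/syscall rules it needs to override.
    -a never,exit -F exe=/opt/splunk/bin/splunkd

Load the rebuilt ruleset with augenrules --load and confirm the rule is active with auditctl -l.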
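
For option 2, a sketch of the usual nullQueue filter, applied at parse time on the indexers (or heavy forwarders, if you have them). It assumes the events arrive as sourcetype linux_audit and that the noisy SYSCALL records contain exe="/opt/splunk/bin/splunkd" in the raw text; verify both against your actual data before deploying:

    # props.conf
    [linux_audit]
    TRANSFORMS-drop_splunkd_noise = drop_splunkd_audit

    # transforms.conf
    [drop_splunkd_audit]
    REGEX = exe="/opt/splunk/bin/splunkd"
    DEST_KEY = queue
    FORMAT = nullQueue

Note that events dropped this way still cost parsing-pipeline work; they just never reach the index (or your license).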

Of course the general remarks from @gcusello about the "why" side of ingesting those logs are very much relevant.


gcusello
SplunkTrust

Hi @jessieb_83,

in my opinion, you should follow a different approach.

These are the waterfall questions you need to answer to define what to index:

  • What do I want to monitor?
  • Which Use Cases do I want to implement?
  • Which data are mandatory for my Use Cases?

Once you have defined your monitoring perimeter (in terms of devices and data sources), you can implement filters so that you index only the data required for your Use Cases.

If you're speaking of Security Monitoring, you could use the Splunk Security Essentials app (https://splunkbase.splunk.com/app/3435) to define your Use Cases and the data mandatory for them.

Ciao.

Giuseppe
