Deployment Architecture

Filtering syslog logs before indexing: what are the hardware requirements?

WonnyJack
Engager

Hello Community,

I have a distributed environment with 2 indexers (each with 48 vCPUs and 64 GB RAM), and each indexer ingests 200 GB of logs per day.

I want to send each indexer another 200 GB of syslog logs per day, but I want to filter those logs before indexing. Each indexer would index only 10% of the additional 200 GB, so 90% would be discarded.

Could you please tell me what the hardware requirements are for such a setup? I couldn't find any guidance.


PickleRick
Ultra Champion

If you want to filter syslog data, I'd advise adding a separate syslog processing layer. You will most likely need network-level data (such as the source IP) that Splunk cannot easily provide, so you'll need something like rsyslog or SC4S anyway. And if you're going to use that, filtering in the syslog layer is much more straightforward.

Usually such processing does not need much memory (unless you want to do some heavy buffering), and the hardware needed will be highly dependent on how complicated your filtering rules are. I have a 32-CPU machine that does some very simple "receive, enrich and forward" syslog operations, and its load average is usually around 4-5 while running rsyslog for around 700-800 GB/day.

By contrast, the next processing layer, which applies a relatively complicated set of rules to the same volume of data, uses around 15-16 vCPUs. (I run it on three 8-CPU machines to leave some headroom.)
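To make the idea of rule-based filtering in the syslog layer concrete, here is a minimal Python sketch of the "keep what matches, drop the rest" logic. It is not rsyslog or SC4S configuration; the allow-list of source IPs and the keyword pattern are hypothetical, and a real deployment would express equivalent rules in the syslog daemon's own rule language.

```python
import re
from typing import Iterable, Iterator, Tuple

Event = Tuple[str, str]  # (source_ip, message)

# Hypothetical rules: which senders and which message patterns are worth indexing.
ALLOWED_SOURCES = {"10.0.0.5", "10.0.0.6"}
KEEP_PATTERN = re.compile(r"\b(error|denied|failed)\b", re.IGNORECASE)


def keep(source_ip: str, message: str) -> bool:
    """Return True if the event should be forwarded to the indexers."""
    return source_ip in ALLOWED_SOURCES and KEEP_PATTERN.search(message) is not None


def filter_events(events: Iterable[Event]) -> Iterator[Event]:
    """Yield only the events that pass the keep() rule; everything else is dropped."""
    for source_ip, message in events:
        if keep(source_ip, message):
            yield source_ip, message


sample = [
    ("10.0.0.5", "sshd: Failed password for root"),
    ("10.0.0.5", "cron: job started"),
    ("192.168.1.9", "kernel: error reading disk"),
]
kept = list(filter_events(sample))
print(len(kept), "of", len(sample), "events kept")
```

The more rules of this kind each event has to be tested against, the more CPU the filtering layer needs, which is why the rule complexity matters more than memory here.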

gcusello
Legend

Hi @WonnyJack,

First, the number of Indexers is usually calculated assuming a maximum load per Indexer of around 200 GB/day for Splunk Enterprise, or around 150 GB/day if you also run Splunk Enterprise Security, so if you want to add more logs it's better to add at least one more Indexer.
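As a rough illustration of that rule of thumb applied to the numbers in the question, here is a small Python sketch. The function name is hypothetical, and it assumes the per-Indexer guideline is applied to the volume that is actually indexed after filtering; if the filtering itself runs on the Indexers, the discarded 90% still has to be received and parsed there, which this simple calculation does not capture.

```python
import math


def indexers_needed(indexed_gb_per_day: float, gb_per_indexer: float = 200.0) -> int:
    """Indexer count from a per-indexer daily volume guideline (hypothetical helper)."""
    return math.ceil(indexed_gb_per_day / gb_per_indexer)


current_indexed = 2 * 200.0            # GB/day already indexed across both indexers
extra_received = 2 * 200.0             # GB/day of additional syslog sent to the indexers
extra_indexed = extra_received * 0.10  # only 10% of the additional syslog is kept
total_indexed = current_indexed + extra_indexed

print("Indexed volume (GB/day):", total_indexed)
print("Indexers at 200 GB/day each:", indexers_needed(total_indexed))
print("Indexers at 150 GB/day each:", indexers_needed(total_indexed, 150.0))
```

Under these assumptions the indexed volume rises to about 440 GB/day, which exceeds what two Indexers cover at 200 GB/day each and is consistent with the advice to add at least one more.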

Second, the hardware requirements depend on the log volume (as described above) and on the number of users and searches. If you don't have many users, the hardware you are using is probably oversized for typical needs. You can find guidance on reference hardware here: https://docs.splunk.com/Documentation/Splunk/8.2.5/Capacity/Referencehardware

Next, do you want to filter the logs on the Indexers or on Heavy Forwarders (in both cases before indexing)?

If on the Indexers, I think that if you don't have too many users, you could use mid-range Indexers and run without problems.

If instead you want to use Heavy Forwarders (the solution I usually recommend, to keep roles separate), first of all use at least two of them to avoid a single point of failure. For the HFs you can follow the reference hardware for a standalone Splunk server.

Finally, I assume you have an Indexer Cluster, but that doesn't have a big impact on the hardware requirements.

Ciao.

Giuseppe

WonnyJack
Engager

Hello,

Thank you both for your answers. I discussed it internally with my team. Unfortunately, the client declined the idea because of the total cost 😞

BR


gcusello
Legend

Hi @WonnyJack ,

Maybe you should describe the advantages of the new log ingestion to the customer.

You could also point out that adding a new Indexer isn't a great cost compared with the advantages you could gain, including the improved security that comes from analyzing the new data.

Anyway, let me know next time if I can help you further; otherwise, please accept one of the answers for the benefit of other Community members.

Ciao and happy splunking.

Giuseppe

P.S.: Karma Points are appreciated by all the Contributors
