Getting Data In

Can I parse some data on one heavy forwarder and route to another heavy forwarder for filtering to split CPU load?

meenal901
Communicator

Hi,

My Heavy Forwarder filters data at the host level and sends it to the Indexer, but it is using a lot of CPU. Can I split the patterns into two levels, so that a few patterns are applied at the host (HF), and add one more layer of HF to further filter the data sent by the host before it is indexed?

Thanks,
Meenal

0 Karma

merp96
Path Finder

Hi,

You could filter some data on Heavy Forwarder 1, send the filtered output to the intermediate Heavy Forwarder, and add the "route" setting to the inputs.conf of the intermediate HF so it re-parses the data, which lets you filter it further there.

[splunktcp://11111]
route=has_key:_utf8:parsingQueue;has_key:_linebreaker:parsingQueue;absent_key:_utf8:parsingQueue;absent_key:_linebreaker:parsingQueue;

The "route" setting causes the intermediate HF to re-parse the cooked data, which then allows you to apply further filtering on the intermediate HF.
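For the second-level filtering itself, the usual approach is to route unwanted events to the nullQueue in props.conf/transforms.conf on the intermediate HF. A minimal sketch (the sourcetype name and the REGEX here are placeholders for your own patterns):

```ini
# props.conf on the intermediate HF
[my_sourcetype]
TRANSFORMS-null = setnull

# transforms.conf on the intermediate HF
# Events matching the regex are sent to the nullQueue, i.e. discarded.
[setnull]
REGEX = DEBUG|heartbeat
DEST_KEY = queue
FORMAT = nullQueue
```

The equivalent stanzas on the host-level HFs handle the first 50% of the patterns, so each tier only evaluates its own share of the regexes.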

meenal901
Communicator

Adding more information:

I have a setup with 30 production hosts, 1 intermediate heavy forwarder, and 1 indexer. Initially I had Universal Forwarders on the hosts, and the events were sent to the heavy forwarder for filtering. Now the client wants filtering to be done at the host level itself, since we don't want to send unwanted data over the network.

But with Heavy Forwarders on the hosts, there is a chance that CPU usage will shoot up. So they proposed a solution where we filter out, say, 50% of the data on the host using heavy forwarders, and the rest is filtered out at the intermediate layer (80% of the data is filtered out in total).

The only problem is, I am not sure whether cooked data sent by the Heavy Forwarder on the host can be read and filtered again by the intermediate heavy forwarder for second-level filtering.

The filtering is the main use-case for us.

Hope this can help someone help me 🙂

Thanks
Meenal

0 Karma

MuS
SplunkTrust
SplunkTrust

Hi, take a look at this answer: http://answers.splunk.com/answers/168491/routing-data-to-index-using-sourcetype.html#comment-168793 — it is possible, BUT also be aware of the comment made by @jrodman!

Why don't you take a different approach: instead of filtering out the unwanted stuff, only pick up the needed stuff?
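One common pattern for this "keep only what you need" approach is to discard everything by default and then route only the matching events back to the index queue. A sketch (the source path and the `login` regex are placeholders for your own input and keep-patterns):

```ini
# props.conf
# Transforms run in order; a later transform overrides an earlier one
# for events that match both.
[source::/var/log/messages]
TRANSFORMS-set = setnull, setparsing

# transforms.conf
# First, send every event to the nullQueue (discard all)...
[setnull]
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue

# ...then route events you want to keep back to the indexQueue.
[setparsing]
REGEX = login
DEST_KEY = queue
FORMAT = indexQueue
```

This way you only need to write regexes for the events you want, rather than negated regexes for everything you don't.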

meenal901
Communicator

Thanks for the reply.
I only know the regexes for the events to ignore, not the ones to select 😞 and a negated-regex approach becomes expensive to process.

0 Karma

MuS
SplunkTrust
SplunkTrust

Think about the logs themselves: is it possible to change the logging so that only the needed events are written in the first place?

0 Karma

meenal901
Communicator

Hi,

Do we know in which version Splunk will shut down the option of re-parsing? I can see it is still available in 6.2.

The above problem poses a risk to my project. CPU is 100% utilized with a Heavy Forwarder on the hosts, and a Universal Forwarder eats up the network. I really need a middle way.

Thanks,
Meenal

0 Karma


nnmiller
SplunkTrust
SplunkTrust

There are HUGE caveats to the key-routing method for re-parsing data:

a) it is unsupported (don't bother opening a case about it)
b) it is untested (e.g. it has worked in the past, but it is not QA'd, so it could break at any point)
c) it applies to ALL inputs on the system you set it up on, regardless of the stanza it is applied in, so you are essentially relegating that intermediate forwarder (IF) to a single input.

0 Karma