Getting Data In

Can I parse some data on one heavy forwarder and route to another heavy forwarder for filtering to split CPU load?

Communicator

Hi,

My Heavy Forwarder filters data at the host level and sends it to the Indexer, but it is using a lot of CPU. Can I split the patterns across two levels, so that a few patterns are applied on the host (HF) and one more layer of HF further filters the data sent by the host before it is indexed?

Thanks,
Meenal

0 Karma

Path Finder

Hi,

You could filter some data on Heavy Forwarder1, send the filtered output to the intermediate Heavy Forwarder, and add the "route" setting to the splunktcp stanza in inputs.conf on the intermediate HF so that the data is re-parsed there. You could then filter the data further.

[splunktcp://11111]
route=has_key:_utf8:parsingQueue;has_key:_linebreaker:parsingQueue;absent_key:_utf8:parsingQueue;absent_key:_linebreaker:parsingQueue;

The "route" will cause re-parsing in Intermediate HF and thereby allow you to filter in Intermediate HF.
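A rough, untested sketch of how the rest of the setup might look, in case it helps. Only port 11111 comes from the stanza above; the output group name, host name, sourcetype, transform name and regex are placeholders made up for illustration. Heavy Forwarder1 forwards to the intermediate HF through outputs.conf, and the intermediate HF sends matching events to nullQueue once they have been re-parsed:

```ini
# --- outputs.conf on Heavy Forwarder1 (the host) ---
# "intermediate_hf" and the server name are hypothetical placeholders.
[tcpout]
defaultGroup = intermediate_hf

[tcpout:intermediate_hf]
server = intermediate-hf.example.com:11111

# --- props.conf on the intermediate HF ---
# "my_sourcetype" is a placeholder for the sourcetype you want to filter.
[my_sourcetype]
TRANSFORMS-null = drop_unwanted

# --- transforms.conf on the intermediate HF ---
# Events matching REGEX are sent to nullQueue, i.e. discarded.
# The pattern is only an example; replace it with your second-level patterns.
[drop_unwanted]
REGEX = DEBUG|TRACE
DEST_KEY = queue
FORMAT = nullQueue
```

The first-level patterns would stay in props.conf/transforms.conf on Heavy Forwarder1 in the same form, so each layer only evaluates its own share of the regexes.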

Communicator

Adding more information:

I have a setup with 30 production hosts, 1 intermediate heavy forwarder and 1 indexer. Initially I had Universal Forwarders on the hosts, and the events were sent to the heavy forwarder for filtering. Now the client wants the filtering to be done at the host level itself, since we don't want to send unwanted data over the network.

But with Heavy Forwarders on the hosts, there is a chance that CPU usage will shoot up. So they proposed that we filter out, say, 50% of the data on the host using heavy forwarders, and filter out the rest at the intermediate layer. (In total, 80% of the data is filtered out.)

The only problem is, I am not sure whether cooked data sent by the Heavy Forwarder on the host can be read and filtered again by the intermediate heavy forwarder for second-level filtering.

The filtering is the main use-case for us.

Hope this can help someone help me 🙂

Thanks
Meenal

0 Karma

SplunkTrust

Hi, take a look at this answer: http://answers.splunk.com/answers/168491/routing-data-to-index-using-sourcetype.html#comment-168793. It is possible, BUT also be aware of the comment made by @jrodman!

Why don't you take a different approach: instead of filtering out the unwanted stuff, only pick up the needed stuff?
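For illustration, this "keep only what you need" approach is usually written as two transforms applied in order: the first sends everything to nullQueue, the second routes the wanted events back to indexQueue. A minimal sketch, where the sourcetype, transform names and the regex are assumptions and not taken from your environment:

```ini
# --- props.conf ---
# "my_sourcetype" is a hypothetical placeholder.
# Order matters: discard everything first, then rescue the wanted events.
[my_sourcetype]
TRANSFORMS-set = setnull, setparsing

# --- transforms.conf ---
# Match every event and send it to nullQueue (discard).
[setnull]
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue

# Events matching this example pattern are routed back to indexQueue (kept).
[setparsing]
REGEX = ERROR|WARN
DEST_KEY = queue
FORMAT = indexQueue
```

This only pays off if a small number of "keep" patterns can describe the wanted events; otherwise the regex cost just moves from one list to the other.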

Communicator

Thanks for the reply.
I only know the regexes for the events to ignore, not for the ones to select 😞, and the NOT-regex approach becomes expensive to process.

0 Karma

SplunkTrust

Think about the logs themselves: is it possible to change the logging in such a manner that only the needed stuff is in there?

0 Karma

Communicator

Hi,

Do we know in which version Splunk will remove the re-parsing option? I can see it's available up to 6.2.

The above problem poses a threat to my project. CPU is 100% utilized with a Heavy Forwarder on the hosts, and the Universal Forwarder eats up the network. I really need a middle way.

Thanks,
Meenal

0 Karma

Contributor

There are HUGE caveats to the key routing method for re-parsing data:

a) unsupported (don't bother opening a case about it)
b) untested (e.g. it has worked in the past but is not QA'd, so it could break at any point)
c) applies to ALL inputs on the system you set it up on, regardless of the stanza it's applied in, so you are essentially relegating an intermediate forwarder (IF) to a single input.

0 Karma