Which queue does INDEXED_EXTRACTIONS? What is the ...

TiagoTLD1 · ‎02-16-2017

Hello,

Which queue does INDEXED_EXTRACTIONS use? What exactly is the name of the key? Is it parsingQueue?

Where can I find the correct syntax for has_key:queue to put in inputs.conf?

muebel · ‎02-16-2017

Hi TiagoTLD1, in addition to the links somesoni2 posted, check out the diagrams here : https://wiki.splunk.com/Community:HowIndexingWorks

INDEXED_EXTRACTIONS is a somewhat special processor that usually runs on universal forwarders to ingest structured data. This is done in the parsing queue. The slides for the 2015 .conf session are here: https://conf.splunk.com/session/2015/conf2015_ABath_JKerai_Splunk_SplunkClassics_HowSplunkdWorks.pdf

Very instructive as well.

The route configuration would be on the receiving end, and is usually needed if you want to "recook" or reparse the data.

As for a specific set of key and queue names, I don't believe they are currently available. The use cases involved usually just change the default route config so that everything goes to parsingQueue, in order to achieve the reparsing on the indexer mentioned earlier.
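For reference, here is a sketch of what that route swap typically looks like in inputs.conf on the receiving HF or indexer. The port 9997 is an assumption, and the default route string in the comment is believed to match inputs.conf.spec for 6.x, so check it against the spec file for your version before copying anything. Treat this as a community workaround rather than a documented, supported configuration.

```ini
# inputs.conf on the receiving instance (HF or indexer)
# Port 9997 is an assumption; use whatever port your receiver listens on.
[splunktcp://9997]
# Default route, believed to match inputs.conf.spec (verify for your version):
#   has_key:_replicationBucketUUID:replicationQueue;has_key:_dstrx:typingQueue;has_key:_linebreaker:indexQueue;absent_key:_linebreaker:parsingQueue
# Variant that also sends already-parsed ("cooked") events back to parsingQueue:
route = has_key:_replicationBucketUUID:replicationQueue;has_key:_dstrx:typingQueue;has_key:_linebreaker:parsingQueue;absent_key:_linebreaker:parsingQueue
```

Each entry has the form has_key:<key>:<queue> or absent_key:<key>:<queue>, separated by semicolons. The only change from the default here is the queue named for the _linebreaker key.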

Please let me know if this helps!

montgomeryam · ‎02-16-2017

From my experience and the Splunk documentation, when you use INDEXED_EXTRACTIONS (or any other structured-data setting), the events bypass the parsing, merging, and typing pipelines and go straight to the indexer via the index pipeline.

I've never used has_key, so I can't help there! Hopefully someone with the know-all will drop by and help out.

TiagoTLD1 · ‎02-16-2017

Thanks for the answer. I am sending data from a UF to a HF (where I want to do the INDEXED_EXTRACTIONS). The thing is that data coming from the UF, when entering the HF, will skip the part of processing where the extractions would be done :S

montgomeryam · ‎02-16-2017

Yes, that is due to it skipping all parsing queues.
If you want to read about it, this is what Splunk Docs has to say:
http://docs.splunk.com/Documentation/Splunk/6.5.2/Data/Extractfieldsfromfileswithstructureddata

The only way to get around that (not technically true), is to do the extractions on the UF directly BEFORE sending it to the HF. That is what I do for issues like that. As long as the server with the UF is able to do the heavy work, you shouldn't run into any problems.

It took me a while to figure out the solution to this problem initially as well, and then I read the link I posted above and it made sense. By using INDEXED_EXTRACTIONS, you are basically using an AutoMagic button that does all of the heavy work for you. In turn, to keep the pipeline as open as possible, it bypasses all other queues.
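A minimal sketch of doing the extraction on the UF, assuming JSON input and a hypothetical sourcetype name my_json_data. The setting has to live in props.conf on the forwarder that first reads the data, since that is where the structured parsing happens.

```ini
# props.conf on the UF (the forwarder that reads the structured data)
# "my_json_data" is a hypothetical sourcetype name.
[my_json_data]
# Supported values include csv, tsv, psv, w3c, and json
INDEXED_EXTRACTIONS = json
```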

TiagoTLD1 · ‎02-16-2017

That is EXACTLY my problem, the UF must not decrease the performance of the machine where it resides. So I am receiving a tcp input (not a splunktcp) into the UF, and I wanted to keep the load of the indexed extractions in the HF only.

Is that really impossible?

montgomeryam · ‎02-16-2017

It is, if you use INDEXED_EXTRACTIONS.

In order to do what you are looking for, you're probably going to have to go old school and set up inputs.conf and props.conf on the UF so that you can use transforms.conf on the heavy forwarder.

Going this way ensures the data hits the parsing queue on the HF, where you can act on it.

From what I can gather, INDEXED_EXTRACTIONS was designed and implemented to be that AutoMagic button to quickly parse structured data instead of having to manually manipulate it through props and transforms.
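A rough sketch of the props/transforms route on the HF, assuming the UF forwards the data unparsed under a hypothetical sourcetype my_tcp_data and that events contain a status=<digits> pair. The stanza names, the field name, and the regex are all made up for illustration.

```ini
# props.conf on the HF
# "my_tcp_data" is a hypothetical sourcetype assigned on the UF.
[my_tcp_data]
TRANSFORMS-extract_status = extract_status_field

# transforms.conf on the HF
[extract_status_field]
# Hypothetical pattern: capture the digits after "status="
REGEX = status=(\d+)
# Write the capture as an indexed field named "status"
FORMAT = status::$1
WRITE_META = true

# fields.conf on the search head, so searches treat "status" as indexed
[status]
INDEXED = true
```

Unlike INDEXED_EXTRACTIONS, this needs one transform per field, but it runs in the regular parsing pipeline, so the work happens on the HF rather than on the UF.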

TiagoTLD1 · ‎02-16-2017

Yeah, the thing is that, according to the diagram at https://wiki.splunk.com/Community:HowIndexingWorks, INDEXED_EXTRACTIONS is not in the parsingQueue but in structuredparsingQ, which appears only on the UF, not on the HF/IDX.

montgomeryam · ‎02-16-2017

Right, and that would be why it bypasses the regular parsing queues and goes to the indexer. The diagram says something to the effect of "Even though the name is 'Parsing', the pipeline in the UF is not for event parsing." I take that to mean that even though the data has to piggyback on the parsingQueue pipeline, it is not truly acted upon by any of the parsing queues. Notice in that same diagram that INDEXED_EXTRACTIONS is absent from the indexer's queuing actions. That would indicate that the data simply moves through those queues untouched, effectively bypassing them.

So what I would do...

Set up your inputs.conf to listen on your tcp input and assign the sourcetype that you specify.

inputs.conf
[default]
host = EX

[tcp://<remote server>:<port>]
disabled = false
sourcetype = EX-1
queue = parsingQueue
index = EX-index

Then on the HF...

Set up props.conf:

[EX-1]
TRANSFORMS-sourcetyping_testlog = test_log

Set up transforms.conf:

[test_log]
# Rewrite the sourcetype when REGEX matches; both values below are placeholders
DEST_KEY = MetaData:Sourcetype
REGEX = (askfjklsadf)
FORMAT = sourcetype::akjfdklsajf

Something like that... hopefully that is enough to give you a gist of what I'm trying to say.

somesoni2 · ‎02-16-2017

See these
http://docs.splunk.com/Documentation/Splunk/6.5.2/Admin/Configurationparametersandthedatapipeline#In...

https://answers.splunk.com/answers/5528/forwarding-select-data-in-my-environment.html
