Process of Indexed Extraction Configuration

mrstrozy · ‎03-05-2019

Hi All!

I'm currently running into a very weird situation with a Splunk instance I inherited. I set up the props.conf through the UI on my dev instance by indexing a small number of events and then using the UI to parse the data, which generated the props.conf. I should mention that my dev instance is a single host.

I then transferred the props.conf to our test environment, which consists of 1 forwarder, 2 indexers (in a "fake" cluster, since there are fewer than 3 indexers), 1 master, and 3 search heads in a search head cluster. Just like on my dev instance, everything worked in test: the fields showed up when searching on a search head.

Finally, I transferred this same props.conf to the prod instance, which consists of 3 forwarders, 4 indexers in an indexer cluster, 1 master, and 5 search heads in a search head cluster. In this environment, none of the fields are extracted the way they were in test/dev, although the events are still being parsed correctly as JSON. The fields I'm currently getting back are these:

I've exhausted everything I know about how the configuration/field extraction is determined and I still can't figure it out. I'm sure there's something I'm missing, and given that it's an instance that I've inherited I figured I'd post something here to see what this wonderful community could come up with. Here is a snippet from my props.conf which is pretty much how most of the sourcetypes are configured:

This props.conf lives only on the indexers (as far as I know) and I didn't find any other props.conf files on the search heads (in $SPLUNK_HOME/etc/system/local).

Any help is greatly appreciated.

Thanks!

woodcock · ‎03-05-2019

You have not deployed the props.conf configurations to the correct place. Unlike every other indexing-related configuration, which should be deployed on the first full instance of Splunk that receives the data (either the HF or Indexer tier), the INDEXED_EXTRACTIONS configuration must be deployed to the forwarder: the server that holds the files and has the inputs.conf that is set to pull in the JSON. So send it to your forwarder tier, restart all Splunk instances there, and it will work when you forward new data in.
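
As a rough sketch of what that forwarder-side stanza might look like (the sourcetype name my_json is a placeholder, not something from the original post; it has to match the sourcetype assigned in the forwarder's inputs.conf):

```
# props.conf on the forwarder that reads the JSON files
# [my_json] is a hypothetical sourcetype name used only for illustration
[my_json]
INDEXED_EXTRACTIONS = json
```

Only data forwarded after the restart picks up the setting; events that are already indexed are not re-parsed.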

mrstrozy · ‎03-06-2019

Also I've tried this and it did not fix my issue.

kundeng · ‎04-24-2020

Did you fix this issue? Does HEC support explicit indexed_extractions of CSV or JSON files when it is set to raw mode?

mrstrozy · ‎03-06-2019

I should've mentioned that we are using the HEC on the forwarders to transfer the data to the indexers. Does that change anything about what you suggested? I'm also still confused about how the configuration works in the test environment without the props.conf on the forwarder.

woodcock · ‎03-06-2019

If you are using HEC, then you are not using INDEXED_EXTRACTIONS. Which is it?

mrstrozy · ‎03-06-2019

Correct me if I'm wrong but I'm not sure why both have to be separate? I was under the impression it went like this:

host -- (through HEC) --> Forwarders ----> Indexers

How would this affect the extractions at all?

woodcock · ‎03-06-2019

That is an insane configuration. Typically HEC runs directly on the Indexers. If this is really your architecture, you need a reboot.

mrstrozy · ‎03-06-2019

I agree the architecture needs to be rethought but according to this article, the HEC can run on either the forwarders or the indexers so I'm not really sure what you're getting at - https://docs.splunk.com/Documentation/Splunk/7.2.4/Data/ScaleHTTPEventCollector

mrstrozy · ‎03-06-2019

I see what you're saying. Yeah, after inheriting this instance my goal was to rethink and re-architect everything, but I was swamped with other things and have limited knowledge, so I'm all for learning the best practice. You would suggest having the HEC on the indexer cluster, keeping the props.conf there, and going from there?

woodcock · ‎03-06-2019

Yes, definitely. This is 100% upside (no downside).
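
For anyone following along, here is a sketch of what a HEC input defined on the indexers could look like. The token stanza name, index, and sourcetype below are placeholders chosen for illustration, and the token value is deliberately left unfilled:

```
# inputs.conf on each indexer
# (in an indexer cluster this would normally be distributed from the master
# rather than edited by hand on each peer)
[http]
disabled = 0
port = 8088

# "my_hec_token" is a hypothetical token name
[http://my_hec_token]
disabled = 0
token = <your-generated-token-GUID>
index = main
sourcetype = my_json
```

The sourcetype set here is what the matching props.conf stanza on the indexers would key on.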

mrstrozy · ‎03-06-2019

Awesome, thanks for your help woodcock, it's greatly appreciated.

woodcock · ‎03-06-2019

You have created a completely unnecessary bottleneck with your Intermediate Forwarder tier. What is worse, apparently you are writing HEC to disk there so that you can do INDEXED_EXTRACTIONS, which is nuts because it defeats the primary benefit of HEC: diskless I/O.

tsaikumar009 · ‎03-05-2019

We need to put INDEXED_EXTRACTIONS = json in props.conf on the indexers and KV_MODE = none in props.conf on the search heads.

OR

Try putting KV_MODE = json in props.conf on the search heads and removing all other JSON-related settings from props.conf.
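
A minimal sketch of the first option, assuming a placeholder sourcetype called my_json (substitute whatever sourcetype your inputs actually assign):

```
# props.conf where the data is parsed at index time
# [my_json] is a hypothetical sourcetype name
[my_json]
INDEXED_EXTRACTIONS = json

# props.conf on the search heads
[my_json]
# Turn off search-time JSON extraction so fields that are already
# indexed are not extracted a second time
KV_MODE = none
AUTO_KV_JSON = false
```

If search-time JSON extraction stays enabled alongside indexed extractions, the same field can show up with duplicated values in search results.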

mrstrozy · ‎03-05-2019

As for the second solution, we want indexed extractions, not search-time extractions on the search heads, which I believe is what that would give us.

mrstrozy · ‎03-05-2019

Why would it work correctly on my test environment (no props.conf on the search heads) then?
