Getting Data In

Indexed extractions and data filtering

PickleRick
Ultra Champion

I'm getting a bit confused about onboarding "csv" files.

The files are _mostly_ csv: they have a header with field names and comma-delimited fields, but they also have a kind of footer consisting of a line full of dashes followed by a line with "Total: number" in it.

With a "normal" input I'd just set a props/transforms pair on the HF to route those lines to the nullQueue and be done with it. I'm not sure, though, how that works with indexed extractions after reading https://docs.splunk.com/Documentation/Splunk/8.2.4/Data/Extractfieldsfromfileswithstructureddata#Cav...

Can I simply do transforms for my sourcetype just as with any other sourcetype?
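
For reference, the "normal" nullQueue approach mentioned above might look like the sketch below. The transform name `mycsv_drop_footer` is hypothetical, and the regex only assumes the footer shape described in this post (a line of dashes, then a line containing "Total: number"). Whether it takes effect on an HF for data that a UF has already extracted with INDEXED_EXTRACTIONS is exactly the open question here, so treat it as something to test rather than a confirmed fix.

```ini
# props.conf (on the instance that parses the data)
[mycsv]
TRANSFORMS-drop_footer = mycsv_drop_footer

# transforms.conf (same instance)
# mycsv_drop_footer is a hypothetical name used for illustration
[mycsv_drop_footer]
# Assumed footer shape: a line made only of dashes, or a line containing "Total: <number>"
REGEX = ^(?:-+\s*|.*Total:\s*\d+.*)$
DEST_KEY = queue
FORMAT = nullQueue
```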

And the other question is - the props.conf that I generated from my stand-alone instance that seems to parse the file properly looks like this:

[mycsv]
SHOULD_LINEMERGE=false
LINE_BREAKER=([\r\n]+)
NO_BINARY_CHECK=true
CHARSET=UTF-8
INDEXED_EXTRACTIONS=csv
KV_MODE=none
category=Structured
disabled=false
pulldown_type=true
TIME_FORMAT=%s
TIMESTAMP_FIELDS=Time
HEADER_FIELD_LINE_NUMBER=1

In production, however, the file will be read by a UF, then the data will be sent to an HF and on to the indexers. Do I put all of those settings into props.conf on the UF or on the HF? Or do I split them between the two?

I must admit that this whole indexed extraction thing is tricky and IMHO not described well enough.

richgalloway
SplunkTrust

Any time you find Splunk's documentation to be lacking, submit feedback on that page.  The Docs team is great about updating the pages in response to user feedback.

When in doubt, put props.conf files on each instance in the data path: UF, HF, indexer, and SH. Each will use what it needs and ignore the rest. In this case, however, the settings are needed on the UF, because with INDEXED_EXTRACTIONS the structured data is parsed by the forwarder that reads the file.
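
As a sketch of how the stanza from the question might be split by where each setting is used: this assumes, per the structured-data documentation linked in the question, that INDEXED_EXTRACTIONS settings are applied on the forwarder monitoring the file, and that KV_MODE, being a search-time setting, matters only on the search head. The split is illustrative, not a tested deployment.

```ini
# props.conf on the UF that monitors the file (assumed placement)
[mycsv]
INDEXED_EXTRACTIONS = csv
HEADER_FIELD_LINE_NUMBER = 1
TIMESTAMP_FIELDS = Time
TIME_FORMAT = %s
CHARSET = UTF-8
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)

# props.conf on the search head (KV_MODE is a search-time setting)
[mycsv]
KV_MODE = none
```

Deploying the full stanza to every instance also works, since each one ignores the settings it does not use.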

---
If this reply helps you, Karma would be appreciated.

PickleRick
Ultra Champion

Indeed, I went the easy way and pushed the config to both the UF and the HFs, so regardless of where the settings are actually needed, they take effect 🙂

isoutamo
SplunkTrust

You could try to figure out the correct places from these docs.

With these you should manage it correctly in most cases 😉

r. Ismo
