Indexed extractions and data filtering

PickleRick
SplunkTrust

I'm getting a bit confused about onboarding "csv" files.

The files are _mostly_ csv - they have a header with field names, they have comma-delimited fields, but they also have a kind of a footer consisting of a line full of dashes followed by a line with "Total: number" in it.

With "normal" input I'd just set up the usual props/transforms on the HF to route those lines to nullQueue and be done with it. I'm not sure, though, how that works with indexed extractions after reading https://docs.splunk.com/Documentation/Splunk/8.2.4/Data/Extractfieldsfromfileswithstructureddata#Cav...

Can I simply do transforms for my sourcetype just as with any other sourcetype?
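
For reference, the "normal" nullQueue routing I have in mind would look roughly like the sketch below. The transform name mycsv_drop_footer and the regex are my own assumptions about the footer (a line of dashes, then a "Total: number" line), and whether this takes effect at all when INDEXED_EXTRACTIONS is in play is exactly what I'm unsure about.

```
# props.conf (sketch; stanza name matches the sourcetype)
[mycsv]
TRANSFORMS-drop_footer = mycsv_drop_footer

# transforms.conf (sketch; the transform name is hypothetical)
[mycsv_drop_footer]
# Assumed footer shape: a line of only dashes, or "Total: <number>"
REGEX = ^(?:-{3,}|Total:\s*\d+)\s*$
DEST_KEY = queue
FORMAT = nullQueue
```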

And the other question is - the props.conf that I generated from my stand-alone instance that seems to parse the file properly looks like this:

[mycsv]
SHOULD_LINEMERGE=false
LINE_BREAKER=([\r\n]+)
NO_BINARY_CHECK=true
CHARSET=UTF-8
INDEXED_EXTRACTIONS=csv
KV_MODE=none
category=Structured
disabled=false
pulldown_type=true
TIME_FORMAT=%s
TIMESTAMP_FIELDS=Time
HEADER_FIELD_LINE_NUMBER=1

But in the production environment the file will be read by a UF, then the data will be sent to an HF and then on to the indexers. Do I put all those settings into props.conf on the UF or on the HF? Or do I split them between the two?

I must admit that this whole indexed extraction thing is tricky and IMHO not described well enough.

richgalloway
SplunkTrust

Any time you find Splunk's documentation to be lacking, submit feedback on that page.  The Docs team is great about updating the pages in response to user feedback.

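When in doubt, put props.conf files on each instance in the data path: UF, HF, indexer, and SH. Each will use what it needs and ignore the rest. In this case, however, the settings matter most on the UF: with INDEXED_EXTRACTIONS, structured data is parsed on the forwarder that reads the file, and the HF and indexers do not parse it again when it arrives.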
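
As a rough sketch of the forwarder-side pieces, assuming a hypothetical monitored path and index (neither comes from the original post), the UF would carry an input plus the same sourcetype stanza. The identical props.conf stanza can be copied to the other instances as well, per the "when in doubt" advice above.

```
# inputs.conf on the UF (sketch; path and index are hypothetical)
[monitor:///opt/data/reports/*.csv]
sourcetype = mycsv
index = main

# props.conf on the UF (structured-data settings taken from the question)
[mycsv]
INDEXED_EXTRACTIONS = csv
HEADER_FIELD_LINE_NUMBER = 1
TIMESTAMP_FIELDS = Time
TIME_FORMAT = %s
```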

PickleRick
SplunkTrust

Indeed, I went the easy way and pushed the config to both the UF and the HFs, so regardless of where the settings are supposed to take effect, they do 🙂

isoutamo
SplunkTrust

You could try to figure out the correct places from these docs.

With these you should manage it correctly in most cases 😉

r. Ismo
