topic Re: Indexed extractions and data filtering in Getting Data In

Indexed extractions and data filtering

PickleRick — Tue, 04 Jan 2022 13:47:18 GMT

I'm getting a bit confused about onboarding "csv" files.

The files are _mostly_ CSV: they have a header with field names and comma-delimited fields, but they also have a kind of footer consisting of a line full of dashes followed by a line with "Total: number" in it.

With "normal" input I'd just set up the usual props/transforms on the HF to route those lines to nullQueue and be done with it. I'm not sure, though, how that works with indexed extractions after reading https://docs.splunk.com/Documentation/Splunk/8.2.4/Data/Extractfieldsfromfileswithstructureddata#Caveats_to_extracting_fields_from_structured_data_files

Can I simply do transforms for my sourcetype just as with any other sourcetype?
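
For reference, the usual filtering setup mentioned above might look like the sketch below. The transform name drop_csv_footer is made up for illustration, and the regex assumes the footer is exactly a line of dashes followed by a line of the form "Total: <digits>". Whether this takes effect for a sourcetype that uses INDEXED_EXTRACTIONS depends on which instance actually parses the data, which is the open question here.

```
# props.conf (on the instance that parses the data)
[mycsv]
TRANSFORMS-drop_footer = drop_csv_footer

# transforms.conf (same instance)
[drop_csv_footer]
# Assumed footer shape: a line of dashes, or "Total: <digits>"
REGEX = ^(?:-+|Total:\s*\d+)\s*$
DEST_KEY = queue
FORMAT = nullQueue
```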

And the other question is - the props.conf that I generated from my stand-alone instance that seems to parse the file properly looks like this:

[mycsv]
SHOULD_LINEMERGE=false
LINE_BREAKER=([\r\n]+)
NO_BINARY_CHECK=true
CHARSET=UTF-8
INDEXED_EXTRACTIONS=csv
KV_MODE=none
category=Structured
disabled=false
pulldown_type=true
TIME_FORMAT=%s
TIMESTAMP_FIELDS=Time
HEADER_FIELD_LINE_NUMBER=1

But in the production environment the file will be read by a UF, then the data will be sent to an HF and then on to the indexers. Do I put all those settings into props.conf on the UF or on the HF? Or do I split them between the two?

I must admit that this whole indexed extraction thing is tricky and IMHO not described well enough.

Re: Indexed extractions and data filtering

richgalloway — Tue, 04 Jan 2022 15:54:04 GMT

Any time you find Splunk's documentation to be lacking, submit feedback on that page. The Docs team is great about updating the pages in response to user feedback.

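When in doubt, put props.conf on each instance in the data path: UF, HF, indexer, and SH. Each will use what it needs and ignore the rest. Normally only the HF would need parsing settings, as that is where parsing is done, but INDEXED_EXTRACTIONS is the exception: per the Caveats section of the docs page you linked, structured data is parsed on the forwarder that monitors the file and is not parsed again downstream, so the UF needs these settings.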
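
As a rough illustration of how the stanza from the question could be split, here is a minimal sketch. It assumes, following the Caveats section linked in the question, that structured-data parsing happens on the forwarder that monitors the file and that forwarded structured data is not parsed again downstream. The monitor path is a placeholder, not something taken from the question, and the descriptive settings from the original stanza (category, pulldown_type, disabled) are left out for brevity.

```
# inputs.conf on the UF (path is a hypothetical placeholder)
[monitor:///path/to/csv/files]
sourcetype = mycsv

# props.conf on the UF: structured-data parsing and timestamping
[mycsv]
INDEXED_EXTRACTIONS = csv
HEADER_FIELD_LINE_NUMBER = 1
TIMESTAMP_FIELDS = Time
TIME_FORMAT = %s
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)
NO_BINARY_CHECK = true
CHARSET = UTF-8

# props.conf on the search head: search-time setting only
[mycsv]
# Avoid extracting the same fields a second time at search time
KV_MODE = none
```

Deploying the full stanza everywhere, as suggested above, should end up with the same effective result, since each instance ignores the settings it does not use.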

Re: Indexed extractions and data filtering

isoutamo — Tue, 04 Jan 2022 16:00:21 GMT

You could try to figure out the correct places from these docs

With these you should be able to manage it correctly in most cases 😉

r. Ismo

Re: Indexed extractions and data filtering

PickleRick — Wed, 12 Jan 2022 07:31:28 GMT

Indeed, I went the easy way and pushed the config to both the UF and the HFs, so regardless of where the settings actually take effect, they do 🙂