Search time field extractions of structured data in csv format

kiril123 · ‎05-05-2018

Hello,

I am indexing data that arrives in CSV format.
I am using search-time field extraction and have specified the list of fields in transforms.conf.
What will happen if a new column gets added to the CSV file, or if the order of the columns changes? I can modify the field list in transforms.conf, but the new transform would not work for CSV data indexed before the column order changed.

What is the best method for extracting fields from CSV files, assuming the order of columns can change in the future?

Thank you.

woodcock · ‎05-05-2018

The best that you can do is WATCH for it, then fix it. Here is what you do: in every CSV RegEx, add (?:,(?<FIXME_EXPANSION>[^,]+))?. Then have a search for FIXME_EXPANSION=* that runs all the time and emails you if the result count is ever non-zero.
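
To see why the optional trailing group works as a tripwire, here is a minimal sketch that runs outside Splunk. It assumes a hypothetical three-column layout (host, status, bytes); those field names and the sample lines are made up for illustration. Python's re module spells named groups as (?P<name>...), whereas the PCRE syntax used in Splunk accepts (?<name>...).

```python
import re

# Hypothetical layout: host,status,bytes (illustrative names, not from the thread).
# The last group is the optional sentinel: it only captures when an extra,
# unexpected column follows the known ones.
CSV_PATTERN = re.compile(
    r"^(?P<host>[^,]*),(?P<status>[^,]*),(?P<bytes>[^,]*)"
    r"(?:,(?P<FIXME_EXPANSION>[^,]+))?"
)

def extract(line):
    """Return the named fields found in one CSV line, dropping unmatched groups."""
    match = CSV_PATTERN.match(line)
    if match is None:
        return {}
    return {name: value for name, value in match.groupdict().items() if value is not None}

# An event in the original format: the sentinel field is absent.
old_event = extract("web01,200,512")
# An event with one appended column: the sentinel field is populated.
new_event = extract("web01,200,512,eu-west")

print(old_event)
print(new_event)
```

Note that this only flags columns appended after the known ones. If columns are reordered without the count changing, the sentinel stays empty and the values are simply assigned to the wrong field names.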

xpac · ‎05-05-2018

For CSV-like data, DELIMS work pretty well. Take a look at this for example:
https://www.splunk.com/blog/2013/03/11/quick-n-dirty-delimited-data-sourcetypes-and-you.html

However, if your data changes its format, that might be problematic. If the new column gets appended last, simply defining more fields in your transforms might be enough.
Basically, when your data changes its format, you should ingest it with a different custom sourcetype that fits your data. 😉
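
The difference between an appended column and a reordered one can be illustrated with a small simulation of positional (delimiter-based) extraction. This is only a sketch of the idea, not Splunk's implementation: the helper extract_by_position, the field names, and the sample lines are all hypothetical, and the naive split does not handle quoted commas.

```python
def extract_by_position(line, fields, delim=","):
    """Pair each value with the field name at the same position.

    Simplified stand-in for a delimiter plus field-list extraction;
    extra values or extra field names are silently ignored by zip().
    """
    values = line.split(delim)
    return dict(zip(fields, values))

OLD_FIELDS = ["host", "status", "bytes"]
NEW_FIELDS = OLD_FIELDS + ["region"]  # new column appended last

old_line = "web01,200,512"
appended_line = "web01,200,512,eu-west"
reordered_line = "200,web01,512"  # first two columns swapped

# The extended field list still fits events in the old format.
old_with_new_list = extract_by_position(old_line, NEW_FIELDS)
# It also picks up the appended column on new events.
new_with_new_list = extract_by_position(appended_line, NEW_FIELDS)
# A reordered layout is mislabeled: no single field list fits both formats.
mislabeled = extract_by_position(reordered_line, OLD_FIELDS)

print(old_with_new_list)
print(new_with_new_list)
print(mislabeled)
```

In this simulation one extended field list covers both old and new events when the column is appended, while a reorder puts values under the wrong names, which is the case where a separate sourcetype for the new layout becomes necessary.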

kiril123 · ‎05-05-2018

Thank you for your answer. If I modify the sourcetype to fit the new data format, then I won't be able to search the data in the previous format properly, unless I can apply multiple sourcetypes depending on the time range the data is stored for.
