Getting Data In

How do I get automatic field detection on a forwarded CSV?

bmgilmore
Path Finder

I set up a Splunk instance on a server with a local CSV file that updates once a minute. Using the Add Data wizard, it auto-detected all the appropriate timestamp, metadata and value fields. I then set Splunk to forward to another instance (to test forwarding), and the data forwards fine, but it's all in raw format. I looked for a props.conf file on the original server to see if the wizard created something I could copy over, but no luck.

Also, if you can help with setting this up on the receiver instance, can you also mention whether there is a way to go through all the data that has already been indexed and extract the fields into the indexes?

Sorry, totally new to Splunk, just trying to build a business case and do some due diligence before committing to it as a platform!


lguinn2
Legend

Good news - field extraction is done at search time. This means that you can create fields for data that has already been indexed.

If you selected csv as your sourcetype (under More Settings in Manager » Data inputs » Files & directories » Add new), then Splunk would be doing the field extraction for you. But since you are forwarding the data, you didn't set that up on the forwarder.

Option 1 - Set sourcetype to csv

Here is one way to do this: set the sourcetype of the input to csv manually, on the forwarder, as the data is collected. Edit inputs.conf to tell Splunk the correct sourcetype of the input file. I am using the filename "example.csv" here.

inputs.conf

[monitor:///mydirpath/example.csv]
sourcetype = csv

Important - this will only affect new data. It will not change the sourcetype of data that has already been indexed.

Option 2 - Set field extraction for a sourcetype

And here is another way. This technique assumes that the data has already been indexed, and has been assigned a sourcetype that is not csv. Let's say that you have two csv files: one of the files has sourcetype X and the second file has sourcetype Y.
Create an entry in props.conf and transforms.conf for each type of csv file.

You may need to create the props.conf and transforms.conf files. Put them under $SPLUNK_HOME/etc/system/local ($SPLUNK_HOME is wherever you installed Splunk).

props.conf

[X]
SHOULD_LINEMERGE = false
TRANSFORMS-t01 = csv1-fieldextraction

[Y]
SHOULD_LINEMERGE = false
TRANSFORMS-t02 = csv2-fieldextraction

transforms.conf

[csv1-fieldextraction]
DELIMS=","
FIELDS="User","UID","Session#","CPU","Memory","Status"

[csv2-fieldextraction]
DELIMS=","
FIELDS="PID","PPID","UID","CPU","Memory","CMD"

Now, as you add more data to Splunk, you can continue to use sourcetype X and sourcetype Y, or create new sourcetypes as needed.

helge
Builder

Shouldn't REPORT be used instead of TRANSFORMS, so that the fields are extracted at search time rather than at index time?
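
For reference, a search-time version of the props.conf from the answer might look like the sketch below. It assumes the same sourcetypes X and Y and the same transforms.conf stanza names (csv1-fieldextraction, csv2-fieldextraction) as in the answer; the class names after REPORT- (csv1, csv2) are arbitrary labels picked for illustration. I have not tested this against the poster's data.

```ini
# props.conf (sketch: same stanzas as the answer, with REPORT- in place of TRANSFORMS-)
# REPORT-<class> refers to a transforms.conf stanza that is applied at search time.

[X]
SHOULD_LINEMERGE = false
REPORT-csv1 = csv1-fieldextraction

[Y]
SHOULD_LINEMERGE = false
REPORT-csv2 = csv2-fieldextraction
```

The transforms.conf stanzas can stay as they are, since DELIMS and FIELDS are search-time settings. Because the extraction happens at search time, this configuration belongs on the instance where the searches run (here, the receiving instance), and it also applies to data that has already been indexed under those sourcetypes.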

sideview
SplunkTrust

Right. The csv sourcetype is configured to use CHECK_FOR_HEADER, and that type of configuration generates AutoHeader config that ends up in '$SPLUNK_HOME/etc/apps/learned', where it stays trapped on the forwarder. So the data itself gets forwarded and, modulo the weird "foo-2" renaming that CHECK_FOR_HEADER applies to its sourcetypes, arguably the sourcetypes come across too, but the field extractions do not come across to the indexer.

bmgilmore
Path Finder

Thanks. I was most interested in having this work from scratch, so I uninstalled Splunk on both servers. I set up the first server again and used the wizard with Preview to add the file. Despite making sure, both on the first screen and on more options, that the sourcetype was csv, the data source was saved with a sourcetype of csv-2. I set the primary up to forward to the newly reinstalled secondary, and the data is sent to the secondary server, but it does not break out the fields like on the primary server. Same data source, same data type, yet I have 21 fields on the primary and 17 on the secondary. Thanks!
