
Field extraction from filename with sourcetype csv at index time

tfechner
Path Finder

Hi,

I have several files on a server named like d-*_t-*.csv, e.g. d-edu_t-names.csv.
Each one is an ordinary CSV file with many columns.

We have a universal forwarder installed on this server.

My inputs.conf shows:

[monitor:///opt/log/d-*_t-*.csv]
sourcetype=csv
index=tmp
connection_host=dns

Now I need the values matched by d-(.+) and t-(.+) written into new indexed fields at index time. This is because many more servers log with the same mechanism.
Next, we need a special field extraction and naming for the CSV columns, so I would like to put some extractions in transforms.conf. But how can I do that when we use sourcetype=csv, which is a global stanza?

Which solution has the least impact on the indexer? And how do I put the server name into a field, rather than only having source (the filename)?

0 Karma

somesoni2
Revered Legend

Transforms can be applied by source as well, so you can create a props.conf stanza for your source, [source::/opt/log/d-*_t-*.csv], and add your transforms under it. If you want index-time field extraction (read para #4 of the first section of this page before deciding between index time and search time), it should be set up on your indexer; otherwise, on the search head.
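A minimal sketch of what such a source-based stanza could look like for search-time extraction. The stanza name source_fields_from_filename and the class name sourcefields are made up for illustration, field1 and field2 are placeholder field names, and the regex assumes the d-*_t-*.csv naming from the question. Treat it as a starting point under those assumptions, not a tested configuration.

```ini
# props.conf (search head)
# Matches events by their source path instead of by sourcetype,
# so the global csv sourcetype stanza is left untouched.
[source::/opt/log/d-*_t-*.csv]
REPORT-sourcefields = source_fields_from_filename

# transforms.conf (search head)
# Hypothetical stanza name; it must match the name used in REPORT- above.
[source_fields_from_filename]
# Run the regex against the source field rather than the raw event
SOURCE_KEY = source
# Capture the parts after "d-" and "t-" in the file name
REGEX = d-(.+)_t-(.+)\.csv$
# Search-time FORMAT uses a double colon: <field-name>::<value>
FORMAT = field1::$1 field2::$2
```

With the example file d-edu_t-names.csv, this should yield field1=edu and field2=names, provided the first value never contains the text "_t-".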

0 Karma

tfechner
Path Finder

OK, I investigated further and the extraction can be done at search time.
I found a solution for extracting fields from file names, but is it better to use the props.conf or the transforms.conf extraction method?

In transforms.conf I would use:
SOURCE_KEY = source
REGEX = d-(.+)_t-(.+)\.csv
FORMAT = field1::$1 field2::$2

or in props.conf:
EXTRACT-sourcefields = d-(?<field1>.+)_t-(?<field2>.+)\.csv in source

0 Karma

niketn
Legend

@tfechner it is a trade-off: extracting fields at index time puts more load on the indexer or heavy forwarder, while extracting them at search time puts the load on the search head. Since the fields are extracted from source, they feel like metadata, so I would say you can use transforms.conf together with props.conf for index-time field extraction. But do check whether your regex adds too much load during indexing and slows the indexing rate.
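For comparison, a rough sketch of the index-time variant described above. The stanza name source_fields_indexed and the class name sourcefields are hypothetical, field1 and field2 are placeholder names, and the regex again assumes the d-*_t-*.csv naming. It is meant to show which settings are involved, not a verified setup for this environment.

```ini
# props.conf (indexer or heavy forwarder)
[source::/opt/log/d-*_t-*.csv]
# TRANSFORMS- (not REPORT-) makes this an index-time transform
TRANSFORMS-sourcefields = source_fields_indexed

# transforms.conf (indexer or heavy forwarder)
# Hypothetical stanza name; it must match the name used in TRANSFORMS- above.
[source_fields_indexed]
# At index time the source is addressed through the MetaData:Source key
SOURCE_KEY = MetaData:Source
REGEX = d-(.+)_t-(.+)\.csv$
FORMAT = field1::$1 field2::$2
# Write the extracted pairs into the index as indexed fields
WRITE_META = true

# fields.conf (search head)
# Tells the search head that these fields are stored in the index
[field1]
INDEXED = true

[field2]
INDEXED = true
```

Because every event passes through the regex while it is being indexed, this is the variant where the indexing rate should be watched.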

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
0 Karma

tfechner
Path Finder

I do not need index-time fields any more; I got the missing information, so search time is enough. So I will start with props/transforms. And a correction: the source key should be "SOURCE_KEY = MetaData:Source".

0 Karma