Splunk Search

Handling a large number of CSV files as input: renaming the sourcetype and specifying the header

dominiquevocat
Motivator

I have a monitored folder on a Splunk server, with a subfolder for each specific type of information, into which scripts place their input files.

I have a folder \Bluescreen where I place an extract, in the form of a CSV, containing all the crash information of a Windows client machine. Every file follows a naming convention.

At search time I can extract the asset number from the filename, which contains the hostname of the machine, which in turn contains the asset number of the workstation. I can use a saved search to find all I need, but I would prefer to have all the CSV files use a fields directive in props.conf, and hence need to treat all these files as the same sourcetype.

Additionally, I would like to automatically supply additional information, such as the aforementioned "Asset" ID, in order to use it in workflow actions, for example linking to the asset management web interface to show the asset management information.

I have modified c:\program files\splunk\etc\system\local\props.conf as follows:

[source::..._bluescreen.csv]
rename=Bluescreen
EXTRACT-Asset=(?<Asset>8[0-9, aA-zZ]{4})
fields="Dump File","Crash Time","Bug Check String","Bug Check Code","Parameter 1","Parameter 2","Parameter 3","Parameter 4","Caused By Driver","Caused By Address","File Description","Product Name","Company","File Version","Processor","Computer Name","Full Path","Processors Count","Major Version","Minor Version"

I have two issues. First, the sourcetype is still csv-xxx. Second, I do not know how to make the EXTRACT run against the source (file name) field instead of the event text.

So far I help myself in the search with the following query:

eventtype="Bluescreen" | rex field=source "(?<Asset>8[0-9, aA-zZ]{4})"

(Yes, assets are 5 characters wide and start with an 8; this was the cheapest regex to accomplish that, and I am open to improving it.) The eventtype "Bluescreen" is defined as *_bluescreen.csv; there it works nicely, and I could define it in the web UI.

Please illuminate me as to my shortcomings and how I could reach my goal: all files in this folder that follow this general form should be treated as sourcetype="Bluescreen", allowing for automatic extraction of the field Asset.
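On improving the regex: the character class `[0-9, aA-zZ]` is looser than it looks, because the range `A-z` also covers the ASCII punctuation between `Z` and `a` (including `_` and `\`), and the comma and space are matched literally. The sketch below compares it with a tightened variant in Python. It assumes asset IDs are an 8 followed by exactly four ASCII letters or digits; the file paths are hypothetical. Python spells named groups `(?P<Asset>...)`, whereas the Splunk configuration keeps `(?<Asset>...)`.

```python
import re

# Original pattern, rewritten with Python's (?P<name>...) group syntax.
LOOSE = re.compile(r"(?P<Asset>8[0-9, aA-zZ]{4})")
# Tightened variant (assumption: an 8 followed by four ASCII letters or digits).
STRICT = re.compile(r"(?P<Asset>8[0-9A-Za-z]{4})")


def extract_asset(pattern, source):
    """Return the first Asset match found in a source path, or None."""
    match = pattern.search(source)
    return match.group("Asset") if match else None


# Hypothetical file names following the <hostname>_bluescreen.csv convention.
good = r"C:\monitored\Bluescreen\WS81234_bluescreen.csv"
odd = r"C:\monitored\Bluescreen\WS8_bluescreen.csv"

print(extract_asset(LOOSE, good))   # 81234
print(extract_asset(STRICT, good))  # 81234
print(extract_asset(LOOSE, odd))    # 8_blu (the underscore falls inside A-z)
print(extract_asset(STRICT, odd))   # None
```

If the tightened form fits the real asset numbering, the equivalent pattern for props.conf or rex would be `(?<Asset>8[0-9A-Za-z]{4})`.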

Lowell
Super Champion

Looks like you just have a couple of issues in your config files. Try something like this:

props.conf

[source::..._bluescreen.csv]
sourcetype = Bluescreen

[Bluescreen]
EXTRACT-Asset=(?<Asset>8[0-9, aA-zZ]{4}) in source
REPORT-fields = Bluescreen-csv-fields

# This part is optional.  You can rename your old sourcetype "csv-xxx" to "Bluescreen", with a small entry like this:
[csv-xxx]
rename = Bluescreen

transforms.conf

[Bluescreen-csv-fields]
DELIMS = ","
FIELDS = "Dump File","Crash Time","Bug Check String","Bug Check Code","Parameter 1","Parameter 2","Parameter 3","Parameter 4","Caused By Driver","Caused By Address","File Description","Product Name","Company","File Version","Processor","Computer Name","Full Path","Processors Count","Major Version","Minor Version"
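To illustrate what the DELIMS/FIELDS pair is meant to do, here is a rough Python sketch of the idea: split each event on the delimiter and assign the values to the field names by position. This is not how Splunk implements it; the quote handling comes from Python's csv module, and the sample row uses placeholder values rather than real crash data.

```python
import csv
import io

# Field names in column order, as listed in the FIELDS setting.
FIELDS = [
    "Dump File", "Crash Time", "Bug Check String", "Bug Check Code",
    "Parameter 1", "Parameter 2", "Parameter 3", "Parameter 4",
    "Caused By Driver", "Caused By Address", "File Description",
    "Product Name", "Company", "File Version", "Processor",
    "Computer Name", "Full Path", "Processors Count",
    "Major Version", "Minor Version",
]


def split_event(raw_line, fields=FIELDS, delimiter=","):
    """Split one CSV event and pair each value with its field name by position."""
    values = next(csv.reader(io.StringIO(raw_line), delimiter=delimiter))
    return dict(zip(fields, values))


# Placeholder row: 20 quoted values v1..v20, one per column.
sample = ",".join(f'"v{i}"' for i in range(1, 21))
event = split_event(sample)

print(event["Dump File"])      # v1
print(event["Minor Version"])  # v20
```

The point of the sketch is that the mapping is purely positional, so the order of names in FIELDS has to match the column order of the CSV files.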


BTW, if you would be willing to share how you generate these CSV files, I would be interested in collecting similar crash data on my Windows systems.

Lowell
Super Champion

Just noticed that you have spaces in your field names. You may want to remove them: I think Splunk will replace them with "_"s at search time, so IMHO it's probably better to remove the spaces yourself.
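If you go that route, the renamed list does not have to be typed by hand. A small Python sketch, purely illustrative, that replaces the spaces with underscores and prints a FIELDS line for transforms.conf (the helper names are made up for this example):

```python
# Field names as they appear in the original FIELDS setting.
FIELDS = [
    "Dump File", "Crash Time", "Bug Check String", "Bug Check Code",
    "Parameter 1", "Parameter 2", "Parameter 3", "Parameter 4",
    "Caused By Driver", "Caused By Address", "File Description",
    "Product Name", "Company", "File Version", "Processor",
    "Computer Name", "Full Path", "Processors Count",
    "Major Version", "Minor Version",
]


def clean_field_name(name):
    """Replace spaces with underscores; other characters are left alone."""
    return name.strip().replace(" ", "_")


def fields_line(names):
    """Build a FIELDS line for transforms.conf from the cleaned names."""
    return "FIELDS = " + ",".join(f'"{clean_field_name(n)}"' for n in names)


print(fields_line(FIELDS))
```

The printed line starts with `FIELDS = "Dump_File","Crash_Time",...` and can be pasted into the [Bluescreen-csv-fields] stanza.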

dominiquevocat
Motivator

I use BlueScreenView.exe, which can generate a CSV report with
"BlueScreenView.exe" /scomma c:\temp\%computername%_bluescreen.csv
I then copy that file to a monitored folder on the Splunk machine. Easy as pie.
Alas, I had no luck with assigning the sourcetype at indexing, but then the files are already indexed... I will try the csv-xxx rename though. Hope the tool helps you. It sure was revealing to us, especially since I look up the username etc. based on the asset from the filename, using a CSV generated by our asset management system.
