Teach Splunk 4.2 to identify positions in a CSV file
Hi,
I need Splunk to index software distribution logs. The logs are built by a shell script from data gathered from a few sources. One log is created per day.
Log name example: GCKD-20110304.csv
Log name convention: GCKD-yyyymmdd.csv
Log content example: 1722383;winxp;MS10-034;xx-x-xxxxxxx;SUCCESSFUL;2011.03.04
Log content convention: DistroID;OS;patch;EndPoint;State;Date;Time
DistroID - 7-digit distribution ID
OS - for which type of Windows is the patch specified (two values: winxp or win7)
patch - name of M$ patch (MSXX-XXX)
EndPoint - receiving machine - 15 characters
State - distribution state: SUCCESSFUL; FAILED; EXPIRED; etc
Date - yyyy.mm.dd format
Time - hh:mm:ss
Can anyone tell me how to configure Splunk to use the distribution date and time as the event timestamp, and to use the EndPoint name as the source host? And how can I configure the input for more advanced reporting (for example, teaching Splunk the name of each CSV field so I can build good-looking reports)?
First, make sure you assign a sourcetype to these logs. In your inputs.conf, add for example sourcetype = distribution_log to the stanza that monitors them.
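As a rough sketch, the inputs.conf stanza could look something like this. The directory /var/log/distribution is an assumed location, not something given in the question; point the monitor at wherever the shell script actually writes the GCKD-yyyymmdd.csv files.

```ini
# inputs.conf
# NOTE: /var/log/distribution is a hypothetical path; replace it with the real log directory.
[monitor:///var/log/distribution/GCKD-*.csv]
sourcetype = distribution_log
disabled = false
```

The wildcard keeps the input limited to files that follow the GCKD-yyyymmdd.csv naming convention, so other files in the same directory would not pick up this sourcetype.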
Next, in props.conf and transforms.conf, set up your field extractions as well as your host configuration.
Let's do the CSV fields first: transforms.conf:
[extract-distribution-fields]
DELIMS = ";"
FIELDS = "DistroID","OS","Patch","EndPoint","State","Date","Time"
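And apply the extraction in props.conf. DELIMS/FIELDS transforms are search-time extractions, so they are referenced with REPORT- rather than TRANSFORMS-:
[distribution_log]
REPORT-extract-header = extract-distribution-fields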
To replace the host value with EndPoint, in transforms.conf:
[extract-distribution-host]
DEST_KEY = MetaData:Host
REGEX = ^\d+;[^;]*;[^;]*;([^;]+);
FORMAT = host::$1
and again apply the extraction in props.conf:
[distribution_log]
TRANSFORMS-extract-host = extract-distribution-host
So all together your config files could look like this: transforms.conf:
[extract-distribution-fields]
DELIMS = ";"
FIELDS = "DistroID","OS","Patch","EndPoint","State","Date","Time"
[extract-distribution-host]
DEST_KEY = MetaData:Host
REGEX = ^\d+;[^;]*;[^;]*;([^;]+);
FORMAT = host::$1
and props.conf:
[distribution_log]
REPORT-extract-header = extract-distribution-fields
TRANSFORMS-extract-host = extract-distribution-host
The timestamp is configured in props.conf under /opt/splunk/etc/apps/<>/local. You may have to create this file, then add a stanza for your sourcetype and set the timestamp extraction attributes in it.
It's all explained here http://www.splunk.com/base/Documentation/latest/admin/propsconf if you search or scroll to "Timestamp extraction configuration".
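For the format described in the question, a props.conf stanza along these lines might work. This is only a sketch: it assumes every line really carries both Date and Time as the sixth and seventh semicolon-separated fields (the posted sample line has only the date), and the stanza name distribution_log is whatever sourcetype you assigned to the input.

```ini
# props.conf
[distribution_log]
# Treat each CSV row as a single event
SHOULD_LINEMERGE = false
# Skip the first five semicolon-delimited fields (DistroID;OS;patch;EndPoint;State;)
TIME_PREFIX = ^(?:[^;]*;){5}
# Date;Time as described in the log convention: yyyy.mm.dd;hh:mm:ss
TIME_FORMAT = %Y.%m.%d;%H:%M:%S
# "yyyy.mm.dd;hh:mm:ss" is 19 characters long
MAX_TIMESTAMP_LOOKAHEAD = 19
```

If the lines turn out to contain only the date, the same idea would apply with TIME_FORMAT = %Y.%m.%d and MAX_TIMESTAMP_LOOKAHEAD = 10.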
To teach Splunk to recognise the fields and pull the information through into a report, look at field extraction: you can write a regex that names each field, using the semicolon as the delimiter and then choosing the field by its position. Fortunately, Splunk will also do this for you if you use the extract fields wizard.
Hope this helps.