Hi,
I need Splunk to index software distribution logs. A shell script builds the logs from data gathered from a few sources, and one log file is created per day.
Log name example: GCKD-20110304.csv
Log name convention: GCKD-yyyymmdd.csv
Log content example: 1722383;winxp;MS10-034;xx-x-xxxxxxx;SUCCESSFUL;2011.03.04
Log content convention: DistroID;OS;patch;EndPoint;State;Date;Time
DistroID - 7-digit distribution ID
OS - the Windows version the patch is intended for (two values: winxp or win7)
patch - name of the Microsoft patch (MSXX-XXX)
EndPoint - receiving machine - 15 characters
State - distribution state: SUCCESSFUL; FAILED; EXPIRED; etc
Date - yyyy.mm.dd format
Time - hh:mm:ss
Can anyone tell me how to configure Splunk to use the distribution date and time as the event timestamp, and to use the EndPoint name as the source host? And how can I configure the input for more advanced reporting (for example, teaching Splunk the name of each CSV field so I can build good-looking reports)?
First you will want to make sure you assign a sourcetype to these logs. In your inputs.conf add, for example, sourcetype=distribution_log
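As a rough sketch, the inputs.conf stanza might look like the following. The monitored path is a placeholder I made up, so point it at wherever the shell script actually writes the files:
[monitor:///path/to/distribution/logs/GCKD-*.csv]
# hypothetical path; replace with the real log directory
sourcetype = distribution_log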
Next, in props.conf and transforms.conf, set up your field extractions as well as your host configuration.
Let's do the CSV fields first. In transforms.conf:
[extract-distribution-fields]
DELIMS = ";"
FIELDS = "DistroID","OS","Patch","EndPoint","State","Date","Time"
And apply the extraction in props.conf. DELIMS/FIELDS extractions run at search time, so reference them with REPORT- rather than TRANSFORMS-:
[distribution_log]
REPORT-extract-header = extract-distribution-fields
To replace the host value with EndPoint, in transforms.conf:
[extract-distribution-host]
DEST_KEY = MetaData:Host
REGEX = ^\d+;[^;]*;[^;]*;([^;]+);
FORMAT = host::$1
and again apply the extraction in props.conf:
[distribution_log]
TRANSFORMS-extract-host = extract-distribution-host
So all together your config files could look like this. transforms.conf:
[extract-distribution-fields]
DELIMS = ";"
FIELDS = "DistroID","OS","Patch","EndPoint","State","Date","Time"
[extract-distribution-host]
DEST_KEY = MetaData:Host
REGEX = ^\d+;[^;]*;[^;]*;([^;]+);
FORMAT = host::$1
and props.conf:
[distribution_log]
REPORT-extract-header = extract-distribution-fields
TRANSFORMS-extract-host = extract-distribution-host
The timestamp is configured in props.conf under /opt/splunk/etc/apps/<>/local. You may have to create this file, then add a stanza for your data and choose the date and time settings.
It's all explained here: http://www.splunk.com/base/Documentation/latest/admin/propsconf (search for or scroll to "Timestamp extraction configuration").
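For the format in the question, a minimal sketch of the timestamp settings could look like this. It assumes the sourcetype is called distribution_log and that every line really ends with Date;Time as in the stated convention (the sample line shows only the date, so check your data). The lookahead value is simply the length of such a timestamp:
[distribution_log]
# treat each line as one event
SHOULD_LINEMERGE = false
# skip the first five semicolon-delimited fields (DistroID;OS;patch;EndPoint;State;)
TIME_PREFIX = ^(?:[^;]*;){5}
# Date;Time in the form yyyy.mm.dd;hh:mm:ss
TIME_FORMAT = %Y.%m.%d;%H:%M:%S
# the timestamp itself is 19 characters long
MAX_TIMESTAMP_LOOKAHEAD = 19
If the Time field turns out to be missing from the real files, TIME_FORMAT would need to be shortened to %Y.%m.%d and the lookahead to 10.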
To teach Splunk to recognise the fields and pull the information through into a report, look at field extraction: you can write a regex that names each field, using the semicolon as the delimiter and picking each field by its position. Fortunately, Splunk will also do this for you if you use the extract fields wizard.
Hope this helps.