Teach Splunk 4.2 to identify positions in a CSV file

MichalZ
Engager

Hi,

I need Splunk to index software distribution logs. The logs are created by a shell script from data gathered from a few sources. One log file is created per day.


Log name example: GCKD-20110304.csv

Log name convention: GCKD-yyyymmdd.csv


Log content example: 1722383;winxp;MS10-034;xx-x-xxxxxxx;SUCCESSFUL;2011.03.04

Log content convention: DistroID;OS;patch;EndPoint;State;Date;Time


DistroID - 7-digit distribution ID

OS - for which type of Windows is the patch specified (two values: winxp or win7)

patch - name of M$ patch (MSXX-XXX)

EndPoint - receiving machine - 15 characters

State - distribution state: SUCCESSFUL; FAILED; EXPIRED; etc

Date - yyyy.mm.dd format

Time - hh:mm:ss


Can anyone tell me how to configure Splunk to use the distribution date and time as the event timestamp, and to use the EndPoint name as the source host? Also, how can I configure the input for more advanced reporting (for example, teaching Splunk the name of each CSV field so I can build good-looking reports)?

ftk
Motivator

First, make sure you assign a sourcetype to these logs. For example, add sourcetype = distribution_log to the relevant stanza in inputs.conf.
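As a rough sketch of that input stanza: the directory below is hypothetical (the original post does not say where the script writes its files), so substitute the real location. Only the GCKD-yyyymmdd.csv naming and the sourcetype name come from this thread.

```ini
# inputs.conf
# NOTE: /var/log/distribution is an assumed path, not taken from the question.
# The wildcard picks up each daily file, e.g. GCKD-20110304.csv.
[monitor:///var/log/distribution/GCKD-*.csv]
sourcetype = distribution_log
```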

Next, in props.conf and transforms.conf, set up your field extractions as well as your host configuration.

Let's do the CSV fields first. In transforms.conf:

[extract-distribution-fields]
DELIMS = ";"
FIELDS = "DistroID","OS","Patch","EndPoint","State","Date","Time"

And apply the extraction in props.conf. Use REPORT- rather than TRANSFORMS- here, because DELIMS/FIELDS extractions only work at search time:

[distribution_log]
REPORT-extract-header = extract-distribution-fields

To replace the host value with EndPoint, in transforms.conf:

[extract-distribution-host]
DEST_KEY = MetaData:Host
REGEX = ^\d+;[^;]*;[^;]*;([^;]+);
FORMAT = host::$1

and again apply it in props.conf (this one is an index-time transform, so TRANSFORMS- is correct):

[distribution_log]
TRANSFORMS-extract-host = extract-distribution-host

So, all together, your config files could look like this. transforms.conf:

[extract-distribution-fields]
DELIMS = ";"
FIELDS = "DistroID","OS","Patch","EndPoint","State","Date","Time"

[extract-distribution-host]
DEST_KEY = MetaData:Host
REGEX = ^\d+;[^;]*;[^;]*;([^;]+);
FORMAT = host::$1

and props.conf:

[distribution_log]
REPORT-extract-header = extract-distribution-fields
TRANSFORMS-extract-host = extract-distribution-host

b4ggio
Explorer

Timestamp extraction is configured in props.conf under /opt/splunk/etc/apps/<>/local. You may have to create this file, then add a stanza for your sourcetype and set the timestamp options in it.

It's all explained here: http://www.splunk.com/base/Documentation/latest/admin/propsconf (search for or scroll to "Timestamp extraction configuration").
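For the log layout described in the question, a minimal sketch of such a stanza might look like the following. It assumes every line really ends with Date;Time as in the stated convention (the sample line in the question shows only the date, so check your files), and it assumes the sourcetype is named distribution_log as suggested earlier in this thread.

```ini
# props.conf
[distribution_log]
# Each line of the file is one event.
SHOULD_LINEMERGE = false
# Skip the first five semicolon-terminated fields
# (DistroID;OS;patch;EndPoint;State;) so parsing starts at Date.
TIME_PREFIX = ^(?:[^;]*;){5}
# Date is yyyy.mm.dd, then a semicolon, then hh:mm:ss.
TIME_FORMAT = %Y.%m.%d;%H:%M:%S
# "2011.03.04;12:34:56" is 19 characters long.
MAX_TIMESTAMP_LOOKAHEAD = 19
```

Timestamp settings are applied at index time, so they only affect data indexed after the change.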

To teach Splunk to recognise the fields and pull them through into a report, look at field extraction: you can write a regex that names each field, treating the semicolon as the delimiter and picking each field by its position. Fortunately, Splunk will also do this for you if you use the extract field wizard.
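If you go the hand-written regex route, one possible sketch is an inline search-time extraction in props.conf. The class name "distribution" is arbitrary, the field names follow the convention in the question, and the Time group is made optional on the assumption that some lines may carry only a date, like the sample line.

```ini
# props.conf
[distribution_log]
# One named capture group per semicolon-separated position.
# Time is optional in case a line ends after the Date field.
EXTRACT-distribution = ^(?<DistroID>\d+);(?<OS>[^;]*);(?<Patch>[^;]*);(?<EndPoint>[^;]*);(?<State>[^;]*);(?<Date>[^;]*)(?:;(?<Time>[^;]*))?$
```

This does the same job as a DELIMS/FIELDS transform, so you only need one of the two.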

Hope this helps.
