Index Time field extraction

Engager

I have a custom log file with entries like the one below. I want to pull 8 fields out at index time so I can graph and chart them.

wdSiteData.busy: false wdSiteData.needUpdate: false wdSiteData.requestType: -1 wdSiteData.state: UT wdSiteData.country: USA wdSiteData.district: SOME DISTRICT wdSiteData.availableUpdates: [SPUpdate421107from96.jar, SPUpdate421108.jar, SPUpdate42195.jar, SPUpdate42196.jar, SPUpdate43077from421108.jar, SPUpdate43078.jar, SPUpdate43084from78.jar, SPUpdate44064from43084.jar] wdSiteData.peerList: null wdSiteData.checksumJar: null wdSiteData.checksumInstall: null wdSiteData.partialDownloadBytes: 0 wdSiteData.filesize: 0 wdSiteData.siteVersion: 7.8.9.10 wdSiteData.versionFrom: null wdSiteData.versionTo: null wdSiteData.timestamp: null wdSiteData.downloadUrl: null wdSiteData.school: -1 wdSiteData.filename: null wdSiteData.updateAvailable: false wdSiteData.clientAddress: 10.10.10.10 wdSiteData.guid: {4445454b1e-805a-11de-8896-fdfdfdfd743c1a} wdSiteData.maximumPeerConnections: 0

I have added the following to my transforms.conf, /opt/splunk/etc/system/default/transforms.conf (REGEX and FORMAT are each a single line).
I have tested the regex, and it does find the fields I want correctly.

[WSM-CONNTECTIONS-SiteData]
REGEX = wdSiteData.(state|country|district|siteVersion|timestamp|school|clientAddress|maximumPeerConnections):
FORMAT = WSM-timestamp::"$5" district::"$3" school::"$6" state::"$1" country::"$2" version::"$4" ipaddress::"$7" peerconnections::"$8"
WRITE_META = [true]

I have added the following to my props.conf, /opt/splunk/etc/system/default/props.conf:
[host::$IPOFHOST]
TRANSFORMS-WSM = WSM-CONNTECTIONS-SiteData

I have added the following to my fields.conf, /opt/splunk/etc/system/default/fields.conf:

[WSM-timestamp]
INDEXED = True

[district]
INDEXED = True

[school]
INDEXED = True

[state]
INDEXED = True

[country]
INDEXED = True

[version]
INDEXED = True

[ipaddress]
INDEXED = True

[peerconnections]
INDEXED = True

Re: Index Time field extraction

Legend

Why use index-time field extraction? Is there a specific reason for doing so? Index-time field extraction should only be done if there's a really good reason for it, and only if you really know what you're doing. It has a negative impact on performance and often causes increased complexity.

Re: Index Time field extraction

Legend

By the way, you don't actually say what problem you're having.

Re: Index Time field extraction

Engager

The problem is that I cannot see the fields in the Manager. I am just learning and reading, and I just want the fields to always be available for stats and charts. Other than that, please show me a better way!

Dave

Re: Index Time field extraction

Legend

Generally, always use search-time field extractions. The docs have plenty of information on this that should get you going. Here's a good place to start: http://docs.splunk.com/Documentation/Splunk/latest/Knowledge/Addfieldsatsearchtime

Re: Index Time field extraction

Influencer

As Ayn says, there's no need to make these fields part of your index; using search-time extractions is the right way to go 99% of the time.

Also, putting customisations in default/transforms.conf, default/props.conf and default/fields.conf is a bad idea: these files will get overwritten when you patch or upgrade.

You should create props.conf and transforms.conf in etc/system/local and put any customisations you've made in there.
You should also remove the customisations you made to default/fields.conf; you don't need them for search-time extraction.

This is what you need for search-time extraction of all the fields in your Site Data events.

props.conf:

[host::$IP_OF_HOST]
REPORT-WSM = WSM_CONNTECTIONS_SiteData

transforms.conf:

[WSM_CONNTECTIONS_SiteData]
REGEX = wdSiteData\.([^:]+):\s+(.*?)(?=(?:\s+wdSiteData|$))
FORMAT = $1::$2
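
To see what this transform captures, here is a minimal sketch that runs the same pattern over a shortened copy of the sample event using Python's `re` module. Splunk uses PCRE rather than Python's engine, so treat this only as an approximation; the variable names are illustrative and not part of any Splunk API.

```python
import re

# Same pattern as the REGEX line in the transforms.conf stanza above.
PATTERN = re.compile(r"wdSiteData\.([^:]+):\s+(.*?)(?=(?:\s+wdSiteData|$))")

# Shortened copy of the sample event from the question (assumption: the
# remaining key/value pairs behave the same way).
event = (
    "wdSiteData.busy: false wdSiteData.state: UT wdSiteData.country: USA "
    "wdSiteData.district: SOME DISTRICT wdSiteData.siteVersion: 7.8.9.10 "
    "wdSiteData.clientAddress: 10.10.10.10 wdSiteData.maximumPeerConnections: 0"
)

# Each match yields a (name, value) pair, mirroring FORMAT = $1::$2.
fields = dict(PATTERN.findall(event))

for name, value in fields.items():
    print(f"{name} = {value}")
```

The lazy `(.*?)` plus the lookahead is what lets a value such as `SOME DISTRICT` contain spaces: the value ends only where the next ` wdSiteData` key (or the end of the event) begins.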

In the search bar, run:

| extract reload=t

then search for:

wdSiteData

You should see a bunch of interesting fields in the sidebar.
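
As a side note on the index-time attempt in the question: its REGEX contains a single capturing group, while its FORMAT refers to `$1` through `$8`. The sketch below checks this with Python's `re` module (an approximation of Splunk's PCRE behaviour, with illustrative variable names); it does not cover the other settings in that stanza.

```python
import re

# REGEX copied verbatim from the question's [WSM-CONNTECTIONS-SiteData] stanza.
original = re.compile(
    r"wdSiteData.(state|country|district|siteVersion|timestamp|school"
    r"|clientAddress|maximumPeerConnections):"
)

# Number of capturing groups available to FORMAT.
group_count = original.groups

# Fragment of the sample event from the question.
event = "wdSiteData.state: UT wdSiteData.country: USA"
captured = original.search(event).groups()

print(group_count, captured)
```

The one group that does exist captures the key name (`state`), not its value (`UT`), which is why the pattern in the search-time transform adds a second group for the value.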

Re: Index Time field extraction

Champion

And if you're running 4.3+, you don't even need to run extract reload=t; search-time extractions should be reloaded each time splunkd forks off a new process for a search.

Re: Index Time field extraction

Legend

I have a similar problem: I have to extract fields at index time to speed up my searches (I have millions of events with 72 fields in each one), and someone from Splunk suggested that I extract fields at index time to get quicker searches.
When you say "...a negative impact on performance...", do you mean indexing performance or search performance?
Thank you.
Giuseppe
