Problem with using SOURCE_KEY

dmaislin_splunk
Splunk Employee

I have some XML data that I parse into many fields, one of which is "relativePath". Why can't I get the transform to extract a new field, "fileName", when relativePath is the SOURCE_KEY? The rex command works fine in the search bar:

  | rex field=relativePath "^.*[\\\/](?<fileName>.*)"

Sample Event:

<CheckEventRequest>
  <EventList count="1">
    <Event event="0x20000" path="\\cepapoc.emcsplunk.com\CHECK$\server2fs1\davidpoc2" flag="0x2" protocol="0" server="CEPAPOC" share="server2fs1" clientIP="10.0.0.2" serverIP="10.0.0.4" timeStamp="0x4EF4883C00014D1D" userSid="S-1-5-21-175151209-4036982877-1867759480-500" ownerSid="S-1-5-32-544" fileSize="0x0" newName="\\cepapoc.emcsplunk.com\CHECK$\server2fs1\SplunkEMC" desiredAccess="0x0" createDispo="0x0" ntStatus="0x0" relativePath="\\CEPAPOC\server2fs1\davidpoc2"/>
  </EventList>
</CheckEventRequest>

props.conf

[cepa]
LINE_BREAKER = ([\r\n]+20\d{2}/[01]\d/[0123]\d\s[012]\d:[0-5]\d:[0-5]\d[\r\n]+)
SHOULD_LINEMERGE = FALSE
TIME_PREFIX = timeStamp="
MAX_TIMESTAMP_LOOKAHEAD = 18
DATETIME_CONFIG = /etc/apps/vnx/default/emc-epoch.xml
REPORT-xmlkv = xmlkv-alternative
REPORT-getFileName = getFileName

transforms.conf

[xmlkv-alternative]
REGEX = <([^\s\>]*)[^\>]*\>([^<]*)\<\/\1\>
FORMAT = $1::$2
MV_ADD = True

[getFileName]
SOURCE_KEY = relativePath
REGEX = ^.*[\\\/](.*)
FORMAT = fileName::"$1"
1 Solution

dmaislin_splunk
Splunk Employee

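The answer is:

The fields were being extracted by autokv (Splunk's automatic key-value extraction), not by my xmlkv-alternative transform. SOURCE_KEY does not work well with the default Splunk autokv, so I replaced it with an explicit kv-alternative transform, listed it before getFileName in the same REPORT, and set KV_MODE = none.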

props.conf

[cepa]
LINE_BREAKER = ([\r\n]+20\d{2}/[01]\d/[0123]\d\s[012]\d:[0-5]\d:[0-5]\d[\r\n]+)
SHOULD_LINEMERGE = FALSE
TIME_PREFIX = timeStamp="
MAX_TIMESTAMP_LOOKAHEAD = 18
DATETIME_CONFIG = /etc/apps/vnx/default/emc-epoch.xml
REPORT-parsefields = kv-alternative,getFileName
TRANSFORMS-removehb = removehb
LOOKUP-event = eventlookup event OUTPUTNEW event_description
LOOKUP-dispo = dispolookup createDispo OUTPUTNEW createDispo_Description
KV_MODE = none

transforms.conf

[kv-alternative]
REGEX = (\w+)="([^"]+)"
FORMAT = $1::$2
MV_ADD = True

[getFileName]
SOURCE_KEY = relativePath
REGEX = (?<fileName>[^\\]+)$
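
To sanity-check these transforms outside Splunk, here is a minimal Python sketch. It is only an approximation: Python's `re` module stands in for Splunk's regex engine, the named group is written `(?P<fileName>...)` because Python does not accept the `(?<fileName>...)` spelling, and the event is a shortened copy of the sample from the question. The variable names are made up for the illustration.

```python
import re

# Shortened copy of the sample event (only a few attributes kept).
event = r'''<CheckEventRequest>
  <EventList count="1">
    <Event event="0x20000" server="CEPAPOC" share="server2fs1" relativePath="\\CEPAPOC\server2fs1\davidpoc2"/>
  </EventList>
</CheckEventRequest>'''

# [xmlkv-alternative]: matches <tag ...>text</tag>, so it never sees XML attributes.
XMLKV = re.compile(r'<([^\s\>]*)[^\>]*\>([^<]*)\<\/\1\>')
# [kv-alternative]: matches name="value" pairs, i.e. the attributes themselves.
KV = re.compile(r'(\w+)="([^"]+)"')
# [getFileName]: everything after the last backslash.
GET_FILE_NAME = re.compile(r'(?P<fileName>[^\\]+)$')

xml_fields = XMLKV.findall(event)
fields = dict(KV.findall(event))

# SOURCE_KEY = relativePath: the second regex runs on the field value, not on the raw event.
match = GET_FILE_NAME.search(fields["relativePath"])
file_name = match.group("fileName") if match else None

print(xml_fields)
print(fields["relativePath"])
print(file_name)
```

In this sketch the xmlkv-alternative pattern finds nothing in the sample, because all the values are attributes of a self-closing element. That is consistent with relativePath having come from autokv rather than from the transform. The kv-alternative pattern does produce relativePath, and getFileName then yields davidpoc2 from it.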

Ledion_Bitincka
Splunk Employee

This is most likely related to a bad regex. Assuming relativePath="\\CEPAPOC\server2fs1\davidpoc2" and that you want to extract fileName=davidpoc2, the following should do the trick (note the updated regex in getFileName):

props.conf

[cepa]
LINE_BREAKER = ([\r\n]+20\d{2}/[01]\d/[0123]\d\s[012]\d:[0-5]\d:[0-5]\d[\r\n]+)
SHOULD_LINEMERGE = FALSE
TIME_PREFIX = timeStamp="
MAX_TIMESTAMP_LOOKAHEAD = 18
DATETIME_CONFIG = /etc/apps/vnx/default/emc-epoch.xml
REPORT-my_name = xmlkv-alternative, getFileName

transforms.conf

[xmlkv-alternative]
REGEX = <([^\s\>]*)[^\>]*\>([^<]*)\<\/\1\>
FORMAT = $1::$2
MV_ADD = True

[getFileName]
SOURCE_KEY = relativePath
# need to extract filenames from unix and windows paths, so use both forward/backward slashes
REGEX = (?<fileName>[^\\/]+)$
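
As a quick illustration of the comment above, this Python sketch tries the same character class on a Windows path and a Unix path. It assumes Python's `re` as a stand-in for Splunk's regex engine (hence the `(?P<fileName>...)` spelling); the `file_name` helper and the Unix paths are hypothetical, chosen only for the example.

```python
import re

# Same pattern as [getFileName], with Python's spelling of the named group.
PATTERN = re.compile(r'(?P<fileName>[^\\/]+)$')

def file_name(path):
    """Return the last path component, or None if the path is empty or ends in a separator."""
    match = PATTERN.search(path)
    return match.group("fileName") if match else None

print(file_name(r'\\CEPAPOC\server2fs1\davidpoc2'))
print(file_name('/var/log/messages'))
print(file_name('/var/log/'))
```

The first two calls return the last component (davidpoc2 and messages). The third returns None, since nothing follows the final slash.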

_d_
Splunk Employee

Perhaps you are hitting the problem described here: http://blogs.splunk.com/2011/10/07/cannot-search-based-on-an-extracted-field/ 🙂

_d_
Splunk Employee

See if this works:

[cepa]
LINE_BREAKER = ([\r\n]+20\d{2}/[01]\d/[0123]\d\s[012]\d:[0-5]\d:[0-5]\d[\r\n]+)
SHOULD_LINEMERGE = FALSE
TIME_PREFIX = timeStamp="
MAX_TIMESTAMP_LOOKAHEAD = 18
DATETIME_CONFIG = /etc/apps/vnx/default/emc-epoch.xml
REPORT-my_name = xmlkv-alternative, getFileName

This REPORT sequence ensures that the [xmlkv-alternative] transform stanza is applied first, followed by [getFileName].
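
The ordering argument can be sketched with a toy model in Python. This is not how Splunk implements search-time extraction; it only illustrates why a transform with a SOURCE_KEY needs that field to exist before it runs. The `TRANSFORMS` table, the `apply_report` function, and the one-line event are all hypothetical, and the "kv" entry uses a simple name="value" regex rather than the xmlkv-alternative pattern.

```python
import re

# Toy stand-ins for transforms: name -> (SOURCE_KEY, regex). None means the raw event.
TRANSFORMS = {
    "kv": (None, re.compile(r'(\w+)="([^"]+)"')),
    "getFileName": ("relativePath", re.compile(r'(?P<fileName>[^\\/]+)$')),
}

def apply_report(raw, names):
    """Apply the named transforms in the given order and return the extracted fields."""
    fields = {}
    for name in names:
        source_key, regex = TRANSFORMS[name]
        text = raw if source_key is None else fields.get(source_key)
        if text is None:
            continue  # the source field does not exist yet, so there is nothing to extract
        if regex.groupindex:
            # Named groups become field names, as with (?<fileName>...) in transforms.conf.
            match = regex.search(text)
            if match:
                fields.update(match.groupdict())
        else:
            # Two unnamed groups are treated as key/value pairs (FORMAT = $1::$2).
            fields.update(dict(regex.findall(text)))
    return fields

raw = r'<Event share="server2fs1" relativePath="\\CEPAPOC\server2fs1\davidpoc2"/>'
in_order = apply_report(raw, ["kv", "getFileName"])
reversed_order = apply_report(raw, ["getFileName", "kv"])

print(in_order.get("fileName"))
print(reversed_order.get("fileName"))
```

In this model, fileName is extracted only when the key-value transform runs first; with the order reversed, relativePath does not exist yet when getFileName runs.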

Hope this helps.

> please upvote and accept answer if you find it useful - thanks!
