Splunk Search

Problem with using SOURCE_KEY

dmaislin_splunk
Splunk Employee
Splunk Employee

I have some XML data that I parse into many fields, one of which is "relativePath" why can't I get the transforms to extract a new field "fileName" from the SOURCE_KEY? The rex command works fine in the search bar:

  | rex field=relativePath "^.*[\\\/](?<fileName>.*)"

Sample Event:

<CheckEventRequest>
  <EventList count="1">
    <Event event="0x20000" path="\\cepapoc.emcsplunk.com\CHECK$\server2fs1\davidpoc2" flag="0x2" protocol="0" server="CEPAPOC" share="server2fs1" clientIP="10.0.0.2" serverIP="10.0.0.4" timeStamp="0x4EF4883C00014D1D" userSid="S-1-5-21-175151209-4036982877-1867759480-500" ownerSid="S-1-5-32-544" fileSize="0x0" newName="\\cepapoc.emcsplunk.com\CHECK$\server2fs1\SplunkEMC" desiredAccess="0x0" createDispo="0x0" ntStatus="0x0" relativePath="\\CEPAPOC\server2fs1\davidpoc2"/>
  </EventList>
</CheckEventRequest>

props.conf

[cepa]
LINE_BREAKER = ([\r\n]+20\d{2}/[01]\d/[0123]\d\s[012]\d:[0-5]\d:[0-5]\d[\r\n]+)
SHOULD_LINEMERGE = FALSE
TIME_PREFIX = timeStamp="
MAX_TIMESTAMP_LOOKAHEAD = 18
DATETIME_CONFIG = /etc/apps/vnx/default/emc-epoch.xml
REPORT-xmlkv = xmlkv-alternative
REPORT-getFileName = getFileName

transforms.conf

[xmlkv-alternative]
REGEX = <([^\s\>]*)[^\>]*\>([^<]*)\<\/\1\>
FORMAT = $1::$2
MV_ADD = True

[getFileName]
SOURCE_KEY = relativePath
REGEX = ^.*[\\\/](.*)
FORMAT = fileName::"$1"
Tags (1)
0 Karma
1 Solution

dmaislin_splunk
Splunk Employee
Splunk Employee

The answer is.....

The data was using autokv to extract all the delimited fields, not my xmlkv-alternative. SOURCE_KEY does not work well with the default splunk autokv. I replaced it with kv-alternative.

props.conf

[cepa]
LINE_BREAKER = ([\r\n]+20\d{2}/[01]\d/[0123]\d\s[012]\d:[0-5]\d:[0-5]\d[\r\n]+)
SHOULD_LINEMERGE = FALSE
TIME_PREFIX = timeStamp="
MAX_TIMESTAMP_LOOKAHEAD = 18
DATETIME_CONFIG = /etc/apps/vnx/default/emc-epoch.xml
REPORT-parsefields = kv-alternative,getFileName
TRANSFORMS-removehb = removehb
LOOKUP-event = eventlookup event OUTPUTNEW event_description
LOOKUP-dispo = dispolookup createDispo OUTPUTNEW createDispo_Description
KV_MODE = none

transforms.conf

[kv-alternative]
REGEX = (\w+)="([^"]+)"
FORMAT = $1::$2
MV_ADD = True

[getFileName]
SOURCE_KEY = relativePath
REGEX = (?<fileName>[^\\]+)$

View solution in original post

0 Karma

dmaislin_splunk
Splunk Employee
Splunk Employee

The answer is.....

The data was using autokv to extract all the delimited fields, not my xmlkv-alternative. SOURCE_KEY does not work well with the default splunk autokv. I replaced it with kv-alternative.

props.conf

[cepa]
LINE_BREAKER = ([\r\n]+20\d{2}/[01]\d/[0123]\d\s[012]\d:[0-5]\d:[0-5]\d[\r\n]+)
SHOULD_LINEMERGE = FALSE
TIME_PREFIX = timeStamp="
MAX_TIMESTAMP_LOOKAHEAD = 18
DATETIME_CONFIG = /etc/apps/vnx/default/emc-epoch.xml
REPORT-parsefields = kv-alternative,getFileName
TRANSFORMS-removehb = removehb
LOOKUP-event = eventlookup event OUTPUTNEW event_description
LOOKUP-dispo = dispolookup createDispo OUTPUTNEW createDispo_Description
KV_MODE = none

transforms.conf

[kv-alternative]
REGEX = (\w+)="([^"]+)"
FORMAT = $1::$2
MV_ADD = True

[getFileName]
SOURCE_KEY = relativePath
REGEX = (?<fileName>[^\\]+)$
0 Karma

Ledion_Bitincka
Splunk Employee
Splunk Employee

This is most likely related to a bad regex. Assuming relativePath="\CEPAPOC\server2fs1\davidpoc2" and that you want extract fileName=davidpoc2 then the following should do the trick (note the updated regex in getFileName)

props.conf
[cepa] 
LINE_BREAKER = ([\r\n]+20\d{2}/[01]\d/[0123]\d\s[012]\d:[0-5]\d:[0-5]\d[\r\n]+)
SHOULD_LINEMERGE = FALSE TIME_PREFIX = timeStamp="
MAX_TIMESTAMP_LOOKAHEAD = 18 
DATETIME_CONFIG = /etc/apps/vnx/default/emc-epoch.xml 
REPORT-my_name = xmlkv-alternative, getFileName

transforms.conf
[xmlkv-alternative]
REGEX = <([^\s\>]*)[^\>]*\>([^<]*)\<\/\1\>
FORMAT = $1::$2
MV_ADD = True

[getFileName]
SOURCE_KEY = relativePath
# need to extract filenames from unix and windows paths, so use both forward/backward slashes
REGEX = (?<fileName>[^\\/]+)$

_d_
Splunk Employee
Splunk Employee

Perhaps could be hitting the problem described here: http://blogs.splunk.com/2011/10/07/cannot-search-based-on-an-extracted-field/ 🙂

0 Karma

_d_
Splunk Employee
Splunk Employee

See if this works:

[cepa]
LINE_BREAKER = ([\r\n]+20\d{2}/[01]\d/[0123]\d\s[012]\d:[0-5]\d:[0-5]\d[\r\n]+)
SHOULD_LINEMERGE = FALSE
TIME_PREFIX = timeStamp="
MAX_TIMESTAMP_LOOKAHEAD = 18
DATETIME_CONFIG = /etc/apps/vnx/default/emc-epoch.xml
REPORT-my_name = xmlkv-alternative, getFileName

This particular REPORT sequence insures that the [xmlkv-alternative] transform stanza gets applied first, then [getFileName].

Hope this helps.

> please upvote and accept answer if you find it useful - thanks!

0 Karma
Got questions? Get answers!

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Meet up IRL or virtually!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Get Updates on the Splunk Community!

[Puzzles] Solve, Learn, Repeat: Character substitutions with Regular Expressions

This challenge was first posted on Slack #puzzles channelFor BORE at .conf23, we had a puzzle question which ...

Splunk Community Badges!

  Hey everyone! Ready to earn some serious bragging rights in the community? Along with our existing badges ...

[Puzzles] Solve, Learn, Repeat: Matching cron expressions

This puzzle (first published here) is based on matching timestamps to cron expressions.All the timestamps ...