Field Extraction Problem

asarran
Path Finder

Hey, Splunkers

I'm having trouble with a field extraction. The field name is followed by a value in parentheses, and that value is a free-form string. Example:

Signature ( Apple)
Signature (Orange)

The value in the parentheses is the "*" (the wildcard) in this example, and Signature is my field. I'm having trouble extracting these fields because the value in the parentheses can be an entire sentence. Example:

signature (apple)
signature(i went to the store to purchase, orange, kiwi, and coconuts)

I'm guessing Splunk's algorithm cannot process complex strings as "*"?

skoelpin
SplunkTrust

If Signature is always there with the value inside the parentheses, then you can use a simple lookbehind to anchor into the text, plus a lookahead to stop at the closing parenthesis, and grab everything in between:

(?P<Signature>(?<=Signature\s\().*(?=\)))
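As a rough check outside Splunk, the lookaround pattern (with an unescaped `.*` for the value) can be tried against the sample strings from the question. This is only a sketch in Python: its `re` module is not the same engine Splunk uses, and the helper name `extract_signature` is made up for illustration. It also shows a limitation worth knowing about: the lookbehind expects a capital "Signature" followed by exactly one whitespace character, so the `signature(apple)` form does not match.

```python
import re

# Lookaround form of the pattern from this thread: the lookbehind anchors on
# "Signature (" and the lookahead stops before ")", so neither is captured.
PATTERN = re.compile(r"(?P<Signature>(?<=Signature\s\().*(?=\)))")


def extract_signature(event):
    """Return the text inside the parentheses, or None if nothing matches."""
    match = PATTERN.search(event)
    return match.group("Signature") if match else None


samples = [
    "Signature (Orange)",
    "Signature (i went to the store to purchase, orange, kiwi, and coconuts)",
    "signature(apple)",  # lowercase and no space: this pattern does not match
]

results = [extract_signature(s) for s in samples]
for sample, value in zip(samples, results):
    print(repr(sample), "->", repr(value))
```

If your events mix both spellings, a pattern without the fixed-width lookbehind is probably easier to adapt.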

asarran
Path Finder

I'm fairly confident you explained it correctly; I'm just not quite there yet. I apologize for any inconvenience and greatly appreciate the assistance.

I think this would be the most efficient approach, given how often this report runs, but I'm still at a loss about how to start. Normally, when I extract a field, I click "Extract New Fields" and follow the automated process of highlighting the data and naming the field. When I go through that process, though, I don't see an option to type in that regular expression within Splunk.

Thank You,

skoelpin
SplunkTrust

No problem at all, and no need to apologize.

When extracting a permanent field, you can either use the built-in field extractor, which is kind of crappy, or write your own regular expression. It sounds like you've tried the built-in field extractor. The reason I say it is crappy is that it builds a sloppy regular expression which does not work across the board. The point of a regular expression is to match a pattern even though the value varies.

If you had the following text and wanted to capture the value between the StatusCode tags, you would need to write a regular expression that captures whatever sits between the tags. Notice how the values vary (200, Yes, This is a Status Code):

<StatusCode>200</StatusCode>
<StatusCode>Yes</StatusCode>
<StatusCode> This is a Status Code</StatusCode>

If you used the Splunk built-in field extractor, it may capture only the first value and miss all the others. So in my opinion, it's better to write your own regular expression so you can capture 100% of the values. The way to pick up regex is by going to www.regex101.com and practicing. It took me about a month to get to a very skilled level.
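To make the StatusCode point concrete, here is a small sketch of one hand-written pattern picking up all three kinds of value. It is written in Python rather than in Splunk, the sample events are typed in by hand, and the names `STATUS` and `status_values` are just for illustration. The lazy `.*?` takes whatever sits between the tags, including leading whitespace.

```python
import re

# Three hand-typed events whose values differ in shape: a number, a word,
# and a short sentence with a leading space.
events = """<StatusCode>200</StatusCode>
<StatusCode>Yes</StatusCode>
<StatusCode> This is a Status Code</StatusCode>"""

# Anchor on the literal tags and capture lazily between them.
STATUS = re.compile(r"<StatusCode>(?P<StatusCode>.*?)</StatusCode>")

status_values = [m.group("StatusCode") for m in STATUS.finditer(events)]
print(status_values)
```

The same idea carries over to the Signature case: anchor on the fixed text around the value and let the capture group absorb whatever varies.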

So back to your question: after clicking Extract New Fields, you will be asked which sourcetype you want to use if you have multiple sourcetypes; if you have only one sourcetype, this step is skipped. If you need the field across multiple sourcetypes, you will need to extract it once for each sourcetype. After this step, there will be an option that says "I'd prefer to write this regular expression myself". Click it, enter the regular expression below, then hit Preview. This lets you see which values were extracted. I like to click Non-Matches to see what didn't match (usually this part is blank since everything matched), then click Matches and scroll through a dozen events to make sure the right value was extracted. Then hit Save and go take a look at your new field.

(?P<Signature>(?<=Signature\s\().*(?=\)))

asarran
Path Finder

Thank You, Skoelpin

Regular expressions initially seemed extremely daunting, but the website you provided is an excellent way to understand them and build my skills. I greatly appreciate your responses. Thank you so much for your clear, cohesive explanations as well.

Thank You, asarran

skoelpin
SplunkTrust

Great, I'm glad I could help! Please accept the answer if this has been helpful.

asarran
Path Finder

I'm wondering how I would implement the lookaround option within the field extraction in Splunk. That is, where would I enter (?P<Signature>(?<=Signature\s\().*(?=\)))?

skoelpin
SplunkTrust

There are 2 places you can do this, depending on what you want to accomplish. You can make the field extraction permanent (you can still delete it later) by clicking Fields, then Extract a Field. You will then choose your sourcetype, since the field is relative to the sourcetype, click "I'd prefer to write my own regular expression", enter your regular expression, and click Save. This field will always be available until you explicitly delete it.

The other way is to extract the field at search time with the rex command. This is a temporary field, since it goes away once the search is done. You just need to add | rex "YOUR_REGEX" to your search, with the regular expression in double quotes.

So go to your GUI and enter index=YOURINDEX ... | rex "(?<Signature>(?<=Signature\s\().*(?=\)))"

This will extract the field and call it Signature, which will be available on the left side of the screen under Fields, assuming you're searching in Smart or Verbose mode.

http://docs.splunk.com/Documentation/Splunk/6.4.2/SearchReference/Rex

sundareshr
Legend

Try this regex

... | rex "signature\s?\((?<sign>[^\)]+)" | ...

asarran
Path Finder

I'm sorry,

I'm fairly new to Splunk and I'm progressing fairly well, but I'm not able to make sense of that query. Would it be possible to link the documentation as well? How would I implement it? Do I simply type that into the search?

sundareshr
Legend

If your data is already in Splunk, you would type this in your search:

    index=nameoftheindexforyourdata sourcetype=nameofsourcetypeforyourdata | rex "signature\s?\((?<sign>[^\)]+)" | table _time sign
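If it helps to see what that regex does piece by piece, here is a sketch of the same pattern run in Python against the sample strings from the question. Treat it as an approximation: Python needs the named group spelled `(?P<sign>...)` where rex accepts `(?<sign>...)`, and the helper `extract_sign` and the `SIGN_ANYCASE` variant are my own additions for illustration. The pattern reads as: the literal word signature, an optional whitespace character, an opening parenthesis, then one or more characters that are not a closing parenthesis, captured as sign.

```python
import re

# Same pattern as in the rex command, with Python's named-group spelling.
SIGN = re.compile(r"signature\s?\((?P<sign>[^\)]+)")

# Hypothetical variant: (?i) makes the match case-insensitive so that
# "Signature" and "signature" are both accepted.
SIGN_ANYCASE = re.compile(r"(?i)signature\s?\((?P<sign>[^\)]+)")


def extract_sign(event, pattern=SIGN):
    """Return the captured value, or None when the pattern does not match."""
    match = pattern.search(event)
    return match.group("sign") if match else None


samples = [
    "signature (apple)",
    "signature(i went to the store to purchase, orange, kiwi, and coconuts)",
    "Signature (Orange)",
]

strict = [extract_sign(s) for s in samples]
relaxed = [extract_sign(s, SIGN_ANYCASE) for s in samples]
print(strict)
print(relaxed)
```

Because the capture is "anything except a closing parenthesis", commas and spaces inside a long sentence are not a problem. Since the pattern is case-sensitive as written, the capitalized "Signature" sample only matches with the case-insensitive variant.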

asarran
Path Finder

Thank You so much

sundareshr
Legend

@asarran If this worked, please accept/upvote the answer to close it out.
