I am trying to create new fields to search across multiple sources. I have two problems:
Hello @ivonnepena, welcome to Splunk Answers.
I answered this exact same question yesterday, so I'll paste my response below and provide the link too.
As for your second question, are you referring to fixing the length of your values so they look neat in the column?
When extracting a permanent field, you can either use the built-in field extractor, which is kind of crappy, or write your own regular expression. It sounds like you've tried the built-in field extractor. The reason I say it is crappy is that it builds a sloppy regular expression which does not work across the board. The point of a regular expression is to match a pattern even though the value varies.
If you had the following text and wanted to capture the value between the StatusCode tags, you would need to write a regular expression that captures whatever sits between the tags. Also notice how the values vary (200, Yes, This is a Status Code):
<StatusCode>200</StatusCode> <StatusCode>Yes</StatusCode> <StatusCode> This is a Status Code</StatusCode>
If you used the Splunk built-in field extractor, it may capture only the first value and miss all the others. So in my opinion, it's better to write your own regular expression so you can capture 100% of the values. A good way to pick up regex is to practice at www.regex101.com. It took me about a month to get to a very skilled level.
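To make the StatusCode example concrete, here is a minimal sketch in Python (using Python's `re` module, not Splunk itself). The names `STATUS_RE` and `extract_status_codes` are made up for this illustration, and the sample assumes the three values are 200, Yes, and This is a Status Code, as listed above.

```python
import re

# Lazily capture whatever sits between the tags; the \s* on each side
# trims padding such as the leading space in the third sample value.
STATUS_RE = re.compile(r"<StatusCode>\s*(.*?)\s*</StatusCode>")


def extract_status_codes(text):
    """Return every value found between <StatusCode> tags, in order."""
    return STATUS_RE.findall(text)


sample = (
    "<StatusCode>200</StatusCode> "
    "<StatusCode>Yes</StatusCode> "
    "<StatusCode> This is a Status Code</StatusCode>"
)

print(extract_status_codes(sample))
```

The pattern matches on the tags rather than on any particular value, which is why it picks up all three even though they look nothing alike. In Splunk the capture group would need a name, written as `(?<StatusCode>...)`, so that the captured text becomes a field.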
So back to your question: after clicking "Extract New Fields", you will be asked which sourcetype you want to use if you have multiple sourcetypes; if you have only one sourcetype, this step is skipped. If you need to use a field across multiple sourcetypes, you will need to extract the field once for each sourcetype. After this step, there will be a link that says "I'd prefer to write this regular expression myself". Click this, enter the regular expression below, then hit preview. This lets you see what values were extracted. I like to click "non-matches" to see what didn't match (usually this part is blank since everything matched). I then click "matches" and scroll through a dozen events to make sure the right value was extracted. Then you hit save and go take a look at your new field.
Using the create new field option and regex I get:
2016-07-12 21:47:49 Kernel.Warning 22.214.171.124 Jul 13 04:42:08 EIS-BR kernel: [55214.077676] id=TAC pri=6 func=wrlog_logger line=181 ctx=bump0 msg="Unknown Identity: no enabled identity for token: 126.96.36.199:52675 -> 188.8.131.52:3389 act(DISCARD:)
"Unknown identity" is the value for my field msg_1
Then I want to add another value ("Deny udp") in msg_1:
2016-07-29 10:21:27 Local4.Warning 184.108.40.206 Jul 29 2016 15:25:16 Ent-FW : %ASA-4-106023: Deny udp src inside:220.127.116.11/514 dst outside:18.104.22.168/514 by access-group "insideaccessin" [0x0, 0x0]
But instead I get a lot of unintended values matched for that field.
Please note that these two search strings are located in different columns of the event, as you can see.
Examples of the unwanted values are listed here (this stands in for a screenshot of the top values of the field msg_1). As you can see, Unknown Identity is there, but so are many other values that we don't want:
Top 10 Values                  Count      %
Trusted Host insert            219,786    43.123%
Protected Resource accept      140,836    27.633%
Unknown Identity               84,839     16.646%
insideaccessin" [0x0, 0x0]     43,739     8.582%
hunsberger"                    703        0.138%
2016-07-24T12                  468        0.092%
2016-07-23T12                  430        0.084%
outsideaccessin" [0x0, 0x0]    418        0.082%
2016-07-24T06                  382        0.075%
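One possible way to keep those stray values out of msg_1 is to match on the literal phrases rather than on where they sit in the event. The sketch below is in Python, not Splunk, and it assumes that "Unknown Identity" and "Deny udp" are the only two values wanted. `MSG_1_RE` and `extract_msg_1` are hypothetical names, and the third event is invented purely to show a non-match.

```python
import re

# Assumption: msg_1 should only ever hold one of these two literal phrases.
MSG_1_RE = re.compile(r"(?P<msg_1>Unknown Identity|Deny udp)")

events = [
    # The two sample events from the question, copied as posted.
    '2016-07-12 21:47:49 Kernel.Warning 22.214.171.124 Jul 13 04:42:08 EIS-BR '
    'kernel: [55214.077676] id=TAC pri=6 func=wrlog_logger line=181 ctx=bump0 '
    'msg="Unknown Identity: no enabled identity for token: '
    '126.96.36.199:52675 -> 188.8.131.52:3389 act(DISCARD:)',
    '2016-07-29 10:21:27 Local4.Warning 184.108.40.206 Jul 29 2016 15:25:16 '
    'Ent-FW : %ASA-4-106023: Deny udp src inside:220.127.116.11/514 dst '
    'outside:18.104.22.168/514 by access-group "insideaccessin" [0x0, 0x0]',
    # Hypothetical event (not from the question) that should NOT match.
    'id=TAC pri=6 msg="Trusted Host insert"',
]


def extract_msg_1(event):
    """Return the msg_1 value for one event, or None if nothing matches."""
    match = MSG_1_RE.search(event)
    return match.group("msg_1") if match else None


values = [extract_msg_1(event) for event in events]
print(values)
```

Because the phrases are spelled out, it does not matter that they appear in different positions in the two events. Python writes the named group as `(?P<msg_1>...)`, while in a Splunk extraction it would be written `(?<msg_1>...)`. Since the two events come from different sources, the same extraction may need to be saved once per sourcetype, as described above.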