Splunk Search

How do I use rex to extract a field that may contain an ampersand?

saulverde
Path Finder

I have a non-standardized field in one of the logs that we pull, and I am building an inline rex string to extract it. The string below extracts every value except one that it should also match: the entry containing an ampersand ("&").

regex: \[Site:\s(?<site>\w\s[[-\s\w+\s]+|[-\s\w+\s\w+\s]+]-[\s\w+|\s\w.{1,3}])\]

Data not extracted with this regex: D - External Subnets - AT&T

I have tried the following:

\[Site:\s(?<site>\w\s[[-\s\w+\s]+|[-\s\w+\s\w+\s]+]-[\s\w+|\s\w+&\w])\]
\[Site:\s(?<site>\w\s[[-\s\w+\s]+|[-\s\w+\s\w+\s]+]-[\s\w+|\s\w+"&"\w])\]
\[Site:\s(?<site>\w\s[[-\s\w+\s]+|[-\s\w+\s\w+\s]+]-[\s\w+|\s\w+\&\w])\]
\[Site:\s(?<site>\w\s[[-\s\w+\s]+|[-\s\w+\s\w+\s]+]-[\s\w+|\s\w+\\&\w])\]
\[Site:\s(?<site>\w\s[[-\s\w+\s]+|[-\s\w+\s\w+\s]+]-[\s\w+|\s\w+\\\&\w])\] - I tried this after researching some perl coding suggestions

I believe this is because the ampersand is used to repeat the previously matched pattern. I'm not sure how to escape the ampersand so that it reads as a literal value. I also haven't been able to find any reference for a specific character sequence to use in place of the ampersand when searching for it.

Thanks for any help you can offer.

1 Solution

richgalloway
SplunkTrust

This should do the job. It will catch everything between "[Site: " and "]".

"\[Site:\s*(?P<site>.*)\]"
---
If this reply helps you, an upvote would be appreciated.

0 Karma

saulverde
Path Finder

Thank you.

When I use that, it also starts pulling from the adjacent field, which is the IP, so I end up with far too many unique values.
Sample data with adjacent field:
[Site: V - A - VLAN 213 - Full] [XXX.XXX.XXX.XXX]
Field values being pulled now:
V - A - VLAN 213 - Full] [xxx.xxx.xxx.xxx

0 Karma

richgalloway
SplunkTrust

Making the quantifier less greedy should fix that.

"\[Site:\s*(?P<site>.*?)\]"
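
To see why the "?" matters, here is a rough sketch in Python. Python's re module is only a stand-in for Splunk's PCRE-based rex here, so treat it as an illustration rather than exactly what Splunk does internally; the sample line is the one posted above. The greedy .* runs to the last "]" on the line, while the lazy .*? stops at the first one.

```python
import re

# Sample event from this thread: the Site field followed by the IP field.
line = "[Site: V - A - VLAN 213 - Full] [XXX.XXX.XXX.XXX]"

# Greedy: .* grabs as much as possible, then backs off to the LAST "]".
greedy = re.search(r"\[Site:\s*(?P<site>.*)\]", line)

# Lazy: .*? grabs as little as possible, so it stops at the FIRST "]".
lazy = re.search(r"\[Site:\s*(?P<site>.*?)\]", line)

greedy_site = greedy.group("site")
lazy_site = lazy.group("site")

print(greedy_site)  # V - A - VLAN 213 - Full] [XXX.XXX.XXX.XXX
print(lazy_site)    # V - A - VLAN 213 - Full
```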
---
If this reply helps you, an upvote would be appreciated.

saulverde
Path Finder

That worked perfectly, thanks. I'll look up the trailing "?" to see why it solved the problem. Thank you very much for your help.
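
For anyone landing here later: a different technique that should give the same result is a negated character class, [^\]]*, which can never run past the closing bracket in the first place. This is not what was suggested above, just an alternative sketch, again using Python's re as a stand-in for rex. The helper name extract_site is made up for illustration.

```python
import re

# Negated character class: match any run of characters that are not "]".
# This is an alternative to the lazy quantifier, not the accepted answer.
SITE_RE = re.compile(r"\[Site:\s*(?P<site>[^\]]*)\]")


def extract_site(line):
    """Return the Site value from a log line, or None if there is none.

    Hypothetical helper, named here only for the example.
    """
    match = SITE_RE.search(line)
    return match.group("site") if match else None


print(extract_site("[Site: D - External Subnets - AT&T] [XXX.XXX.XXX.XXX]"))
print(extract_site("[Site: V - A - VLAN 213 - Full] [XXX.XXX.XXX.XXX]"))
```

Assuming the regex engine accepts it, the same pattern could presumably be dropped into rex unchanged.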

0 Karma

richgalloway
SplunkTrust

Can you supply some sample data? If not, what terminates the Site field?

---
If this reply helps you, an upvote would be appreciated.
0 Karma

saulverde
Path Finder

The closing square bracket is the termination of the value in the log.

Here are a couple of examples. Like I said, the field doesn't have a standardized naming convention, so I did my best with the regex above, which catches everything except the value that includes the ampersand.

Sample data that I need to extract:
[Site: V - A - VLAN 213 - Full]
[Site: D - External Subnets - AT&T]
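
A side note on the ampersand itself: in PCRE-style regular expressions "&" is not a metacharacter, so it needs no escaping. The more likely reason "AT&T" is skipped is that \w only covers letters, digits, and underscore, so a pattern built from \w and \s has nothing that can match "&". A quick check, sketched in Python's re as a stand-in for rex:

```python
import re

value = "AT&T"

# \w+ cannot span the whole value because "&" is not a word character.
word_only = re.fullmatch(r"\w+", value)

# Adding "&" to the character class, unescaped, is enough to match it.
with_amp = re.fullmatch(r"[\w&]+", value)

# A bare "&" in a pattern simply matches a literal ampersand.
literal_amp = re.search(r"&", value)

print(word_only)            # None
print(with_amp.group(0))    # AT&T
print(literal_amp.start())  # 2
```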

0 Karma