Splunk Search

Regular expressions to match a specific string for field extraction

kstam2
New Member

I have this type of log file:

182.236.164.11 - - [04/Mar/2019:18:20:56] "GET /cart.do?action=addtocart&itemId=EST-15&productId=BS-AG-G09&JSESSIONID=SD6SL8FF10ADFF53101 HTTP 1.1" 200 2252 "http://www.buttercupgames.com/oldlink?itemId=EST-15" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_4) AppleWebKit/536.5 (KHTML, like Gecko) Chrome/19.0.1084.46 Safari/536.5" 506

I am trying to create a regular expression that matches only the word Intel, regardless of where it appears in the event, so that I can extract it as a field.

I came up with this regular expression using the automated regex generator in Splunk:

^[^;\n]*;\s+

But it doesn't always work as it will match other strings as well.

I want to match the string Intel only so as to create a field in Splunk.

I have also tried the following pattern to match only that word, but to no avail:

\bIntel\(?P<CPU>\w+)

Any inputs are welcome.


woodcock
Esteemed Legend

Like this:

| makeresults
| eval _raw="182.236.164.11 - - [04/Mar/2019:18:20:56] \"GET /cart.do?action=addtocart&itemId=EST-15&productId=BS-AG-G09&JSESSIONID=SD6SL8FF10ADFF53101 HTTP 1.1\" 200 2252 \"http://www.buttercupgames.com/oldlink?itemId=EST-15\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_4) AppleWebKit/536.5 (KHTML, like Gecko) Chrome/19.0.1084.46 Safari/536.5\" 506"
| rex "^(?:\S+\s+){4}\"[^\"]+\"\s*(?:\S+\s+){2}(?:\"[^\"]+\"\s*)\"(?<useragent>[^\"]+)"
| rename COMMENT AS "You should already have a 'useragent' field"
| rex field=useragent ";\s+(?<CPU>\S+)"
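
To check the two-step idea outside Splunk, here is a minimal Python sketch of the same two patterns. It is only an approximation of what rex does: Python's re module needs named groups written as (?P<name>...), and the helper name extract_cpu is made up for this example.

```python
import re

# Sample event from the question.
raw = (
    '182.236.164.11 - - [04/Mar/2019:18:20:56] '
    '"GET /cart.do?action=addtocart&itemId=EST-15&productId=BS-AG-G09'
    '&JSESSIONID=SD6SL8FF10ADFF53101 HTTP 1.1" 200 2252 '
    '"http://www.buttercupgames.com/oldlink?itemId=EST-15" '
    '"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_4) AppleWebKit/536.5 '
    '(KHTML, like Gecko) Chrome/19.0.1084.46 Safari/536.5" 506'
)

# Step 1: same pattern as the first rex, rewritten with (?P<name>...).
USERAGENT_RE = re.compile(
    r'^(?:\S+\s+){4}"[^"]+"\s*(?:\S+\s+){2}(?:"[^"]+"\s*)"(?P<useragent>[^"]+)'
)
# Step 2: same pattern as the second rex, applied to the useragent value only.
CPU_RE = re.compile(r';\s+(?P<CPU>\S+)')


def extract_cpu(event):
    """Return the token after the first ';' in the user agent, or None."""
    ua = USERAGENT_RE.search(event)
    if ua is None:
        return None
    cpu = CPU_RE.search(ua.group("useragent"))
    return cpu.group("CPU") if cpu else None


print(extract_cpu(raw))  # Intel
```

Restricting the second pattern to the useragent value is what keeps it from matching a stray semicolon elsewhere in the event.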

renjith_nair
SplunkTrust

@kstam2

If you want to literally search for the string Intel, |rex field=_raw "(?<CPU>Intel)" should work. However, that does not make much sense, since you can just use eval CPU="Intel" if the value is always "Intel". If you are trying to find the CPU type, then you should probably try

\(\w+;\s+(?<CPU>\w+)

If you have other OS types and different event formats, please share more samples so that the regex can be adjusted to your needs.
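
As a rough illustration of why more samples matter, the Python sketch below (named groups written as (?P<name>...) for Python's re module) compares two placements of \s+ and then tries the pattern on a second user agent. The Windows string is an assumed example, not taken from the poster's data.

```python
import re

# User-agent fragment from the question.
ua_mac = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_4) AppleWebKit/536.5"
# Assumed example of a different format, for illustration only.
ua_win = "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36"

# \s+ inside the group: the captured value keeps the leading space.
ws_inside = re.compile(r'\(\w+;(?P<CPU>\s+\w+)')
# \s+ outside the group: only the word itself is captured.
ws_outside = re.compile(r'\(\w+;\s+(?P<CPU>\w+)')

print(repr(ws_inside.search(ua_mac).group("CPU")))   # ' Intel'
print(repr(ws_outside.search(ua_mac).group("CPU")))  # 'Intel'

# "(Windows NT 6.1;" has spaces before the ';', so \(\w+; cannot match.
print(ws_outside.search(ua_win))  # None
```

A pattern tuned to one sample can silently extract nothing for other platforms, so it is worth testing against every user-agent shape that appears in the index.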

nickhills
Ultra Champion

If you just want to match the word 'intel' this will do it:
(?P<cpu>[iI][nN][tT][eE][lL])

If you want to pull more out of the user agent you could also use something like:
\s\((?P<platform>\w+)\;\s(?P<arch>\w+)\s(?P<os>[^\)]+)\)
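
A quick way to see what the three groups would capture is to run the pattern in Python, which accepts the same (?P<name>...) syntax. This is only a sketch against the sample user agent from the question; the group names are spelled platform, arch and os here.

```python
import re

# User agent from the sample event in the question.
useragent = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_4) AppleWebKit/536.5 "
    "(KHTML, like Gecko) Chrome/19.0.1084.46 Safari/536.5"
)

# One word before the ';', one word after it, then everything up to ')'.
UA_RE = re.compile(r'\s\((?P<platform>\w+)\;\s(?P<arch>\w+)\s(?P<os>[^\)]+)\)')

match = UA_RE.search(useragent)
fields = match.groupdict() if match else {}
print(fields)
# {'platform': 'Macintosh', 'arch': 'Intel', 'os': 'Mac OS X 10_7_4'}
```

The second parenthesised part, (KHTML, like Gecko), is not picked up because it contains no semicolon.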

If my comment helps, please give it a thumbs up!

manuelostertag
Path Finder

If you want to make it shorter, you could also use (?i), where i stands for case-insensitive matching (it ignores the case of [a-zA-Z]):

(?i)(?<cpu>intel)

Here you could test it:
https://regex101.com/r/3pCOHf/1
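
For a local check of the (?i) flag, here is a small Python sketch (Python spells the named group (?P<cpu>...)). The second and third sample strings are invented for illustration. The \b variant is an optional tightening for the case where only the whole word should match, as the question asks.

```python
import re

# Case-insensitive match anywhere in the text.
loose = re.compile(r'(?i)(?P<cpu>intel)')
# Same, but anchored to word boundaries so longer words are skipped.
strict = re.compile(r'(?i)\b(?P<cpu>intel)\b')

from_question = "(Macintosh; Intel Mac OS X 10_7_4)"
upper_case = "(macintosh; INTEL mac os x)"   # invented sample
longer_word = "Intelligent agent"            # invented sample

print(loose.search(from_question).group("cpu"))  # Intel
print(loose.search(upper_case).group("cpu"))     # INTEL
print(loose.search(longer_word).group("cpu"))    # Intel (inside a longer word)
print(strict.search(longer_word))                # None
```

The captured value keeps whatever casing the event had, so a lower() or upper() in a later eval may be needed if the field values should be uniform.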


harsmarvania57
SplunkTrust

Why do you need a regex? If you only want to find events that contain the word Intel in the raw data, you can use the query below:

index=blabla sourcetype=abcxyz "Intel"