I am quite new to both regex and Splunk. When I did a field extraction for an image, I did not like the results, so I started modifying the regular expression myself. Instead of grabbing the data I want, it selects only the whitespace directly after that data. For instance, I am using this for my field extraction:
image\=([\w]+)(?)
I want it to grab the data after "image=".
And here is an example of some of the data I am looking at:
0.0.0.0 - - [12/Dec/2014:07:43:56 +1100] "GET /pdf.cfm?handle=pra&image=ImageName%20&src=Direct HTTP/1.1" 302 69 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
It only highlights what looks like a space, which does not actually exist, in between the ImageName and the '%'.
Can someone please let me know where I went wrong with my expression?
Hi,
Try this rex,
| rex field=_raw "image=(?<field1>.*)&"
to extract the "image" field, then use the urldecode() function to decode the URL-encoded characters (such as %20).
Sample search:
| stats count | eval _raw="0.0.0.0 - - [12/Dec/2014:07:43:56 +1100] \"GET /pdf.cfm?handle=pra&image=ImageName%20&src=Direct HTTP/1.1\" 302 69 \"-\" \"Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)\"" | rex field=_raw "image=(?<field1>.*)&" | eval field1=urldecode(field1)
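For anyone who wants to check the behaviour outside Splunk, here is a minimal Python sketch of the same idea. It assumes Python's re module behaves closely enough to Splunk's regex engine for these simple patterns. The function name extract_image and the [^&\s]* character class are illustrative choices, not part of the search above, and Python needs (?P<field1>...) where rex uses (?<field1>...).

```python
import re
from urllib.parse import unquote

# Sample event from the question.
LOG_LINE = (
    '0.0.0.0 - - [12/Dec/2014:07:43:56 +1100] '
    '"GET /pdf.cfm?handle=pra&image=ImageName%20&src=Direct HTTP/1.1" '
    '302 69 "-" "Mozilla/5.0 (compatible; bingbot/2.0; '
    '+http://www.bing.com/bingbot.htm)"'
)


def extract_image(line):
    """Return the URL-decoded value of the image parameter, or None.

    [^&\\s]* stops at the next '&' or whitespace, so it cannot run past
    the end of the parameter the way a greedy .* followed by & can when
    several parameters follow. (Hypothetical helper for illustration.)
    """
    match = re.search(r"image=(?P<field1>[^&\s]*)", line)
    if match is None:
        return None
    # unquote() plays the role of Splunk's urldecode(): %20 becomes a space.
    return unquote(match.group("field1"))


# \w does not match '%', so the original character class stops before %20.
word_only = re.search(r"image=([\w]+)", LOG_LINE).group(1)

# Captures ImageName%20 and decodes it, leaving a trailing space.
decoded = extract_image(LOG_LINE)

print(repr(word_only))
print(repr(decoded))
```

The decoded value keeps the space that %20 stands for, so a trim may still be wanted afterwards if the trailing space is not meaningful.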
Cheers!
Oops sorry, done now. Thanks again
Perfect! Thank you so much vasanthmss!
If it's useful, please accept the answer. Cheers!