I am trying to break a field (httpRequest) into a multivalue field and then extract one of its values.
My search:
* | makemv delim="\r\n" httpRequest | eval userAgent=mvindex(httpRequest,1) | table clientIp userAgent
Nothing shows up in the table for the userAgent field. But if I change the index number to 0 instead of 1, the entire httpRequest field value shows up as the value of userAgent.
It does not appear that makemv is honoring the "\r\n" as the delimiter. I have tried escaping the backslashes with "\\r\\n" but the result is the same.
Further info...
The raw field looks like this:
"httpRequest":"GET / HTTP/1.1\r\nHost: somehost\r\nUser-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36\r\nContent-Length: 0\r\n\r\n"
And this field in the parsed json-formatted log looks like this:
httpRequest: GET / HTTP/1.1
Host: somehost
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36
Content-Length: 0
When I show the httpRequest field in a table it shows up like this:
GET / HTTP/1.1
Host: somehost
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36
Content-Length: 0
Also, I would still like to know why "\r\n" is not a valid delimiter in this case.
I have documented (from using Splunk years ago) a nearly identical search string that worked just fine using "\r\n" as the delimiter for makemv.
Perhaps it has something to do with the fact that it is a JSON-formatted log?
hi @jmartinf5
I believe that's the way Splunk works as of now. Say, for example, someone wants to split by (or extract) text that involves r and n; most people would write something like this: rex field=whatever...\r\n
This will return an extract up to the point where r and n are reached in a string like this: blah blah blah2233 r n.
To put it another way: to escape a backslash in a Splunk regex you have to use 3 backslashes,
\\\ . Why?
If we use just \ then there is no way to differentiate it from the backslashes in \d+.+\w
So we just put \\ and it works, right? Wrong. Some special chars like this need to be 'escaped', so we need an additional backslash.
\\\ is interpreted thus: the first one is the pattern separator, common to all rexes; the second one is the 'escape'; and the third one is the literal char.
Another example is escaping quotes. For example, I cannot do:
| makeresults
| eval x=""some text""
This will give an error. I need to 'escape' the quotes, so this works:
| makeresults
| eval x="\"some text\""
and the output keeps the quotes: the output will be "some text".
It is a bit confusing, I agree, but it just takes some getting used to.
Try this:
| rex mode=sed field=_raw "s/\\\r\\\n/*/g"
| makemv delim="*" _raw
Give this a try
* | rex field=_raw mode=sed "s/([\r\n]+)/#LINEBREAK##/g" | makemv delim="#LINEBREAK##" httpRequest | eval userAgent=mvindex(httpRequest,1) | table clientIp userAgent
I know this is a fairly old thread, but you can also just use an actual line break in the search:
| makemv delim="
" field
That didn't work; I still ended up with the same result. I think this is because the rex was on the _raw log and the makemv was on the parsed field. So I changed it a bit and got it to work.
* | rex field=httpRequest mode=sed "s/([\r\n]+)/#LINEBREAK##/g" | makemv delim="#LINEBREAK##" httpRequest | eval userAgent=mvindex(httpRequest,1) | table clientIp userAgent
However, the results then showed me that the User-Agent header isn't always at index 1. So I used mvfind to get the index of the UA header.
* | rex field=httpRequest mode=sed "s/([\r\n]+)/#LINEBREAK##/g" | makemv delim="#LINEBREAK##" httpRequest | eval n=mvfind(httpRequest,"[Uu]ser-[Aa]gent") | eval userAgent=mvindex(httpRequest,n) | table clientIp userAgent
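For anyone who wants to try the idea without the original data, here is a minimal run-anywhere sketch. It assumes the parsed httpRequest field holds real carriage-return/line-feed characters (as the table output above suggests) rather than a literal backslash-r-backslash-n. The sample request is made up, the field names crlf and headers are just placeholders, and urldecode("%0D%0A") is used only as a way to build a real CRLF inside eval. It swaps the sed/makemv pair for the eval split function, so it is a variation on the approach above and not the same search.

```
| makeresults
| eval crlf=urldecode("%0D%0A")
| eval httpRequest="GET / HTTP/1.1".crlf."Host: somehost".crlf."User-Agent: Mozilla/5.0".crlf."Content-Length: 0"
| eval headers=split(httpRequest, crlf)
| eval userAgent=mvindex(headers, mvfind(headers, "(?i)^user-agent:"))
| table userAgent
```

If the assumption holds, userAgent should come back as the User-Agent line regardless of where it sits among the headers, since mvfind looks it up by pattern and not by a fixed index.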
Let's take a different direction and see if this works for you. The capture stops at the first line break, so it does not depend on which header follows User-Agent.
* | rex field=httpRequest "(?<userAgent>[Uu]ser-[Aa]gent:[^\r\n]*)" | table clientIp userAgent