Carriage return newline (\r\n) not working as delimiter for makemv

jmartinf5
Engager

I am trying to break a field (httpRequest) into a multivalue field and then extract one of its values.

My search:

* | makemv delim="\r\n" httpRequest | eval userAgent=mvindex(httpRequest,1)  | table  clientIp  userAgent

Nothing shows up in the table for the userAgent field. But if I change the index number to 0 instead of 1, the entire httpRequest field value shows up as the value of userAgent.

It does not appear that makemv is honoring "\r\n" as the delimiter. I have tried escaping the backslashes with "\\r\\n" but the result is the same.

Further info...

The raw field looks like this:
"httpRequest":"GET / HTTP/1.1\r\nHost: somehost\r\nUser-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36\r\nContent-Length: 0\r\n\r\n"

And this field in the parsed json-formatted log looks like this:
httpRequest: GET / HTTP/1.1
Host: somehost
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36
Content-Length: 0

When I show the httpRequest field in a table it shows up like this:
GET / HTTP/1.1
Host: somehost
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36
Content-Length: 0

jmartinf5
Engager

Also, I would still like to know why "\r\n" is not a valid delimiter in this case.

I have documented (from using Splunk years ago) a nearly identical search string that worked just fine using "\r\n" as the delimiter for makemv.

Perhaps it has something to do with the fact that it is a JSON-format log?

Sukisen1981
Champion

hi @jmartinf5
I believe that's the way Splunk works as of now. Say, for example, someone wants to split on (or extract) text that involves r and n. Most people would write something like this: rex field=whatever...\r\n
This will return an extract up to the point where r and n are reached in a string like this: blah blah blah2233 r n.
To put another perspective on it: to escape a backslash in a Splunk regex you have to use 3 backslashes,
\\\ , why?
If we use just \ then there is no way to differentiate between this and the backslashes in \d+.+\w
So we just put \\ and it works, right? Wrong. Some special chars like this need to be 'escaped', so we need one additional backslash.
\\\ is interpreted thus: the first one is the pattern separator, common to all rexes; the second one is there to 'escape'; and the third one is the literal char.
Another example is escaping quotes. For example, I cannot do
| makeresults
| eval x=""some text""
This will give an error. I need to 'escape' the quotes, so this works:
| makeresults
| eval x="\"some text\""
and the output keeps the quotes; the output will be "some text"
It is a bit confusing, I agree, but it just takes some getting used to.

Sukisen1981
Champion

try this

|  rex mode=sed field=_raw "s/\\\r\\\n/*/g"
| makemv delim="*" _raw

somesoni2
Revered Legend

Give this a try

* | rex field=_raw mode=sed "s/([\r\n]+)/#LINEBREAK##/g" | makemv delim="#LINEBREAK##" httpRequest | eval userAgent=mvindex(httpRequest,1) | table clientIp userAgent

jwilk
Engager

I know this is a fairly old thread but you can also just use an actual linebreak in the search...

| makemv field=field delim="
"

jmartinf5
Engager

That didn't work. Still ended up with the same result. I think this is because the rex was on the _raw log and the makemv was on the parsed field. So I changed it a bit and got it to work.

* | rex field=httpRequest mode=sed "s/([\r\n]+)/#LINEBREAK##/g" | makemv delim="#LINEBREAK##" httpRequest | eval userAgent=mvindex(httpRequest,1) | table clientIp userAgent

However, the results then showed me that the User-Agent header isn't always the "1" index header. So I used mvfind to get the index of the UA header.

* | rex field=httpRequest mode=sed "s/([\r\n]+)/#LINEBREAK##/g" | makemv delim="#LINEBREAK##" httpRequest | eval n=mvfind(httpRequest,"[Uu]ser-[Aa]gent") | eval userAgent=mvindex(httpRequest,n) | table clientIp userAgent
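
For anyone who wants to try the mvfind idea without the original logs, here is a minimal sketch on synthetic data. It is built on assumptions: that the parsed field holds real CR LF characters (generated here with urldecode("%0D%0A")), and that eval's split() is an acceptable stand-in for the sed/makemv pair. The crlf and headers field names and the shortened User-Agent string are made up for illustration.

```spl
| makeresults
| eval crlf=urldecode("%0D%0A")
| eval httpRequest="GET / HTTP/1.1".crlf."Host: somehost".crlf."User-Agent: Mozilla/5.0".crlf."Content-Length: 0"
| eval headers=split(httpRequest, crlf)
| eval n=mvfind(headers, "(?i)^user-agent:")
| eval userAgent=mvindex(headers, n)
| table userAgent
```

If those assumptions hold, userAgent should come back as the User-Agent header line regardless of where it sits among the headers.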

somesoni2
Revered Legend

Let's take a different direction and see if this works for you.

* | rex field=httpRequest "(?<userAgent>[Uu]ser-[Aa]gent:[^\r\n]*)" | table clientIp userAgent
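
A rough way to check a rex like this without the real logs is to run it against a synthetic event. The sketch below assumes the parsed field contains real CR LF characters (built with urldecode("%0D%0A")) and captures from the header name to the end of that line only. The crlf field name and the shortened User-Agent string are invented for the example.

```spl
| makeresults
| eval crlf=urldecode("%0D%0A")
| eval httpRequest="GET / HTTP/1.1".crlf."Host: somehost".crlf."User-Agent: Mozilla/5.0".crlf."Content-Length: 0".crlf.crlf
| rex field=httpRequest "(?i)(?<userAgent>user-agent:[^\r\n]*)"
| table userAgent
```

Stopping the capture at the first CR or LF means the pattern does not depend on which header follows User-Agent.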