Carriage return newline (\r\n) not working as delimiter for makemv

jmartinf5
Engager

I am trying to break a field (httpRequest) into a multivalue field and then extract one of its values.

My search:

* | makemv delim="\r\n" httpRequest | eval userAgent=mvindex(httpRequest,1)  | table  clientIp  userAgent

Nothing shows up in the table for the userAgent field. But if I change the index number to 0 instead of 1, the entire httpRequest field value shows up as the value of userAgent.

It does not appear that makemv is honoring "\r\n" as the delimiter. I have tried escaping the backslashes with "\\r\\n" but the result is the same.

Further info...

The raw field looks like this:
"httpRequest":"GET / HTTP/1.1\r\nHost: somehost\r\nUser-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36\r\nContent-Length: 0\r\n\r\n"

And this field in the parsed json-formatted log looks like this:
httpRequest: GET / HTTP/1.1
Host: somehost
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36
Content-Length: 0

When I show the httpRequest field in a table it shows up like this:
GET / HTTP/1.1
Host: somehost
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36
Content-Length: 0

jmartinf5
Engager

Also, I would still like to know why "\r\n" is not a valid delimiter in this case.

I have documented (from using Splunk years ago) a nearly identical search string that worked just fine using "\r\n" as the delimiter for makemv.

Perhaps it has something to do with the log being in JSON format?
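
One plausible reading of the JSON question, offered as an assumption rather than confirmed Splunk behavior: once the event is parsed as JSON, the \r\n escapes in the raw text become real carriage-return and line-feed characters, so a delimiter made of the literal characters backslash, r, backslash, n no longer occurs in the field. The sketch below shows that effect in Python (not SPL); the sample is the raw field from the question with the User-Agent value shortened for readability.

```python
import json

# Raw event fragment from the question, wrapped in braces so it parses as JSON.
# In this raw text, each \r\n is four characters: backslash, r, backslash, n.
raw_event = r'{"httpRequest":"GET / HTTP/1.1\r\nHost: somehost\r\nUser-Agent: Mozilla/5.0\r\nContent-Length: 0\r\n\r\n"}'

# JSON decoding turns each escape into a real CR LF pair (two characters).
parsed = json.loads(raw_event)["httpRequest"]

literal_delim = "\\r\\n"  # backslash, r, backslash, n
real_delim = "\r\n"       # carriage return, line feed

parts_literal = parsed.split(literal_delim)  # delimiter never occurs
parts_real = parsed.split(real_delim)        # splits on each header line

print(len(parts_literal))
print(parts_real[2])
```

Splitting on the literal four-character sequence leaves the value in one piece, which would match the symptom of index 0 returning the entire field and index 1 returning nothing.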

Sukisen1981
Champion

hi @jmartinf5
I believe that's just how Splunk works as of now. Say someone wants to split on (or extract) text that involves \r and \n. Most people would write something like this: rex field=whatever...\r\n
That returns the text up to the point where a literal r and n are reached, in a string like this: blah blah blah2233 r n.
To put it another way, to escape a backslash in a Splunk regex you have to use three backslashes:
\\\
Why? If we used just \ there would be no way to tell it apart from the backslashes in \d+.+\w
So we just put \\ and it works, right? Wrong. Special characters like this need to be 'escaped', so we need one more backslash.
\\\ is interpreted as follows: the first is the pattern separator, common to all rexes; the second is the 'escape'; and the third is the literal character.
Another example is escaping quotes. For example, I cannot do
| makeresults
| eval x=""some text""
This gives an error. I need to 'escape' the quotes, so this works:
| makeresults
| eval x="\"some text\""
and the output keeps the quotes: "some text"
It is a bit confusing, I agree, but it just takes some getting used to.

Sukisen1981
Champion

try this

|  rex mode=sed field=_raw "s/\\\r\\\n/*/g"
| makemv delim="*" _raw

somesoni2
Revered Legend

Give this a try

* | rex field=_raw mode=sed "s/([\r\n]+)/#LINEBREAK##/g" | makemv delim="#LINEBREAK##" httpRequest | eval userAgent=mvindex(httpRequest,1) | table clientIp userAgent

jwilk
Engager

I know this is a fairly old thread but you can also just use an actual linebreak in the search...

| makemv field=field delim="
"

jmartinf5
Engager

That didn't work. Still ended up with the same result. I think this is because the rex was on the _raw log and the makemv was on the parsed field. So I changed it a bit and got it to work.

* | rex field=httpRequest mode=sed "s/([\r\n]+)/#LINEBREAK##/g" | makemv delim="#LINEBREAK##" httpRequest | eval userAgent=mvindex(httpRequest,1) | table clientIp userAgent

However, the results then showed me that the User-Agent header isn't always at index 1. So I used mvfind to get the index of the UA header.

* | rex field=httpRequest mode=sed "s/([\r\n]+)/#LINEBREAK##/g" | makemv delim="#LINEBREAK##" httpRequest | eval n=mvfind(httpRequest,"[Uu]ser-[Aa]gent") | eval userAgent=mvindex(httpRequest,n) | table clientIp userAgent
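
For readers who want to see the logic outside Splunk, here is a minimal Python sketch of the same approach: split the request on runs of CR/LF, then look the header up by name instead of by position. The function name find_header and the sample string are hypothetical, made up for illustration.

```python
import re

def find_header(http_request, name):
    """Return the first header line whose name matches, ignoring case."""
    # Split on runs of CR/LF, mirroring the [\r\n]+ pattern in the search above.
    lines = [line for line in re.split(r"[\r\n]+", http_request) if line]
    prefix = name.lower() + ":"
    for line in lines:
        if line.lower().startswith(prefix):
            return line
    return None

# Hypothetical request where User-Agent is not the second line.
sample = "GET / HTTP/1.1\r\nHost: somehost\r\nContent-Length: 0\r\nuser-agent: Mozilla/5.0\r\n\r\n"
user_agent = find_header(sample, "User-Agent")
print(user_agent)
```

Unlike the mvfind pattern, which can match anywhere in a value, this sketch only matches at the start of a line, so a header value that merely mentions "user-agent" is not picked up by mistake.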

somesoni2
Revered Legend

Let's take a different direction and see if this works for you.

* |  rex field=httpRequest "(?<userAgent>[Uu]ser-[Aa]gent:[^\r\n]*)" | table clientIp userAgent
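
The direct-extraction idea can be sketched in Python as well: capture the User-Agent header up to the end of its line, with no multivalue step. This is an illustration only; Python spells the named group (?P<userAgent>...) rather than (?<userAgent>...), and the sample string is a shortened, hypothetical request.

```python
import re

# Capture from the header name up to, but not including, the next CR or LF.
UA_PATTERN = re.compile(r"(?P<userAgent>[Uu]ser-[Aa]gent:[^\r\n]*)")

sample = (
    "GET / HTTP/1.1\r\n"
    "Host: somehost\r\n"
    "User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64)\r\n"
    "Content-Length: 0\r\n\r\n"
)

match = UA_PATTERN.search(sample)
user_agent = match.group("userAgent") if match else None
print(user_agent)
```

Because the character class excludes both CR and LF, the capture stops at the end of the header line no matter which header follows it.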