Nginx log parsing

intachur
Explorer

Hi!

I would like to extract fields from my nginx access log, which uses the following log_format:

'[ $connection : $msec : $request_time : $bytes_sent ] '
'$remote_addr - $remote_user [$time_local] '
'"$request" $status $body_bytes_sent '
'"$http_referer" "$http_user_agent"'

Now I need to extract the $connection, $msec, $request_time, $bytes_sent and probably $remote_addr values into Splunk fields so I can do some analysis. Could anybody give me a hint on how to do this? I guess I have to use a regular expression (the rex command), but I haven't been successful with it 😞

A sample log line:

[ 533297 : 1333487468.121 : 1.170 : 380374 ] 127.0.0.0 - - [04/Apr/2012:01:11:08 +0400] "GET /data HTTP/1.1" 200 380136 "-" "-"

Thanks in advance
Sergey

intachur
Explorer

The question is closed 🙂 My regexp is:

source="/var/log/nginx/access.log" | rex field=_raw "^\[\s+(?<connection>\d+)\s+:\s+(?<exec_time_msec>\d+\.\d+)\s+:\s+(?<request_time>\d+\.\d+)\s+:\s+(?<bytes_sent>\d+)\s+\]"
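
For anyone who wants to sanity-check a pattern like this outside Splunk, here is a minimal Python sketch that runs it against the sample line from the question. It is only an illustration, not part of the Splunk search: Python's `re` module needs the `(?P<name>...)` spelling for named groups, the dots are escaped so they match a literal ".", and the helper name `extract_prefix` is made up for this example.

```python
import re

# Sample event copied from the question.
SAMPLE = ('[ 533297 : 1333487468.121 : 1.170 : 380374 ] 127.0.0.0 - - '
          '[04/Apr/2012:01:11:08 +0400] "GET /data HTTP/1.1" 200 380136 "-" "-"')

# Same structure as the rex pattern, using the field names from that command.
PREFIX_RE = re.compile(
    r"^\[\s+(?P<connection>\d+)"
    r"\s+:\s+(?P<exec_time_msec>\d+\.\d+)"
    r"\s+:\s+(?P<request_time>\d+\.\d+)"
    r"\s+:\s+(?P<bytes_sent>\d+)\s+\]"
)

def extract_prefix(line):
    """Return the bracketed prefix fields as a dict, or None if the line does not match."""
    match = PREFIX_RE.search(line)
    return match.groupdict() if match else None

fields = extract_prefix(SAMPLE)
print(fields)
```

All values come back as strings, so they would still need converting (for example with `float()`) before doing arithmetic on them.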

awurster
Contributor

I added something similar. Our logs use the standard format described in the docs. I didn't know or care about the last two fields, but I captured them as unnamed groups for future reference (they can easily be dropped). Note that I also used nginx's own field names rather than the Apache or Squid ones.

(?P<remote_addr>[\d\.]+)\s-\s(?P<remote_user>\S+)\s\[.+\]\s+(?P<request>.+\sHTTP/\d\.\d)\s(?P<status>\d+)\s(?P<bytes_sent>\d+)\s\"(?P<http_referer>[^\"]+)\"\s\"(?P<http_user_agent>[^\"]+)\"\s(\S+)\s(\S+)
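
As a rough illustration of how the rest of the line could be picked apart, here is a Python sketch that applies a pattern in the same spirit to the sample event from the question. It is an adaptation, not the regex above: it assumes the request is wrapped in double quotes (as in the `log_format` shown in the question), it captures `$time_local` as well, it names the size field `body_bytes_sent` after the nginx variable, and it leaves out the two trailing unnamed groups because the sample line has no fields after the user agent.

```python
import re

# Sample event copied from the question.
SAMPLE = ('[ 533297 : 1333487468.121 : 1.170 : 380374 ] 127.0.0.0 - - '
          '[04/Apr/2012:01:11:08 +0400] "GET /data HTTP/1.1" 200 380136 "-" "-"')

# Adapted pattern (an assumption for this sketch): quoted request,
# no trailing fields, group names taken from the nginx variables.
ACCESS_RE = re.compile(
    r'(?P<remote_addr>[\d.]+)\s-\s(?P<remote_user>\S+)\s'
    r'\[(?P<time_local>[^\]]+)\]\s+'
    r'"(?P<request>[^"]*)"\s'
    r'(?P<status>\d+)\s(?P<body_bytes_sent>\d+)\s'
    r'"(?P<http_referer>[^"]*)"\s"(?P<http_user_agent>[^"]*)"'
)

def parse_access_part(line):
    """Return the address, request, and status fields as a dict, or None if nothing matches."""
    match = ACCESS_RE.search(line)
    return match.groupdict() if match else None

parsed = parse_access_part(SAMPLE)
print(parsed)
```

Because the pattern is not anchored with `^`, it skips over the bracketed prefix and starts matching at the client address.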

intachur
Explorer

At first I thought the same as you 8-) Nginx terms are sometimes strange. Here is the quote from the Nginx docs:

$request_time - request processing time in seconds with a milliseconds resolution; time elapsed between the first bytes were read from the client and the log write after the last bytes were sent to the client

$msec - time in seconds with a milliseconds resolution at the time of log write
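
The two definitions can be checked against the sample event with a small Python sketch. It treats the `$msec` value as a Unix epoch timestamp and compares it with the `[$time_local]` value from the same line; the helper names are made up for this example, and the `+0400` offset is taken from the sample line.

```python
from datetime import datetime, timedelta, timezone

# Values copied from the sample event in the question.
MSEC = 1333487468.121                      # $msec
TIME_LOCAL = "04/Apr/2012:01:11:08 +0400"  # $time_local

def parse_time_local(text):
    """Parse nginx's $time_local format into a timezone-aware datetime."""
    return datetime.strptime(text, "%d/%b/%Y:%H:%M:%S %z")

def msec_to_datetime(msec, offset_hours):
    """Interpret $msec as epoch seconds and render it in the given UTC offset."""
    return datetime.fromtimestamp(msec, tz=timezone(timedelta(hours=offset_hours)))

logged = parse_time_local(TIME_LOCAL)
written = msec_to_datetime(MSEC, 4)

# Difference between the two timestamps, in seconds.
skew = abs((written - logged).total_seconds())
print(written.isoformat(), skew)
```

The two timestamps agree to within a second, which fits `$msec` being the time of the log write and `$request_time` (1.170 here) being the duration of the request.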

kristian_kolb
Ultra Champion

Are you sure you haven't swapped $msec and $request_time?

1333487468.121 looks like epoch to me...and it matches the timestamp in event rather well.

/k

kristian_kolb
Ultra Champion

Have you tried the Interactive Field Extractor (IFX)?

It's a wizard that can help you with your regex generation. Just look in the search app next to the timestamp of an event, there will be a sort of down-pointing arrow. Clicking it will give you the option to "Extract Fields".

For more information, see
http://docs.splunk.com/Documentation/Splunk/4.3.1/User/InteractiveFieldExtractionExample

Hope this helps,

Kristian

intachur
Explorer

Yes, I tried it, but it didn't give me what I need 😞
