Nginx log parsing

intachur
Explorer

Hi!

I would like to extract fields from my nginx access log, which is configured like this:

'[ $connection : $msec : $request_time : $bytes_sent ] '
'$remote_addr - $remote_user [$time_local] '
'"$request" $status $body_bytes_sent '
'"$http_referer" "$http_user_agent"'

Now I need to extract the $connection, $msec, $request_time, $bytes_sent and probably $remote_addr values into Splunk fields for some analysis. Could anybody give me a hint on how to do this? I guess I have to use a regular expression (the rex command), but I haven't been successful with it 😞

A sample of the output:

[ 533297 : 1333487468.121 : 1.170 : 380374 ] 127.0.0.0 - - [04/Apr/2012:01:11:08 +0400] "GET /data HTTP/1.1" 200 380136 "-" "-"

Thanks in advance
Sergey


intachur
Explorer

The question is closed 🙂 My regexp is:

source="/var/log/nginx/access.log" | rex field=_raw "^\[\s+(?<connection>\d+)\s+:\s+(?<exec_time_msec>\d+\.\d+)\s+:\s+(?<request_time>\d+\.\d+)\s+:\s+(?<bytes_sent>\d+)\s+\]"
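
For anyone who wants to check a pattern like this outside Splunk, here is a minimal Python sketch run against the sample event from the question. It is only an illustration, not part of the original solution: it escapes the decimal points, uses Python's `(?P<name>...)` syntax for named groups instead of `(?<name>...)`, and adds a `remote_addr` group that the rex above does not capture. The function name `extract_prefix` is made up for this example.

```python
import re

# Sample event copied from the question.
SAMPLE = (
    '[ 533297 : 1333487468.121 : 1.170 : 380374 ] 127.0.0.0 - - '
    '[04/Apr/2012:01:11:08 +0400] "GET /data HTTP/1.1" 200 380136 "-" "-"'
)

# Same structure as the rex pattern, with the decimal points escaped.
# The remote_addr group is an addition for illustration only.
PREFIX = re.compile(
    r"^\[\s+(?P<connection>\d+)\s+:\s+(?P<exec_time_msec>\d+\.\d+)\s+:\s+"
    r"(?P<request_time>\d+\.\d+)\s+:\s+(?P<bytes_sent>\d+)\s+\]\s+"
    r"(?P<remote_addr>\S+)"
)


def extract_prefix(line):
    """Return the captured fields as a dict, or None if the line does not match."""
    match = PREFIX.match(line)
    return match.groupdict() if match else None


fields = extract_prefix(SAMPLE)
print(fields)
```

All values come back as strings, so anything numeric (for example `request_time`) would still need converting before doing arithmetic on it.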

awurster
Contributor

I added something similar. Our logs were the same as the standard format described in the docs. I didn't know or care about the last two fields, but I grouped them together for you for future reference (they could easily be dropped). Note that I also used nginx's field names rather than those of Apache, Squid, etc.

(?P<remote_addr>[\d\.]+)\s-\s(?P<remote_user>\S+)\s\[.+\]\s+(?<request>.+\sHTTP/\d\.\d)\s(?P<status>\d+)\s(?P<bytes_sent>\d+)\s\"(?P<http_referer>[^\"]+)\"\s\"(?P<http_user_agent>[^\"]+)\"\s(\S+)\s(\S+)

intachur
Explorer

At first I thought the same as you 8-) Nginx terms are strange sometimes. Below is a quote from the Nginx docs:

$request_time - request processing time in seconds with a milliseconds resolution; time elapsed between the first bytes were read from the client and the log write after the last bytes were sent to the client

$msec - time in seconds with a milliseconds resolution at the time of log write
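
As a rough sanity check of that reading, the sketch below converts the $msec value from the sample event into a timestamp in the +0400 offset and compares it with the $time_local value of the same event. The numbers are taken from the sample line in the question; the variable names are only illustrative, and the estimated request start assumes the two values can simply be subtracted.

```python
from datetime import datetime, timedelta, timezone

MSEC = 1333487468.121   # $msec from the sample event (epoch seconds)
REQUEST_TIME = 1.170    # $request_time from the sample event (seconds)

# $time_local in the sample event is 04/Apr/2012:01:11:08 +0400.
tz = timezone(timedelta(hours=4))
time_local = datetime(2012, 4, 4, 1, 11, 8, tzinfo=tz)

# Interpret $msec as the epoch time of the log write.
log_write = datetime.fromtimestamp(MSEC, tz=tz)

# Estimated moment the first bytes were read from the client.
request_start = log_write - timedelta(seconds=REQUEST_TIME)

# True when $msec and $time_local fall within the same second.
same_second = log_write.replace(microsecond=0) == time_local

print(log_write.isoformat(), same_second)
```

If `same_second` comes out true, $msec is the wall-clock time of the log write and $request_time is the duration, which is how the field names in the rex should be read.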


kristian_kolb
Ultra Champion

Are you sure you haven't switched places between $msec and $request_time?

1333487468.121 looks like epoch to me...and it matches the timestamp in event rather well.

/k


kristian_kolb
Ultra Champion

Have you tried the Interactive Field Extractor (IFX)?

It's a wizard that can help you with your regex generation. Just look in the search app next to the timestamp of an event, there will be a sort of down-pointing arrow. Clicking it will give you the option to "Extract Fields".

For more information, see
http://docs.splunk.com/Documentation/Splunk/4.3.1/User/InteractiveFieldExtractionExample

Hope this helps,

Kristian


intachur
Explorer

Yes, I tried, but it didn't give me what I need 😞
