Splunk Search

Nginx log parsing

intachur
Explorer

Hi!

I would like to extract fields from my nginx access log which was configured so:

'[ $connection : $msec : $request_time : $bytes_sent ] '
'$remote_addr - $remote_user [$time_local] '
'"$request" $status $body_bytes_sent '
'"$http_referer" "$http_user_agent"'

Now I need to extract $connection, $msec, $request_time, $bytes_sent and probably $remote_addr values to Splunk fields to make some analysis. Could you please anybody give me an input how I can do it? I guess I have to use Regexp (rex command), but I wasn't successful with this command 😞

The sample of output is:

[ 533297 : 1333487468.121 : 1.170 : 380374 ] 127.0.0.0 - - [04/Apr/2012:01:11:08 +0400] "GET /data HTTP/1.1" 200 380136 "-" "-"

Thanks in advance
Sergey

Tags (1)
0 Karma
1 Solution

intachur
Explorer

The question is closed 🙂 My regexp is:

source="/var/log/nginx/access.log" | rex field=_raw "^\[\s+(?<connection>\d+)\s+:\s+(?<exec_time_msec>\d+.\d+)\s+:\s+(?<request_time>\d+.\d+)\s+:\s+(?<bytes_sent>\d+)\s+\]"

View solution in original post

intachur
Explorer

The question is closed 🙂 My regexp is:

source="/var/log/nginx/access.log" | rex field=_raw "^\[\s+(?<connection>\d+)\s+:\s+(?<exec_time_msec>\d+.\d+)\s+:\s+(?<request_time>\d+.\d+)\s+:\s+(?<bytes_sent>\d+)\s+\]"

awurster
Contributor

i added in something similar. our logs were same as the standard format described in the docs. i didn't know / care about the last two fields, however i grouped them together for you for future reference (they could easily be whacked off). note i also used their field names, rather than apache, squid, etc.

(?P<remote_addr>[\d\.]+)\s-\s(?P<remote_user>\S+)\s\[.+\]\s+(?<request>.+\sHTTP/\d\.\d)\s(?P<status>\d+)\s(?P<bytes_sent>\d+)\s\"(?P<http_referer>[^\"]+)\"\s\"(?P<http_user_agent>[^\"]+)\"\s(\S+)\s(\S+)
0 Karma

intachur
Explorer

First time I thought as you 8-) Nginx terms are strange sometimes, below the quote from Nginx docs:

$request_time - request processing time in seconds with a milliseconds resolution; time elapsed between the first bytes were read from the client and the log write after the last bytes were sent to the client

$msec - time in seconds with a milliseconds resolution at the time of log write

0 Karma

kristian_kolb
Ultra Champion

Are you sure tou haven't switched places between $msec and $request_time?

1333487468.121 looks like epoch to me...and it matches the timestamp in event rather well.

/k

0 Karma

kristian_kolb
Ultra Champion

Have you tried the Interactive Field Extractor (IFX)?

It's a wizard that can help you with your regex generation. Just look in the search app next to the timestamp of an event, there will be a sort of down-pointing arrow. Clicking it will give you the option to "Extract Fields".

For more information, see
http://docs.splunk.com/Documentation/Splunk/4.3.1/User/InteractiveFieldExtractionExample

Hope this helps,

Kristian

0 Karma

intachur
Explorer

Yes, I tried but it didn't give me that I need 😞

0 Karma
Get Updates on the Splunk Community!

Dashboards: Hiding charts while search is being executed and other uses for tokens

There are a couple of features of SimpleXML / Classic dashboards that can be used to enhance the user ...

Splunk Observability Cloud's AI Assistant in Action Series: Explaining Metrics and ...

This is the fourth post in the Splunk Observability Cloud’s AI Assistant in Action series that digs into how ...

Brains, Bytes, and Boston: Learn from the Best at .conf25

When you think of Boston, you might picture colonial charm, world-class universities, or even the crack of a ...