Nginx log parsing

intachur
Explorer

Hi!

I would like to extract fields from my nginx access log, which is configured as follows:

'[ $connection : $msec : $request_time : $bytes_sent ] '
'$remote_addr - $remote_user [$time_local] '
'"$request" $status $body_bytes_sent '
'"$http_referer" "$http_user_agent"'

Now I need to extract the $connection, $msec, $request_time, $bytes_sent and probably $remote_addr values into Splunk fields for some analysis. Could anybody please give me a hint on how to do it? I guess I have to use a regular expression (the rex command), but I haven't been successful with it 😞

A sample of the output:

[ 533297 : 1333487468.121 : 1.170 : 380374 ] 127.0.0.0 - - [04/Apr/2012:01:11:08 +0400] "GET /data HTTP/1.1" 200 380136 "-" "-"

Thanks in advance
Sergey


intachur
Explorer

The question is closed 🙂 My regexp is:

source="/var/log/nginx/access.log" | rex field=_raw "^\[\s+(?<connection>\d+)\s+:\s+(?<exec_time_msec>\d+\.\d+)\s+:\s+(?<request_time>\d+\.\d+)\s+:\s+(?<bytes_sent>\d+)\s+\]"
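
To sanity-check the pattern outside Splunk, here is a minimal Python sketch that applies an equivalent expression to the sample line from the question. It is not part of the original answer: it assumes Python's `re` module (which needs the `(?P<name>...)` spelling for named groups), escapes the decimal points, and adds a hypothetical `remote_addr` group because the question mentions that value as well.

```python
import re

# Sample event copied from the question.
line = ('[ 533297 : 1333487468.121 : 1.170 : 380374 ] 127.0.0.0 - - '
        '[04/Apr/2012:01:11:08 +0400] "GET /data HTTP/1.1" 200 380136 "-" "-"')

# Same structure as the rex pattern, rewritten for Python's re module.
# The trailing remote_addr group is an illustrative addition.
pattern = re.compile(
    r'^\[\s+(?P<connection>\d+)\s+:\s+(?P<exec_time_msec>\d+\.\d+)'
    r'\s+:\s+(?P<request_time>\d+\.\d+)\s+:\s+(?P<bytes_sent>\d+)\s+\]'
    r'\s+(?P<remote_addr>\S+)'
)

match = pattern.match(line)
fields = match.groupdict() if match else {}
print(fields)
```

If the pattern does not match, `fields` stays empty, which makes it easy to spot a format mismatch before moving the expression back into `rex`.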

awurster
Contributor

I added something similar. Our logs used the standard format described in the docs. I didn't know or care about the last two fields, but I grouped them together for future reference (they can easily be dropped). Note that I also used the nginx field names rather than those from Apache, Squid, etc.

(?P<remote_addr>[\d\.]+)\s-\s(?P<remote_user>\S+)\s\[.+\]\s+(?<request>.+\sHTTP/\d\.\d)\s(?P<status>\d+)\s(?P<bytes_sent>\d+)\s\"(?P<http_referer>[^\"]+)\"\s\"(?P<http_user_agent>[^\"]+)\"\s(\S+)\s(\S+)
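
For comparison, here is a rough Python sketch of this expression tried against the tail of the sample line from the question. It is an adaptation, not the expression exactly as posted: it assumes the request is wrapped in double quotes as in the sample, uses the `(?P<name>...)` spelling throughout because Python's `re` requires it, and drops the two trailing unnamed groups because the sample has no extra fields after the user agent.

```python
import re

# Tail of the sample event from the question (after the bracketed prefix).
line = ('127.0.0.0 - - [04/Apr/2012:01:11:08 +0400] '
        '"GET /data HTTP/1.1" 200 380136 "-" "-"')

# Field names follow the post above; the quoting around the request
# and the omission of the two trailing groups are assumptions.
pattern = re.compile(
    r'(?P<remote_addr>[\d.]+)\s-\s(?P<remote_user>\S+)\s\[.+\]\s+'
    r'"(?P<request>.+\sHTTP/\d\.\d)"\s(?P<status>\d+)\s(?P<bytes_sent>\d+)\s'
    r'"(?P<http_referer>[^"]+)"\s"(?P<http_user_agent>[^"]+)"'
)

match = pattern.search(line)
fields = match.groupdict() if match else {}
print(fields)
```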
0 Karma

intachur
Explorer

At first I thought the same as you 8-) Nginx terms are sometimes strange; here is the quote from the Nginx docs:

$request_time - request processing time in seconds with a milliseconds resolution; time elapsed between the first bytes were read from the client and the log write after the last bytes were sent to the client

$msec - time in seconds with a milliseconds resolution at the time of log write
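
Taken together, the two definitions suggest that subtracting $request_time from $msec gives roughly the moment the first bytes of the request were read. A small Python sketch of that arithmetic with the values from the sample event (the variable names are illustrative, and `Decimal` is used only to avoid floating-point noise in the result):

```python
from decimal import Decimal

# Values taken from the sample event in the question.
msec = Decimal("1333487468.121")   # $msec: epoch time of the log write
request_time = Decimal("1.170")    # $request_time: processing time in seconds

# Approximate epoch time at which the first bytes were read from the client.
request_started = msec - request_time
print(request_started)
```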

0 Karma

kristian_kolb
Ultra Champion

Are you sure you haven't switched places between $msec and $request_time?

1333487468.121 looks like epoch time to me... and it matches the timestamp in the event rather well.
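
The epoch observation can be checked with a short Python sketch: it converts the second bracketed value from the sample event into a date in the event's own UTC offset and compares it with the logged $time_local. The variable names are illustrative, and parsing `%b` assumes an English (C) locale.

```python
from datetime import datetime

msec = 1333487468.121                      # second bracketed value in the sample
time_local = "04/Apr/2012:01:11:08 +0400"  # $time_local from the same event

# Parse the logged timestamp, keeping its +0400 offset.
logged = datetime.strptime(time_local, "%d/%b/%Y:%H:%M:%S %z")

# Interpret the numeric value as epoch seconds in the same offset.
from_epoch = datetime.fromtimestamp(msec, tz=logged.tzinfo)

# Compare at one-second resolution, since $time_local has no milliseconds.
same_second = from_epoch.replace(microsecond=0) == logged
print(from_epoch.isoformat(), same_second)
```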

/k

0 Karma

kristian_kolb
Ultra Champion

Have you tried the Interactive Field Extractor (IFX)?

It's a wizard that can help you generate your regex. In the search app, look next to the timestamp of an event; there is a small down-pointing arrow. Clicking it gives you the option to "Extract Fields".

For more information, see
http://docs.splunk.com/Documentation/Splunk/4.3.1/User/InteractiveFieldExtractionExample

Hope this helps,

Kristian

0 Karma

intachur
Explorer

Yes, I tried it, but it didn't give me what I need 😞

0 Karma