Splunk Search

How to sum by combination of specific fields?

ahofmann
Explorer

I have an index of access logs, and I want to see how many download events with a specific combination of 'ip', 'filename', 'date_mday', 'date_month', and 'date_year' exceed 1000 'bytes' in total.

The following query gives me believable counts:

index=logs sourcetype=logs
| stats sum(Bytes) as TotalBytes by ip, filename, date_mday, date_month, date_year
| where TotalBytes > 1000
| stats count by filename

but it seems like I should be using eventstats, like this:

index=logs sourcetype=logs
| eventstats sum(Bytes) as TotalBytes by ip, filename, date_mday, date_month, date_year
| where TotalBytes > 1000
| stats count by filename

but whenever I do this, it gives me a much smaller number for each filename. I eventually want to take the TotalBytes of these downloads and work out how many minutes of content are downloaded using each file's bitrate, so it's important that TotalBytes is correct. Why is it more appropriate to use stats than eventstats?

somesoni2
SplunkTrust

How many rows does your base search have? The eventstats command has limits on memory usage and maximum result rows (see limits.conf and search for eventstats), so that might explain the incorrect results if there is a high number of events to process. For your scenario, your first implementation, using stats, is the correct and optimal method.
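
Separately from any limits, the two commands produce different row sets: stats returns one row per combination of the by fields, while eventstats keeps every event and only adds the aggregate as a new field. The following is a minimal sketch on synthetic data, not the original index; the field values, the helper field n, and the event counts are assumptions made up for illustration. It builds five events of 400 bytes each, three for a.mp4 and two for b.mp4, all from the same ip on the same day.

```spl
| makeresults count=5
| streamstats count as n
| eval ip="10.0.0.1", date_mday=1, date_month="january", date_year=2024
| eval filename=if(n<=3, "a.mp4", "b.mp4"), Bytes=400
| eventstats sum(Bytes) as TotalBytes by ip, filename, date_mday, date_month, date_year
| where TotalBytes > 1000
| stats count by filename
```

With eventstats, each of the three a.mp4 events carries TotalBytes=1200 and survives the where clause, so the final count for a.mp4 is 3 (events). Replacing eventstats with stats on the fifth line collapses those events into a single row first, so the final count is 1 (downloads). The b.mp4 group totals 800 bytes and is filtered out either way.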

logloganathan
Motivator

Please use this query; you will get the result:

index=logs sourcetype=logs
| eventstats sum(Bytes) as TotalBytes by Bytes,ip, filename, date_mday, date_month, date_year
| where TotalBytes > 1000
| stats count by filename

Please provide your response

ahofmann
Explorer

Unfortunately, this gave me far too high a count. I believe this is because it creates a new TotalBytes for each distinct Bytes value within the same download. It doesn't group the requests of one download by the shared fields; it just inflates the count by roughly the number of download requests it takes to complete one unique download.
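
The effect of putting Bytes in the by clause can be seen in a small sketch on synthetic data. The values and the helper field n are assumptions for illustration only: one download made of three requests, two of 600 bytes and one of 700 bytes.

```spl
| makeresults count=3
| streamstats count as n
| eval ip="10.0.0.1", filename="a.mp4", Bytes=if(n<=2, 600, 700)
| eventstats sum(Bytes) as TotalBytes by Bytes, ip, filename
| table n, Bytes, TotalBytes
```

The two 600-byte requests form one group with TotalBytes=1200, and the 700-byte request forms its own group with TotalBytes=700. The true total for the download, 1900 bytes, is never computed, because events are only summed with other events that share the same Bytes value.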

ahofmann
Explorer

It has hundreds of thousands of rows, so that actually would make sense. Thank you!

niketn
Legend

@ahofmann, I have converted @somesoni2's comment to an answer. Please accept it to mark this question as answered!


ahofmann
Explorer

@niketnilay, I am reviewing @logloganathan's answer and will mark the one that worked best as the answer! Thanks

niketn
Legend

@ahofmann, I think @somesoni2's point was that eventstats is a resource-intensive command. If you can achieve the same results with stats, then you should use stats!
