First off, before I even ask, let me state that using Splunk on Splunk is not a solution for us as we are trying to produce a report using the Python SDK, run some transformations on the data received and email it out.
Now that the default answer to this question is out of the picture, I am seeing some unusual differences between running two types of indexing queries:
(1) index="_internal" source="*metrics.log" per_host_thruput | stats sum(kb) by series
(2) index=_internal type=Usage st!=splunk_metrics | stats sum(b) by h
The first query produces almost double the total indexing of the second one (yes, I did the conversion, because the first one is in KB and the second is in bytes). I am also seeing hosts in the results of the first query that do not appear in the second, and vice versa.
Can anyone clarify the differences in these two queries?
Should I expect the amount of indexing returned from them to be the same? If not, why shouldn't I?
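For anyone making the same comparison in Python after pulling both result sets, here is a minimal sketch of the unit conversion and the host-by-host diff. It is only an illustration under assumptions: the rows are assumed to arrive as dicts keyed by the field names the two stats commands produce (series / sum(kb) and h / sum(b)), 1 KB is assumed to mean 1024 bytes, and the sample numbers are made up.

```python
KB = 1024  # assumption: "kb" in metrics.log means 1024 bytes


def to_bytes_by_host(rows, host_field, value_field, scale=1):
    """Collapse result rows into {host: bytes}, applying a unit scale."""
    totals = {}
    for row in rows:
        host = row[host_field]
        totals[host] = totals.get(host, 0.0) + float(row[value_field]) * scale
    return totals


def compare(thruput, usage):
    """Return both totals and the hosts that appear on only one side."""
    return {
        "thruput_total": sum(thruput.values()),
        "usage_total": sum(usage.values()),
        "only_in_thruput": sorted(set(thruput) - set(usage)),
        "only_in_usage": sorted(set(usage) - set(thruput)),
    }


# Made-up sample rows standing in for the output of queries (1) and (2).
q1_rows = [
    {"series": "web01", "sum(kb)": "2048"},
    {"series": "idx01", "sum(kb)": "512"},
]
q2_rows = [
    {"h": "web01", "sum(b)": "1048576"},
    {"h": "db01", "sum(b)": "524288"},
]

thruput = to_bytes_by_host(q1_rows, "series", "sum(kb)", scale=KB)
usage = to_bytes_by_host(q2_rows, "h", "sum(b)")
report = compare(thruput, usage)
print(report)
```

Listing the hosts that show up on only one side makes it easier to see whether the gap comes from a few hosts or is spread across all of them.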
Hi EricLloyd79,
these two numbers can be the same, but they can also be very different. You can read about the thruput messages here: https://docs.splunk.com/Documentation/Splunk/latest/Troubleshooting/Aboutmetricslog#Thruput_messages and about the license usage here: https://docs.splunk.com/Documentation/Splunk/latest/Admin/HowSplunklicensingworks#How_data_is_metere...
If that is TL;DR: the thruput shows how much data (plus protocol overhead) was sent by a Splunk instance, and the license usage shows how much data was actually indexed in the end. As I said, these numbers can be almost the same, or differ a lot if you do event transformation or filtering, for example.
Hope this makes sense ...
cheers, MuS
Also be aware of the difference between field!=value and NOT field=value. On rare occasions, the difference can trip you up:
https://docs.splunk.com/Documentation/Splunk/7.2.6/Search/NOTexpressions
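To make the distinction concrete: field!=value only matches events where the field exists and has a different value, while NOT field=value also matches events that do not have the field at all. The sketch below simulates that in plain Python over made-up events; it is not how Splunk implements the search, and the helper names are hypothetical. It also ignores multivalue fields.

```python
def neq(event, field, value):
    """field!=value: the field must exist and hold a different value."""
    return field in event and event[field] != value


def not_eq(event, field, value):
    """NOT field=value: keep everything except events where field equals value."""
    return event.get(field) != value


# Made-up events; the last one has no "st" field at all.
events = [
    {"h": "web01", "st": "access_combined"},
    {"h": "idx01", "st": "splunk_metrics"},
    {"h": "db01"},
]

kept_neq = [e["h"] for e in events if neq(e, "st", "splunk_metrics")]
kept_not = [e["h"] for e in events if not_eq(e, "st", "splunk_metrics")]

print(kept_neq)
print(kept_not)
```

In this sample, the event without an st field is dropped by the != form and kept by the NOT form, so the two forms only give different totals if some events are missing the field.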
Thank you! That makes sense now! I see hosts in the first query that don't index anything we use other than Splunk metrics, while the second query always returns the hosts that actually index data we use. Perfect, thank you, now I know which one to use.
Sadly, for some reason, we have historical indexing data for the first query going back 30 days, but for the second query it only goes back 12 days.