I get different results when I run the following searches:
index=_internal source=*license_usage.log type="RolloverSummary" earliest=-1d@d latest=-0d@d | bin _time span=1d | stats sum(b) AS volumeB by _time
This gives me 50 GB (the value is reported in bytes).
index=_internal source=*license_usage.log* type=Usage earliest=-1d@d latest=-0d@d | eval s=if(s=="","unknown",s) | eval h=if(h=="","unknown",h) | bucket _time span=1d | stats sum(b) AS volume_b
This gives me 90 GB (again, the value is reported in bytes).
Where did the remaining 40 GB of data go?
Some points that may help in answering:
1. I restarted Splunk on the universal forwarders. For the next 20 minutes after that,
per_index_thruput in metrics.log showed spikes.
2. When I search my raw index for the period with the spikes in
per_index_thruput, I do not see any duplicate log events.
So where did the log events go?
My daily usage is only 50 GB, so why does type=Usage give me 90 GB?
Your first query uses "source=*license_usage.log".
Your second uses "source=*license_usage.log*".
Was this intentional? Did you mean for the second search to also cover the rotated files,
i.e. license_usage.log, license_usage.log.1, license_usage.log.2, etc.?
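One way to check whether the trailing wildcard matters is to split the type=Usage total by source. This is only a sketch: it reuses the time range and the b field from the question, and volume_gb is an added convenience column (bytes divided by 1024^3).

```spl
index=_internal source=*license_usage.log* type=Usage earliest=-1d@d latest=@d
| stats count sum(b) AS volume_b by source
| eval volume_gb=round(volume_b/pow(1024,3), 2)
```

Any source value in the result that does not end in license_usage.log is matched only by the second search, so its volume is included in the second total but not in the first.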
This is an older question, but the "which type counts what" topic is still current. Try the following two searches as examples:
type=Usage - data from the day before yesterday
index=_internal source=/opt/splunk/var/log/splunk/license_usage.log type=Usage earliest=-2d@d latest=-1d@d | stats sum(b)
type=RolloverSummary - events logged yesterday, which hold the aggregated usage of the day before yesterday
index=_internal source=/opt/splunk/var/log/splunk/license_usage.log type=RolloverSummary earliest=-1d@d latest=@d | stats sum(b)
The result is exactly the same, at least on my test system. The license usage of one day is aggregated at midnight and added to the
_internal index. Unfortunately, I don't know the technical details and I have no documentation link.
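The two searches can also be combined into one to compare the totals side by side. The following is a rough sketch, not a verified report: it assumes, as described above, that each RolloverSummary event is written shortly after midnight and describes the previous day, so its timestamp is shifted back by one day before grouping. The field name day and the three-day range are chosen here for illustration only.

```spl
index=_internal source=*license_usage.log (type=Usage OR type=RolloverSummary) earliest=-3d@d latest=@d
| eval day=strftime(if(type=="RolloverSummary", relative_time(_time, "-1d@d"), relative_time(_time, "@d")), "%Y-%m-%d")
| chart sum(b) over day by type
```

The oldest and the newest day in the result will usually show only one of the two types, because the matching events of the other type fall outside the time range. The days in between are the ones to compare.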
You can match the data of both types like the following:
type=Usage    type=RolloverSummary
===================================
pool          pool     (the license pool)
i             slave    (the GUID of the indexer as defined in $SPLUNK_HOME/etc/instance.cfg)
b             b        (the byte count - summed up for the previous day for the RolloverSummary type)
type=Usage has some more columns to investigate or differentiate:
h      the host sending the data (e.g. a UF)
s      the source of the data (e.g. /var/log/messages or XmlWinEventLog:Application)
st     the sourcetype of the data (e.g. syslog or XmlWinEventLog)
idx    the index where the data is ingested
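These extra fields can be used to see where the type=Usage volume comes from. A minimal sketch, assuming the same source pattern as the first search in the question. Grouping by idx and st is just one possible choice, and h or s can be used in the same way. volume_gb is an added convenience column (bytes divided by 1024^3).

```spl
index=_internal source=*license_usage.log type=Usage earliest=-1d@d latest=@d
| stats sum(b) AS volume_b by idx st
| eval volume_gb=round(volume_b/pow(1024,3), 2)
| sort - volume_b
```

The largest index and sourcetype combinations appear at the top, which makes it easier to see which data accounts for an unexpectedly high total.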