Hi,
I get different results when I run the following searches:
index=_internal source=*license_usage.log type="RolloverSummary" earliest=-1d@d latest=-0d@d | bin _time span=1d | stats sum(b) AS volumeB by _time
gives me 50 GB (of course in bytes though).
index=_internal source=*license_usage.log* type=Usage earliest=-1d@d latest=-0d@d | eval s=if(s=="","unknown",s) | eval h=if(h=="","unknown",h) | bucket _time span=1d | stats sum(b) AS volume_b
gives me 90 GB (of course in bytes though)
Where did the remaining 40 GB data go?
Some points that may help in answering:
1. I restarted Splunk on the universal forwarders. For the next 20 minutes after that, per_index_thruput in metrics.log shows spikes.
2. When I search my raw index for the period in which per_index_thruput spikes, I do not see any duplicate log events.
So where did the log events go?
My daily usage is only 50 GB, so why does type=Usage give me 90 GB?
Thanks,
Strive
Yes, @strive, can you please tell me why you used source=*license_usage.log in one search and source=*license_usage.log* in the other? I know it makes a difference, but can you explain why you used them differently in this specific case?
Also, can someone please explain the difference between type=RolloverSummary and type=Usage, along with addressing @strive's initial question?
Thank you so much.
This is an older question, but the "which type counts what" topic is still relevant. Try the following two searches as examples:
type=Usage
- data from the day before yesterday
index=_internal source=/opt/splunk/var/log/splunk/license_usage.log type=Usage earliest=-2d@d latest=-1d@d
| stats sum(b)
type=RolloverSummary
- data from yesterday, which is the aggregated data of the day before yesterday
index=_internal source=/opt/splunk/var/log/splunk/license_usage.log type=RolloverSummary earliest=-1d@d latest=@d
| stats sum(b)
The result is exactly the same, at least on my test system. The license usage of one day is aggregated at midnight and added to the _internal index as a RolloverSummary event, so that event's timestamp falls on the day after the usage it describes. Unfortunately, I don't know the technical details and I have no documentation link.
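If you want to check this in a single search instead of two, something like the following might work. It is an untested sketch: the names usage_day, bytes and GB are my own, and it assumes the RolloverSummary event is timestamped shortly after midnight, so shifting it back one day gives the day it summarizes.

```
index=_internal source=*license_usage.log* (type=Usage OR type=RolloverSummary) earliest=-2d@d latest=@d
| eval usage_day=if(type=="RolloverSummary", relative_time(_time, "-1d@d"), relative_time(_time, "@d"))
| where usage_day==relative_time(now(), "-2d@d")
| stats sum(b) AS bytes by type
| eval GB=round(bytes/1024/1024/1024, 2)
```

If the assumption holds, both rows should show roughly the same volume for the day before yesterday.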
You can match the fields of both types as follows:
type=Usage   type=RolloverSummary   meaning
=============================================
pool         pool                   the license pool
i            slave                  the GUID of the indexer, as defined in $SPLUNK_HOME/etc/instance.cfg
b            b                      the byte count (for type=RolloverSummary, summed up for the previous day)
Additionally, type=Usage has some more fields to investigate or differentiate by:
h     the host sending the data (e.g. a UF)
s     the source of the data (e.g. /var/log/messages or XmlWinEventLog:Application)
st    the sourcetype of the data (e.g. syslog or XmlWinEventLog)
idx   the index where the data is ingested
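As a rough example of using those extra fields, the search below breaks yesterday's type=Usage volume down by index and sourcetype. Treat it as a sketch: bytes and GB are names I made up, and the wildcard on source is an assumption about which files you want included.

```
index=_internal source=*license_usage.log* type=Usage earliest=-1d@d latest=@d
| stats sum(b) AS bytes by idx, st
| eval GB=round(bytes/1024/1024/1024, 2)
| sort - GB
```

Swapping idx, st for h or s gives the same breakdown by host or source. Those two fields can be empty, which is why the original search maps empty values to "unknown".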
Your first query uses source=*license_usage.log
Your second uses source=*license_usage.log*
Was this intentional? Did you mean to search:
1. license_usage.log only
2. license_usage.log, license_usage.log.1, license_usage.log.2, etc.
3. Something else
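One way to see which files the wildcard actually picks up is to split the result by source. This is only a sketch (bytes is my own field name), but it should show whether rotated files contribute to the total:

```
index=_internal source=*license_usage.log* type=Usage earliest=-1d@d latest=@d
| stats count, sum(b) AS bytes by source
```

If only one source comes back, the two source patterns match the same file for that time range and the difference must come from somewhere else.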