Having a hard time getting accurate license usage when data is squashed. Is there a way to ensure an accurate measurement?
The numbers do not match for the Windows data collected into the perfmon index. Reports show only a few megabytes per host, but the index is 10 GB daily. I assume the inaccuracy is caused by the squashing of the license usage data. Is there any way around this?
I found another post that suggests using throughput as a check, but it seems you have to divide that number by 3 as an approximation, and it is still not an accurate measurement of license usage.
Throughput check one host
index="_internal" source="*metrics.log" group="per_host_thruput" host=HOSTHERE | chart sum(kb) by series
License usage for one host
index=_internal source="*license_usage.log" type=usage h=HOSTHERE | eval MB = round(b/1048576,2) | eval st_idx = st.": ".idx | timechart span=1d sum(MB) by st_idx | addtotals
Even if you hit the squash_threshold, your license usage by sourcetype, index, and indexer will still be accurate. Only source and host are dropped from the reporting dimensions. (Depending on your index / indexer layout, this may be sufficient for your needs.)
Searching for per_X_thruput has its own caveats, since it only reports the top 10 series per reporting period. With host=HOSTHERE you are getting the thruput from your forwarder's own metrics.log, so it includes any data that is later dropped on the indexer. With host=INDEXER* series=HOSTHERE, the host will only be included if HOSTHERE is in the top 10 hosts for that reporting period. Both of these also include _internal logs, which do not count toward your license.
A brute force method is of course:
index=* host=HOSTHERE sourcetype!=stash | eval size=len(_raw) | stats sum(size) by sourcetype index
This query kinda sucks when you have a lot of events, and it makes two assumptions:
1) Your event timestamp (_time) is being parsed properly and is close enough to your _indextime, since license usage is measured by _indextime. Otherwise you need to search all time (which is awful for sizable indexes) and use _index_earliest and _index_latest to constrain by index time.
2) All your events use characters that are represented by a single byte in UTF-8. len gives you the number of characters in a string, not bytes, so this method gives you a lower bound on license usage.
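To see why counting characters only gives a lower bound, here is a minimal Python sketch (not Splunk code) that compares character counts with UTF-8 byte counts. The sample events are made up for illustration, and it assumes, as the answer above does, that usage is counted in bytes of UTF-8 text.

```python
# Hypothetical sample events, invented for illustration only.
events = [
    "CPU usage=42%",              # pure ASCII: 1 byte per character
    "Temp\u00e9rature=21\u00b0C",  # e-acute and degree sign: 2 bytes each in UTF-8
    "cost=5\u20ac",               # euro sign: 3 bytes in UTF-8
]


def char_count(raw: str) -> int:
    """Number of characters, analogous to what len(_raw) returns."""
    return len(raw)


def utf8_byte_count(raw: str) -> int:
    """Number of bytes the same text occupies when encoded as UTF-8."""
    return len(raw.encode("utf-8"))


total_chars = sum(char_count(e) for e in events)
total_bytes = sum(utf8_byte_count(e) for e in events)

print("characters:", total_chars)
print("utf-8 bytes:", total_bytes)
```

For pure ASCII events the two totals are identical. Any multi-byte character pushes the byte total above the character total, which is the gap the len-based query cannot see.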