I need to get average license utilization per sourcetype and host for 30 days for a particular index and I was trying this, which I had got from answers.com but not sure whether this query is correct.
index=_internal source=*license_usage.log* type="Usage" idx="xxx" earliest=-30d@d latest=@d
| eval h=if(len(h)=0 OR isnull(h),"(SQUASHED)",h)
| eval s=if(len(s)=0 OR isnull(s),"(SQUASHED)",s)
| eval st=if(len(st)=0 OR isnull(st),"(UNKNOWN)",st)
| fields _time, pool, b,h,st
| bin _time span=1d
| stats sum(b) as b by _time, pool,h,st
| stats sum(b) AS volume by h, _time,st
| stats avg(volume) AS avgVolume max(volume) AS maxVolume by h,st
| eval avgVolumeGB=round(avgVolume/1024/1024/1024,3)
| eval maxVolumeGB=round(maxVolume/1024/1024/1024,3)
| fields h,st, avgVolumeGB, maxVolumeGB
| rename avgVolumeGB AS "average" maxVolumeGB AS "peak",st AS "sourcetype", h AS "hostname"
| sort -sourcetype, hostname
| head 10
Hi @Hemnaath,
If the number of distinct tuples (host, source, sourcetype, index) grows beyond a configurable threshold, the host and source values are automatically squashed. This threshold is 2000 default. In order to see all host and source values, you need to increase this setting. Please increase this value like
Hey I am getting the avg/peak license usage details per sourcetype but in case of hostname field value I am able to see the (SQUASHED) detail as the hostname not the exact hostname which is consuming more license.
The query works fine in my environment. I see host names rather than "(SQUASHED)".
Run the base query and verify the host name is as expected.
I ran the query in the deployment server still getting some of the hostname as SQUASHED.
index=_internal source=*license_usage.log* type="Usage" idx="xxx" earliest=-30d@d latest=@d
| eval h=if(len(h)=0 OR isnull(h),"(SQUASHED)",h)
| eval s=if(len(s)=0 OR isnull(s),"(SQUASHED)",s)
| eval st=if(len(st)=0 OR isnull(st),"(UNKNOWN)",st)
| fields _time, pool, b,h,st
| bin _time span=1d
| stats sum(b) as b by _time, pool,h,st
| stats sum(b) AS volume by h, _time,st
| stats avg(volume) AS avgVolume max(volume) AS maxVolume by h,st
| eval avgVolumeGB=round(avgVolume/1024/1024/1024,3)
| eval maxVolumeGB=round(maxVolume/1024/1024/1024,3)
| fields h,st, avgVolumeGB, maxVolumeGB
| rename avgVolumeGB AS "average" maxVolumeGB AS "peak",st AS "sourcetype", h AS "hostname"
| sort -average
I think I see the problems. First, you ran the query on the DS. Run it on a search head and see if you get better results. Second, you ran the entire query rather than just the base query (up to the first pipe | ).
I ran the base query in the splunk search head and found the list of hostname associated with the particular index name.
index=_internal source=*license_usage.log* type="Usage" idx="xxx" earliest=-30d@d latest=@d
| table _time, pool,b,h,st
But when I wanted to calculate the some of bytes consumed by host and sourcetype with _time I am able see some of the hostname field values are empty.
index=_internal source=*license_usage.log* type="Usage" idx="xxx" earliest=-30d@d latest=@d
| table _time, pool,b,h,st
| bin _time span=1d
| stats sum(b) as b by _time, pool,h,st
I'm not sure what to tell you. The license_usage log should have host names in it.
I don't see how the pool field adds any value. I believe the three stats commands can be reduced to two.
| stats sum(b) AS volume by _time,h,st
| stats avg(volume) AS avgVolume max(volume) AS maxVolume by h,st
I expected a query like this to sort the results by volume rather than alphabetically, especially if showing only 10 results.