Solved: Looking for a License Usage query to generate external scripted alerts with...

awurster · ‎07-05-2015

hi guys, i'm looking for help with license usage.

i'm trying to troubleshoot a license violation we had recently where some of our warnings went unnoticed, because they had no context and were running at the wrong time of day. i want to rewrite our searches to be more robust / contextual, and the DMC searches either don't cut it for us, or i'm not sure how to read / use them effectively.

honestly, license usage in Splunk is a bit of a mess right now (IMHO): some searches use rest, some use "dmc", and others still use the "internal" index with the license logs... so i'm a bit lost. there are also a lot of time modifiers and joined searches in the "stock" reports, which seem like overkill or are just confusing me entirely, so i can't get what i need.

that said, i got close with the following search, but the calculated eval fields aren't showing up. earliest is -31d@d and latest is @d

index=_internal source=*license_usage.log type="RolloverSummary"
  | bin _time span=1d
  | stats sum(b) AS used max(stacksz) AS quota by _time
  | where used > quota
  | eval usedGB=round(used/1024/1024/1024,3) 
  | eval quotaGB=round(quota/1024/1024/1024,3)
  | table _time usedGB quotaGB
  | eval percentage=round(usedGB / totalGB, 1) * 100
  | eval usage = usedGB . " (" . percentage . "%)"
  | fields _time usedGB quotaGB usage
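
a likely reason the calculated fields come back empty in the search above: percentage is computed from totalGB, which is never created (the search only defines quotaGB). in SPL an eval that references a missing field returns null, and concatenating null with "." is also null, so both percentage and usage end up blank. a minimal sketch of the same arithmetic on a single made-up row (the makeresults row and the 120 GB / 100 GB figures are assumptions for illustration, not values from any real license log):

```spl
| makeresults
| eval used=120*1024*1024*1024, quota=100*1024*1024*1024
| eval usedGB=round(used/1024/1024/1024,3)
| eval quotaGB=round(quota/1024/1024/1024,3)
| eval percentage=round(usedGB / quotaGB * 100, 1)
| eval usage = usedGB . " (" . percentage . "%)"
| table usedGB quotaGB percentage usage
```

swapping quotaGB in for totalGB in the original search should be enough to get percentage and usage back. multiplying by 100 before rounding also keeps one decimal place of the percentage instead of rounding the ratio itself.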

i want to have two flavors for my alerts:

a violation "notice" when between 1 and 3 violations have occurred (we can group them on the receiving end)
an "outage" alert when an outage is about to happen / has happened, with between 4 and 5 violations

both reports should be pretty similar, so i'd like to meet the following requirements for both:

run as close to the "splunk" calculations as possible (should i run this at 12:30 AM, 1 AM, etc?), to avoid a lag or incorrect calculation
include instance name (i.e. search head name)
include pool / stack size
include a usage percentage (i.e. used GB / quota GB)
have a count tally over the 30 day period to see how many violations i've had in that rolling period

awurster · ‎08-04-2015

OK so i've broken this into a few different alerts to get what i need. hopefully as we scale out to using different license pools, this will translate nicely.

1 - a twice hourly warning if at any point in the day, we go above 90% utilisation.

| rest splunk_server_group=dmc_group_license_master /services/licenser/pools
| join type=outer stack_id splunk_server [rest splunk_server_group=dmc_group_license_master /services/licenser/groups | search is_active=1 | eval stack_id=stack_ids | fields splunk_server stack_id is_active] 
| search is_active=1 
| fields splunk_server, stack_id, used_bytes 
| join type=outer stack_id splunk_server [rest splunk_server_group=dmc_group_license_master /services/licenser/stacks | eval stack_id=title | eval stack_quota=quota | fields splunk_server stack_id stack_quota] 
| stats sum(used_bytes) as used_bytes max(stack_quota) as stack_quota by splunk_server 
| eval usedGB=round(used_bytes/1024/1024/1024,1) 
| eval totalGB=round(stack_quota/1024/1024/1024,1) 
| eval percentage=round(usedGB / totalGB, 3)*100 
| fields splunk_server, stack_id, percentage, usedGB, totalGB 
| where percentage > 90 
| rename splunk_server AS Instance, percentage AS "License quota used (%)", usedGB AS "License quota used (GB)", totalGB as "Total license quota (GB)"

2 - a once daily warning for between 1 and 4 license violations in the rolling 30-day window (note the search runs over "all time" because this log is only kept for about 30 days anyhow)

index=_internal source=*license_usage.log type="RolloverSummary"
  | bin _time span=1d
  | convert timeformat="%F" ctime(_time) AS date
  | stats sum(b) AS used max(stacksz) AS quota by date, pool, stack
  | eval usedGB=round(used/1024/1024/1024,3) 
  | eval quotaGB=round(quota/1024/1024/1024,3)
  | eval usedPct = round(usedGB / quotaGB * 100, 1)
  | where usedPct > 60
  | eval violation_id=1
  | eval usage = usedGB . " (" . usedPct . "%)"
  | streamstats global=f sum(violation_id) AS violations
  | fields date stack pool usedGB quotaGB usage violations
  | rename usedGB AS "used", quotaGB AS "quota"

3 - a final alert (slightly different title and severity) for the 4th (and 5th if you get there) violations. same code as above, but different counts.

as for points num 2 and 3... i can add a tail at the end to just grab the last line, so that in my alert system, i see the alerts come in one at a time. like:

...
  | fields date stack pool usedGB quotaGB usage violations
  | tail 1
  | rename usedGB AS "used", quotaGB AS "quota"
...
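
for alerts 2 and 3 above, one way to keep both on the same search is to label each row with a tier from the running count and then filter per alert. a rough sketch, assuming the tiers from the original question (1 to 3 violations is a "notice", 4 or more is an "outage"). the makeresults rows just stand in for violation days, and the severity field name is made up for illustration:

```spl
| makeresults count=5
| eval violation_id=1
| streamstats sum(violation_id) AS violations
| eval severity=case(violations>=4, "outage", violations>=1, "notice")
| table violations severity
```

each alert could then add a where on severity (for example where severity="outage") ahead of the tail 1, so the two alerts differ only in that one filter and in their title / severity settings.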
