Licensing alert (and prediction) when usage breaches 90% of the overall license

anandhalagaras1
Path Finder

Hi Team,

We have deployed Splunk Cloud in our environment. We opted for a 300 GB/day license, and so far we are using approximately 250 to 270 GB per day.

Is it possible to predict license usage for the coming days and months?

Is there a search query or an app that forecasts license usage based on current trends?

If so, please help.


soutamo
SplunkTrust

Hi,

You can use this search even if you have several license masters.

--8<--
index=_internal (host=LIC_SRV_1 OR host=LIC_SRV_2) source=*license_usage.log* type=RolloverSummary
| bin _time span=1d
| stats latest(b) AS b by slave, pool, _time
| timechart span=1d sum(b) AS "volume" fixedrange=false
| join type=outer _time
    [ search index=_internal (host=LIC_SRV_1 OR host=LIC_SRV_2) source=*license_usage.log* type=RolloverSummary
    | bin _time span=1d
    | stats latest(stacksz) AS stacksz by _time, host
    | stats sum(stacksz) AS stacksz by _time
    | stats latest(stacksz) AS stacksz by _time ]
| fields - _timediff
| streamstats time_window=120d p89(volume) as runningVol max(volume) as maxVol
| where volume > 0
| predict runningVol maxVol volume future_timespan=120 algorithm=LLP5
| foreach "*"
    [ eval <<FIELD>>=round('<<FIELD>>'/1024/1024/1024, 3)]
--8<--

Run it over, for example, the last year, ending at the beginning of today (not now).

It predicts based on:
- current license usage, taken as the 89th percentile (p89) of daily volume, i.e. the level that 89% of days stay under (allowing at most 4 overages of the license). You must find a suitable percentile for your own usage profile.
- maximum usage

This is modified from the license usage calculation in Splunk's Monitoring Console (MC).
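
To get a feel for the percentile choice, here is a small Python sketch (not part of the search itself). It only does the arithmetic: in a window of N days, roughly N * (1 - p/100) days lie above the p-th percentile. The function names and the sample volumes are made up for illustration, and Splunk's own percentile calculation may differ slightly.

```python
def expected_days_above(percentile, window_days):
    """Rough number of days in a window whose volume lies above the given percentile."""
    return window_days * (1 - percentile / 100)


def count_days_over(daily_gb, license_gb):
    """Count days whose indexed volume exceeds the daily license size."""
    return sum(1 for volume in daily_gb if volume > license_gb)


# Hypothetical daily volumes in GB; 300 GB/day is the license size from the question.
sample_daily_gb = [250, 262, 270, 305, 258, 311, 266]

if __name__ == "__main__":
    # Roughly 3.3 days of a 30-day window sit above p89, which is under 4.
    print(round(expected_days_above(89, 30), 1))
    # Number of hypothetical days above the 300 GB license.
    print(count_days_over(sample_daily_gb, 300))
```

If your usage profile is spikier, a lower percentile leaves more days above the line, so adjust it until the expected number of days above stays within what you can tolerate.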

r. Ismo



anandhalagaras1
Path Finder

@soutamo,

When I ran the query from 18 Feb 2019 to 18 Feb 2020, it threw the error: command="predict", No data

Please help with this.

soutamo
SplunkTrust

Hi, this assumes that you have that data in the _internal index. If the retention time for that index is shorter than your search range, this won't work. Splunk's default retention for it is quite short, if I recall correctly.

Check whether you have the data:

index=_internal source=*license_usage.log* type=RolloverSummary earliest=-2y latest=now
| timechart span=1mon count

Another place to check is the MC, under Indexes and Volumes.

r. Ismo

soutamo
SplunkTrust

What does this query show?

index=_internal source=*license_usage.log type=RolloverSummary earliest=-1y latest=now
| timechart span=1mon count


anandhalagaras1
Path Finder

This query shows a count per month, from last year through this month.

soutamo
SplunkTrust

If you don't know your LM, just remove "(host=LIC_SRV_1 OR host=LIC_SRV_2)" from the query above (in both the main search and the subsearch). Then it should work.

 index=_internal  source=*license_usage.log* type=RolloverSummary 
 | bin _time span=1d 
 | stats latest(b) AS b by slave, pool, _time 
 | timechart span=1d sum(b) AS "volume" fixedrange=false 
 | join type=outer _time 
     [ search index=_internal source=*license_usage.log* type=RolloverSummary 
     | bin _time span=1d 
     | stats latest(stacksz) AS stacksz by _time, host 
     | stats sum(stacksz) AS stacksz by _time 
     | stats latest(stacksz) AS stacksz by _time ] 
 | fields - _timediff 
 | streamstats time_window=30d p89(volume) as runningVol max(volume) as maxVol 
 | where volume > 0 
 | predict runningVol maxVol volume future_timespan=120 algorithm=LLP5 
 | foreach "*" 
     [ eval <<FIELD>>=round('<<FIELD>>'/1024/1024/1024, 3)]

anandhalagaras1
Path Finder

@soutamo

I ran the query below and got no results for All time, Last 30 days, or Last 24 hours.

index=_internal source=license_usage.log type=RolloverSummary earliest=-2y latest=now
| timechart span=1mon count

So how can I check the prediction for the future?

soutamo
SplunkTrust

Then you probably don't have access to the _internal index? Without current data you cannot predict anything with this query.

r. Ismo


anandhalagaras1
Path Finder

When I search _internal I can see the logs. Please note that I am the admin of our Splunk Cloud, so I do have access to them.


anandhalagaras1
Path Finder

I have corrected the query and can now see the logs.

index=_internal source=*license_usage.log type=RolloverSummary earliest=-2y latest=now
| timechart span=1mon count

I ran it for All time; the data is split by month and shown as counts.

So how can I predict future usage?


anandhalagaras1
Path Finder

I have updated the query with * in license_usage.log, but it still throws the error: command="predict", No data

index=_internal (host=LIC_SRV_1 OR host=LIC_SRV_2) source=*license_usage.log type=RolloverSummary
| bin _time span=1d
| stats latest(b) AS b by slave, pool, _time
| timechart span=1d sum(b) AS "volume" fixedrange=false
| join type=outer _time
[ search index=_internal (host=LIC_SRV_1 OR host=LIC_SRV_2) source=*license_usage.log type=RolloverSummary
| bin _time span=1d
| stats latest(stacksz) AS stacksz by _time, host
| stats sum(stacksz) AS stacksz by _time
| stats latest(stacksz) AS stacksz by _time ]
| fields - _timediff
| streamstats time_window=120d p89(volume) as runningVol max(volume) as maxVol
| where volume > 0
| predict runningVol maxVol volume future_timespan=120 algorithm=LLP5
| foreach "*"
[ eval <<FIELD>>=round('<<FIELD>>'/1024/1024/1024, 3)]

Please help.

soutamo
SplunkTrust

Have you changed LIC_SRV_1 and LIC_SRV_2 to your current license master? If you have only one LM, just remove LIC_SRV_2 from the main search and the subsearch.

Aah, it seems that this editor removed a couple of * characters from that query 😞

index=_internal (host=LIC_SRV_1 OR host=LIC_SRV_2) source=*license_usage.log* type=RolloverSummary 
| bin _time span=1d 
| stats latest(b) AS b by slave, pool, _time 
| timechart span=1d sum(b) AS "volume" fixedrange=false 
| join type=outer _time 
    [ search index=_internal (host=LIC_SRV_1 OR host=LIC_SRV_2) source=*license_usage.log* type=RolloverSummary 
    | bin _time span=1d 
    | stats latest(stacksz) AS stacksz by _time, host 
    | stats sum(stacksz) AS stacksz by _time 
    | stats latest(stacksz) AS stacksz by _time ] 
| fields - _timediff 
| streamstats time_window=30d p89(volume) as runningVol max(volume) as maxVol 
| where volume > 0 
| predict runningVol maxVol volume future_timespan=120 algorithm=LLP5 
| foreach "*" 
    [ eval <<FIELD>>=round('<<FIELD>>'/1024/1024/1024, 3)]

anandhalagaras1
Path Finder

Hi Ismo,

Thank you. After I replaced the host name with our cluster master name, I can see some results. I ran the query from April 2019 through yesterday, Feb 18 2020, and I see around 186 rows on the Statistics tab and 500+ events.

The Statistics tab shows a table with the following fields:

_time: one row in April 2019, one in July 2019 and one in Aug 2019; then rows from Dec 20th 2019 until today.

volume: the daily license usage to date.

Can you explain how the fields below are calculated, and also explain the LLP5 algorithm?

lower95(prediction(maxVol))
lower95(prediction(runningVol))
lower95(prediction(volume))
maxVol
prediction(maxVol)
prediction(runningVol)
prediction(volume)
runningVol
stacksz
upper95(prediction(maxVol))
upper95(prediction(runningVol))
upper95(prediction(volume))

soutamo
SplunkTrust

Hi

The easiest way to look at this is the Visualization tab, which shows current and predicted values graphically. For more detail, use the Statistics tab.

Most of those fields are described at https://docs.splunk.com/Documentation/Splunk/8.0.2/SearchReference/Predict.

maxVol - maximum daily indexed volume within the streamstats time_window
runningVol - 89th percentile (p89) of daily indexed volume within the same window
stacksz - current Splunk license size per day

If you need more information about those prediction algorithms, Wikipedia, for example, can help.
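
On units: the last step of the search (the foreach with round(.../1024/1024/1024, 3)) converts every field from bytes to GB, so the values in the table are in GB. A minimal Python sketch of the same conversion, with a made-up function name:

```python
BYTES_PER_GB = 1024 ** 3


def bytes_to_gb(value_bytes):
    """Divide by 1024 three times and round to 3 decimals, like the eval in the search."""
    return round(value_bytes / 1024 / 1024 / 1024, 3)


if __name__ == "__main__":
    # A 300 GB/day license (the size from the question) expressed in bytes.
    print(bytes_to_gb(300 * BYTES_PER_GB))
    # A hypothetical day with 1.5 GB indexed.
    print(bytes_to_gb(1610612736))
```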

r. Ismo


anandhalagaras1
Path Finder

Thank you, Ismo.

The link you provided does not seem to work. I will search for it as you suggested.

soutamo
SplunkTrust

Please drop the trailing dot (.) from the link, then it works.

soutamo
SplunkTrust

Predict needs enough existing data points ("buckets") to make a prediction. Please check the documentation and then adjust the values I used based on how much data you currently have.

t. Ismo


nickhills
Ultra Champion

To configure your own alert based on your licence usage, you can use a search like this (borrowed from the licence reports):

(index=_internal source=*license_usage.log* type="RolloverSummary") 
| bin _time span=1d 
| stats latest(b) AS b latest(stacksz) AS stacksz by slave, pool, _time 
| stats sum(b) AS volumeB max(stacksz) AS stacksz by _time 
| eval pctused=round(((volumeB / stacksz) * 100),2) 
| timechart span=1d max(pctused) AS "used" fixedrange=false
| where used>90

This search produces a result for each date on which usage exceeded the threshold (used>90).

That is based on your historic data, which is fine if you want to be told that you breached your threshold yesterday.
If you want to know (in advance) that you are likely to breach it today, see this post:
https://answers.splunk.com/answers/35926/email-actions-for-builtin-licensing-alerts.html#3593
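
For illustration, here is the pctused arithmetic from the search as a small Python sketch, using the 300 GB/day license and the 250 to 270 GB daily usage from the question. The function names and the 285 GB day are made up.

```python
def pct_used(volume, stack_size):
    """Percentage of the daily license used, rounded to 2 decimals (same unit for both arguments)."""
    return round((volume / stack_size) * 100, 2)


def breaches(volume, stack_size, threshold=90):
    """True when usage is strictly above the threshold, like `where used>90`."""
    return pct_used(volume, stack_size) > threshold


if __name__ == "__main__":
    license_gb = 300  # daily license size from the question
    for daily_gb in (250, 270, 285):  # 285 is a hypothetical heavier day
        print(daily_gb, pct_used(daily_gb, license_gb), breaches(daily_gb, license_gb))
```

Note that a day at exactly 90% (270 GB of 300 GB) does not match, because the comparison is strictly greater than 90.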

The advantage of using historic data is that it allows you to use the predict command. A very basic version, which gives a reasonable estimate of future usage, looks like this (switch to the Visualization tab):

(index=_internal source=*license_usage.log* type="RolloverSummary") 
| bin _time span=1d 
| stats latest(b) AS b latest(stacksz) AS stacksz by slave, pool, _time 
| stats sum(b) AS volumeB max(stacksz) AS stacksz by _time 
| eval pctused=round(((volumeB / stacksz) * 100),2) 
| timechart span=1d max(pctused) AS "used" fixedrange=false
| predict used
If my comment helps, please give it a thumbs up!

anandhalagaras1
Path Finder

@Nick,

With the first query, the results come back as events and the Statistics tab is empty. I searched the last 30 days, but the data is shown only as events, with no statistics. Usage exceeded 90% yesterday as well, but I still can't retrieve that data.

Similarly, with the second query, I expect predict used to forecast future trends, i.e. to use the previous statistics and give a forecast for March and April 2020.
