Splunk Search

Using external lookup and mstats together

zahorans
New Member

Hi All,

I have a search like this:

| mstats span=1d sum(_value) as "ClosedTime" WHERE index=metrics_prod metric_name=com.foo.timeClosed | eval ClosedTimeinMin = ((ClosedTime/1000)/60) | table _time ClosedTimeinMin

It basically shows how long a device in a shop was closed (out of order) a day:

"_time",ClosedTimeinMin
"2019-09-25T00:00:00.000+0200","0.27218333333"
"2019-09-26T00:00:00.000+0200","528.49098333333"
"2019-09-27T00:00:00.000+0200","1077.4227000000"
"2019-09-30T00:00:00.000+0200","3410.40781666667"
"2019-10-01T00:00:00.000+0200","533.04851666667"

The problem is that it contains those time periods as well when the shop was closed. When the shop is closed and the device is off it is not a problem so this time period should not be in the report. I can call a REST API which returns the opening hours of the shop. This Python script (which calls the REST API) is written according to the rules described here (external lookup): https://docs.splunk.com/Documentation/Splunk/7.3.1/Knowledge/Configureexternallookups
It was written by me so it can return the opening hours in any format. How could I use this script in the search in order to exclude the hours when the shop was closed? Can I use the "lookup" command together with "mstats" command? And if yes how should I phrase the query in order to achieve the desired result?

Thanks,
Ivan

0 Karma

ololdach
Builder

Hi zahorans,

it should work like this and my apologies for using "meta-code" to sketch the idea:

| mstats span=1d sum(_value) as "ClosedTime" WHERE index=metrics_prod metric_name=com.foo.timeClosed
| <external.py returning fields openingTime, closingTime as epoch (integer)>    <<< Add the opening hours 
| where _time>=openingTime AND _time<=closingTime    <<< Select only the events that fall between open and close
| stats sum(ClosedTimeinMin) as TotalOutage <<< Sum up all outages to get the total Outage time 

Alternatively, you could add some additional parameter to the metrics, detailing the failed device/source and add a by to the stats to get the outages per device.

This is just a rough sketch. To make it perfect, you would rather test if the end of the outage was inside the opening hours, clipping the outage length to the openingTime whenever the outage started before store open. The same could be done for all outages that start while open: clip their length at shop close. It gets tricky if an outage spans several days... but you get the point.

Hope it helps
Oliver

0 Karma

zahorans
New Member

Hi Oliver,
Thanks for your answer. I am waiting for our Splunk admin to let me upload the script then I try what you suggested.
Cheers,
Ivan

0 Karma

zahorans
New Member

I tried to use an already existing lookup though and it didn't work:

    | mstats span=1d sum(_value) as "ClosedTime" WHERE index=till_sre_metrics_prod metric_name=com.tesco.ui.prod.timeOnClosedPage | lookup store_lookup.csv host AS host OUTPUTNEW store_number AS store_number | fields store_number

the "store_number" field doesn't appear. Why? "Host" field should be available even in case of stats.

0 Karma

zahorans
New Member

It might be because of this statement: "You cannot use automatic lookups with metrics data."
https://docs.splunk.com/Documentation/Splunk/7.3.1/Metrics/Search

0 Karma

zahorans
New Member

If this is true then I cannot really solve my original problem.

0 Karma

zahorans
New Member

Thanks Oliver,
I am waiting for our Splunk admin to let me upload the script then I try what you suggested.
Cheers,
Ivan

0 Karma

zahorans
New Member

I have the feeling that it is not even possible this way. Any other technique/extension/plugin/etc which might work?

0 Karma
Get Updates on the Splunk Community!

Welcome to the Splunk Community!

(view in My Videos) We're so glad you're here! The Splunk Community is place to connect, learn, give back, and ...

Tech Talk | Elevating Digital Service Excellence: The Synergy of Splunk RUM & APM

Elevating Digital Service Excellence: The Synergy of Real User Monitoring and Application Performance ...

Adoption of RUM and APM at Splunk

    Unleash the power of Splunk Observability   Watch Now In this can't miss Tech Talk! The Splunk Growth ...