Hi,
I am collecting memory data from Windows servers in Splunk every minute:
index=main sourcetype="Perfmon:*" counter="Available MBytes" host=* Value=*
| rex field=host "(?<store>\d+)(?<hostType>\w)$"
| stats avg(Value) as AvgRAMAvail_MB by host,hostType
My requirement is to mark a server YELLOW or RED based on samples taken over different time intervals:
YELLOW: if memory usage stays above 75% for 5 minutes.
RED: if available memory stays below 100 MB across a 15-minute sample.
Further down in the query I calculate the percentage of RAM used with a join. The final query, which also contains the RED/YELLOW/GREEN logic, looks like this:
index=main sourcetype="Perfmon:*" counter="Available MBytes" host=* Value=* earliest=-15m@m latest=@m
| rex field=host "(?<field1>\d+)(?<hostType>\w)$"
| stats avg(Value) as AvgRAMAvail_MB by host,hostType
| join host
[ search index=main host=* sourcetype="WMI:HardwareStats" earliest=-1d
| stats latest(TotalPhysicalMemory) as RAMTotal_bytes by host]
| eval RAMTotal_GB=RAMTotal_bytes/1024/1024/1024
| eval RAMAvail_GB = AvgRAMAvail_MB/1024
| eval RAMUsed_GB = RAMTotal_GB-RAMAvail_GB
| eval PercentageRamUsed = round((RAMUsed_GB/RAMTotal_GB) *100,2)
| eval RAG=if(PercentageRamUsed>=90,"RED",if(PercentageRamUsed>=80,"YELLOW","GREEN"))
| eval _time=now()
Now my question is: how can I combine the RED and YELLOW logic above in the same query, given that the sampling is over different time intervals?
I wrote the query below to set the RAG status to RED if available memory is below 100 MB for three samples over the last 15 minutes.
index=main sourcetype="Perfmon:*" counter="Available MBytes" host=* Value=* earliest=-15m@m latest=@m
| bin _time span=5m
| rex field=host "(?<field1>\d+)(?<hostType>\w)$"
| stats avg(Value) as AvgRAMAvail_MB by _time,host,hostType
| join host
[ search index=main host=* sourcetype="WMI:HardwareStats" earliest=-1d
| stats latest(TotalPhysicalMemory) as RAMTotal_bytes by host]
| eval RAMTotal_MB=RAMTotal_bytes/1024/1024
| eval RAMUsed_MB = (RAMTotal_MB-AvgRAMAvail_MB)
| eval PercentageRamUsed = round((RAMUsed_MB/RAMTotal_MB) *100,2)
| eval Avg_Level=case(AvgRAMAvail_MB>100,"GT_100",AvgRAMAvail_MB<100,"LT_100",1=1,"Default")
| stats count(eval(Avg_Level="GT_100")) as GT_100 count(eval(Avg_Level="LT_100")) as LT_100 by host
| eval RAG=if(LT_100>=3,"RED","GREEN")
Similarly, I can write another search to check the percentage of memory used over the last 5 minutes, but the big question is how to merge the two into a single search given the different time intervals.
Here is the query I wrote to check the 75% usage threshold:
index=main sourcetype="Perfmon:*" counter="Available MBytes" host=* Value=* earliest=-5m@m latest=@m
| rex field=host "(?<field>\d+)(?<hostType>\w)$"
| stats avg(Value) as AvgRAMAvail_MB by host,hostType
| join host
[ search index=main host=* sourcetype="WMI:HardwareStats" earliest=-1d
| stats latest(TotalPhysicalMemory) as RAMTotal_bytes by host]
| eval RAMTotal_MB=RAMTotal_bytes/1024/1024
| eval RAMUsed_MB = (RAMTotal_MB-AvgRAMAvail_MB)
| eval PercentageRamUsed = round((RAMUsed_MB/RAMTotal_MB) *100,2)
| eval RAG=if(PercentageRamUsed>=75,"YELLOW","GREEN")
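One possible way to merge the two checks is to search the longer 15-minute window once and flag which events fall inside the last 5 minutes, so both thresholds can be evaluated in a single stats pass. The following is an untested sketch, not a confirmed solution. It assumes "stays below 100 MB" means every sample in the 15 minutes is below 100 MB (so the maximum is below 100), and "stays above 75%" means every sample in the last 5 minutes is above 75% used (so the maximum available memory in that window gives the minimum usage). The field names in_last_5m, MaxAvail15m_MB, MaxAvail5m_MB and MinPctUsed5m are made up for illustration.

```spl
index=main sourcetype="Perfmon:*" counter="Available MBytes" host=* Value=* earliest=-15m@m latest=@m
| rex field=host "(?<store>\d+)(?<hostType>\w)$"
| eval in_last_5m=if(_time>=relative_time(now(),"-5m@m"),1,0)
| stats max(Value) as MaxAvail15m_MB max(eval(if(in_last_5m=1,Value,null()))) as MaxAvail5m_MB by host,hostType
| join host
    [ search index=main host=* sourcetype="WMI:HardwareStats" earliest=-1d
    | stats latest(TotalPhysicalMemory) as RAMTotal_bytes by host]
| eval RAMTotal_MB=RAMTotal_bytes/1024/1024
| eval MinPctUsed5m=round(((RAMTotal_MB-MaxAvail5m_MB)/RAMTotal_MB)*100,2)
| eval RAG=case(MaxAvail15m_MB<100,"RED",MinPctUsed5m>75,"YELLOW",true(),"GREEN")
```

Because case() returns the first matching branch, RED takes priority over YELLOW when both conditions hold. To keep the averaging used in the queries above instead of the strict "every sample" reading, max could be swapped for avg in both stats expressions.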