Splunk Search

How to display certain stats values command in a search?

jip31
Motivator

Hello

I use the stats command below
but some process_name have no process_cpu_used_percent value
So how to do for displaying in my stats values command only the process_name which have a process_cpu_used_percent?
thanks in advance

| stats latest(time) as time, values(COUNTRY) as COUNTRY, values(TOWN) as TOWN, values(SITE) as SITE, values(ROOM) as ROOM, values(process_name) as process_name, values(process_cpu_used_percent) as process_cpu_used_percent by host
Tags (2)
0 Karma
1 Solution

FrankVl
Ultra Champion

Filtering the results for only those containing process_cpu_used_percent values as @harsmarvania57 suggests would be a good start. There is still another issue with your approach though. values() returns the values of that field in lexicographic order, which means, that you loose the correlation between process_name and the process_cpu_used_percent. You get a list of process names and a list of cpu percentages, but have no way of telling which belongs to which.

The following would be a better way to get the latest process_cpu_used_percent value for each process on each host.

<yourBaseSearch>
 | where isnotnull(process_cpu_used_percent)
 | stats latest(time) as time, values(COUNTRY) as COUNTRY, values(TOWN) as TOWN, values(SITE) as SITE, values(ROOM) as ROOM,  latest(process_cpu_used_percent) as process_cpu_used_percent by host,process_name

Note: I moved the process_name to the by clause and changed values(process_cpu_used_percent) to latest(process_cpu_used_percent).

Instead of doing | where isnotnull(process_cpu_used_percent) you can also simply add process_cpu_used_percent=* to your initial search (assuming this is a field that is present in your events and not the result of some intermediate calculation).

View solution in original post

0 Karma

FrankVl
Ultra Champion

Filtering the results for only those containing process_cpu_used_percent values as @harsmarvania57 suggests would be a good start. There is still another issue with your approach though. values() returns the values of that field in lexicographic order, which means, that you loose the correlation between process_name and the process_cpu_used_percent. You get a list of process names and a list of cpu percentages, but have no way of telling which belongs to which.

The following would be a better way to get the latest process_cpu_used_percent value for each process on each host.

<yourBaseSearch>
 | where isnotnull(process_cpu_used_percent)
 | stats latest(time) as time, values(COUNTRY) as COUNTRY, values(TOWN) as TOWN, values(SITE) as SITE, values(ROOM) as ROOM,  latest(process_cpu_used_percent) as process_cpu_used_percent by host,process_name

Note: I moved the process_name to the by clause and changed values(process_cpu_used_percent) to latest(process_cpu_used_percent).

Instead of doing | where isnotnull(process_cpu_used_percent) you can also simply add process_cpu_used_percent=* to your initial search (assuming this is a field that is present in your events and not the result of some intermediate calculation).

0 Karma

jip31
Motivator

i candoing | where process_cpu_used_percent=* because I already doing | where process_cpu_used_percent>80
you code works fine even if it would be better for me to have in a same line all the process where process_cpu_used_percent has a value by host 😉

0 Karma

FrankVl
Ultra Champion

Yeah, as mentioned in one of my other comments: filtering for process_cpu_used_percent is not needed if you already do | where process_cpu_used_percent>80. But you can simply move that to your initial search instead of a separate where command.

But take a look at my other comments as well, because your approach (especially the dedup) still seems weird.

There are ways to get it on a single line, but using values() is not the best way, as (like I mentioned) you loose track of which percentage was for which process.

0 Karma

jip31
Motivator

sorry I have a lot of misundestanding because the language
ok for dedup for the rest I do a synthesis :

In my initial dashboard I have now :

[| inputlookup host.csv 
    | table host] index="ai-wkst-perfmon-fr" sourcetype="perfmonmk:process" 
| bucket _time span=3m 
| where process_cpu_used_percent>80 

| lookup lookup_cmdb_fo_all.csv HOSTNAME as host output SITE 
| search SITE=$tok_filtersite|s$ 
| stats count(process_name) as Total by host
| sort -Total limit=10

In the drilldown I have :

[| inputlookup host.csv 
    | table host] index="ai-wkst-perfmon-fr" sourcetype="perfmonmk:process" 
| where process_cpu_used_percent>80 
| lookup lookup_cmdb_fo_all.csv HOSTNAME as host output SITE COUNTRY TOWN ROOM 
| where SITE=$SITE$ 
| eval time = strftime(_time, "%m/%d/%Y %H:%M") 
| eval process_cpu_used_percent=round(process_cpu_used_percent,2). " %" 
| stats latest(time) as time, values(COUNTRY) as COUNTRY, values(TOWN) as TOWN, values(SITE) as SITE, values(ROOM) as ROOM, latest(process_cpu_used_percent) as process_cpu_used_percent by host process_name
| table time host COUNTRY TOWN SITE ROOM process_name process_cpu_used_percent

is there still weird things??
I also doesnt understand to things : I havent the same number of events in the 2 searches and why I am obliget to used also | where process_cpu_used_percent>80 in my drilldown?
Normally the data have been already filtered in the dashboard source no??

0 Karma

FrankVl
Ultra Champion

You mean the Total in the dashboard is larger than when you manually count the number of process names listed for a certain host in your drilldown search?

That makes sense, as you do a count(process_name), which simply counts the number of events with a value in the process_name field. It doesn't count unique process names. If you want to count unique process names by host, you need to use dc(process_name).

A drilldown is just a new search ran on its own (but possibly parameterized by values from your dashboard).

0 Karma

jip31
Motivator

thanks you are the best
and sorry for all my questions but i am rookie have never been teached and have no support around me....

0 Karma

harsmarvania57
Ultra Champion

Hi,

Please try below query

<yourBaseSearch>
| where isnotnull(process_cpu_used_percent)
| stats latest(time) as time, values(COUNTRY) as COUNTRY, values(TOWN) as TOWN, values(SITE) as SITE, values(ROOM) as ROOM, values(process_name) as process_name, values(process_cpu_used_percent) as process_cpu_used_percent by host
0 Karma

jip31
Motivator

hi
it doesnt works

[| inputlookup host.csv 
    | table host] index="x" sourcetype="perfmonmk:process" 
| where process_cpu_used_percent>80 
| lookup x.csv HOSTNAME as host output SITE COUNTRY TOWN ROOM 
| where SITE=$SITE$ 
| eval time = strftime(_time, "%m/%d/%Y %H:%M") 
| dedup process_name 
| eval process_cpu_used_percent=round(process_cpu_used_percent,2). " %" 
| where isnotnull(process_cpu_used_percent) 
| stats latest(time) as time, values(COUNTRY) as COUNTRY, values(TOWN) as TOWN, values(SITE) as SITE, values(ROOM) as ROOM, values(process_name) as process_name, values(process_cpu_used_percent) as process_cpu_used_percent by host
0 Karma

harsmarvania57
Ultra Champion

As you are using | where process_cpu_used_percent>80 initially, I don't think | where isnotnull(process_cpu_used_percent) is necessary in this case because you are already filtering process_cpu_used_percent initially with values greater 80.

We would like to require some rawdata to test this (please mask any sensitive data).

0 Karma

FrankVl
Ultra Champion

| where process_cpu_used_percent>80 already returns only events with this field (and additionally dropping any of them where it is below 80). So I don't see how this can result in output with empty process_cpu_used_percent values???

But perhaps take a look at my answer below for improving your stats command in general, as your current approach is flawed.

0 Karma

FrankVl
Ultra Champion

Also: doing a dedup on only process_name doesn't make much sense if you want to get results for each host. I guess you will want to do | dedup process_name host. Then again: if you have already done that, there is no point in doing a stats like that. As you already have the latest line for each host,process_name pair.

0 Karma
Get Updates on the Splunk Community!

Good Sourcetype Naming

When it comes to getting data in, one of the earliest decisions made is what to use as a sourcetype. Often, ...

See your relevant APM services, dashboards, and alerts in one place with the updated ...

As a Splunk Observability user, you have a lot of data you have to manage, prioritize, and troubleshoot on a ...

Splunk App for Anomaly Detection End of Life Announcement

Q: What is happening to the Splunk App for Anomaly Detection?A: Splunk is officially announcing the ...