Solved: Re: Returning Non-Duplicates from a table

brajaram · ‎06-13-2018

My data tells me if counts on a specific server are timing out, and we are trying to set up an alert for when this occurs.

Our query currently is:

index=... responseTime>5000 | bin span=15m _time | stats count by host serviceName _time | search count > 50

This tells us when, within a given 15-minute bucket, a service on a host has more than 50 events with a response time over 5 s (timeouts). However, what we really want is to identify when this occurs multiple times across a larger time window. Our goal is to have this query run over the last 30 minutes, every 30 minutes, and let us know when a host shows up more than once.

The result table looks like this:

host   serviceName  _time  count
Host1  Service1     12:00  80
Host1  Service1     12:15  75
Host2  Service1     12:15  55
Host3  Service1     12:00  80
Host3  Service2     12:00  80

What we want is to pull out only the duplicated hosts from this table. Those are hosts that are timing out repeatedly across this window, or hosts that are timing out across multiple services, and both cases are important for us to identify. The rows I want returned are the Host1 and Host3 rows. How do I remove all rows from this table whose host does not appear more than once?

brajaram · ‎06-13-2018

Found the solution. If I append

| eventstats count AS countHost by host | search countHost>1 | fields - countHost
It adds the count of each host name to a new field in the table, which I can then filter on.
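
For reference, the whole search with this appended might look like the sketch below. The index=... placeholder is left elided exactly as in the question, and the line breaks are only for readability.

```spl
index=... responseTime>5000
| bin span=15m _time
| stats count by host serviceName _time
| search count > 50
| eventstats count AS countHost by host
| search countHost > 1
| fields - countHost
```

Against the sample table above, eventstats would attach countHost=2 to the Host1 and Host3 rows and countHost=1 to the Host2 row, so the filter keeps the four Host1 and Host3 rows. Unlike stats, eventstats does not collapse rows, which is why _time and count survive.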

cpetterborg · ‎06-13-2018

I'd try the simplest method for this, which I think would be:

... | stats count as cnt by Name, serviceName | search cnt>1

This would take your table and do a count by the Name and serviceName, and then return only those that have more than one instance in the table. You would lose the _time and count, however, in your result.

You may also want to adjust your search so that you have overlapping searches (like every 15 minutes using the last 30 minutes of data). That will prevent misses where one occurrence falls on one side of the search-period boundary and the other falls on the other side.
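
One way to sketch the overlapping window is with time modifiers in the base search. This assumes the alert is scheduled every 15 minutes (for example with a cron expression of */15 * * * *); the schedule and the @m snapping are assumptions, not details given in the thread.

```spl
index=... earliest=-30m@m latest=@m responseTime>5000
| bin span=15m _time
| stats count by host serviceName _time
| search count > 50
```

If the runs land on quarter-hour boundaries, each 30-minute window would cover two whole 15-minute bins, and every bin would be examined by two consecutive runs.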

brajaram · ‎06-13-2018

Very good idea about the overlapping searches, will definitely do that. Unfortunately, I do need the _time and count fields. I did find a way around it, however, with eventstats.

sushantmhatre · ‎06-13-2018

Can you provide more clarification on this question?

brajaram · ‎06-13-2018

Essentially, if I have a table that contains three fields (Name, Value, Count), I want to return only the rows in which Name occurs more than once. Edited the question to hopefully make it clearer.
