Solved: Returning Non-Duplicates from a table

brajaram · ‎06-13-2018

My data tells me if counts on a specific server are timing out, and we are trying to set up an alert for when this occurs.

Our query currently is:

index=... responseTime>5000 | bin span=15m _time | stats count by host serviceName _time | search count > 50

This tells us when, within a given 15-minute window, a given service on a host has more than 50 events with a response time over 5s (timeouts). However, what we really want is to identify when this occurs multiple times across a larger time window. Our goal is to run this query every 30 minutes over the last 30 minutes and have it let us know when a host shows up more than once.

The result table looks like this:

host   serviceName  _time  count
Host1  Service1     12:00  80
Host1  Service1     12:15  75
Host2  Service1     12:15  55
Host3  Service1     12:00  80
Host3  Service2     12:00  80

What we want is to pull out only the rows for duplicated hosts from this table. A duplicated host is one that is timing out repeatedly across this window or timing out across multiple services, both of which are important for us to identify. The rows I want returned are bolded (the Host1 and Host3 rows). How do I remove all rows from this table whose host does not appear more than once?

brajaram · ‎06-13-2018

Found the solution. If I append

| eventstats count AS countHost by host | search countHost>1 | fields - countHost

it adds the number of rows for each host as a new field in the table, which I can then filter on.
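
For reference, here is a sketch of the whole pipeline with the eventstats step appended to the query from the question. Nothing new is introduced; the index=... placeholder is left as it was in the original, and the removal flag on fields is written with a plain ASCII hyphen.

```spl
index=... responseTime>5000
| bin span=15m _time
| stats count by host serviceName _time
| search count > 50
| eventstats count AS countHost by host
| search countHost>1
| fields - countHost
```

On the sample table above, eventstats would give countHost=2 for Host1 and Host3 and countHost=1 for Host2, so the Host2 row is dropped and the other four rows keep their _time and count values.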

cpetterborg · ‎06-13-2018

I'd try the simplest method for this, which I think would be:

... | stats count as cnt by Name, serviceName | search cnt>1

This would take your table and do a count by the Name and serviceName, and then return only those that have more than one instance in the table. You would lose the _time and count, however, in your result.

You may also want to adjust your search so that consecutive runs overlap (for example, running every 15 minutes over the last 30 minutes of data). That prevents missing a case where one occurrence falls on one side of a search period boundary and the next falls on the other.
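
A minimal sketch of the overlapping window, applied to the base query from the question. It assumes the alert is scheduled on a 15-minute cron such as */15 * * * * (the schedule is set on the alert itself, not in the query) and that each run starts within the same minute as its cron tick, so that snapping to the minute lines the window up with the 15-minute bins. The inline earliest and latest time modifiers are my addition, not part of the original query.

```spl
index=... responseTime>5000 earliest=-30m@m latest=@m
| bin span=15m _time
| stats count by host serviceName _time
| search count > 50
```

Under those assumptions each run covers two full 15-minute bins, and every bin is examined by two consecutive runs, once paired with the bin before it and once with the bin after it.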

brajaram · ‎06-13-2018

Very good idea about the overlapping searches, will definitely do that. Unfortunately, I do need the _time and count fields. I did find a way around it, however, with eventstats.

sushantmhatre · ‎06-13-2018

Can you provide more clarification on this question?

brajaram · ‎06-13-2018

Essentially, if I have a table that contains three fields (Name, Value, Count), I want to return only the rows in which Name occurs more than once. I edited the question to hopefully make it clearer.
