Splunk Search

dedup only when values match in two fields

CharterBT
Explorer

Here's an interesting problem. I need to write a query where Splunk removes an event only when the values of two specific fields both match those of an event it has already returned. For example, a mocked-up sample of my results looks like this:

0.0.0.0 test
0.0.0.0 pass
0.0.0.0 pass

I'd like Splunk to only remove the second instance of "0.0.0.0 pass" while keeping the first instance as well as the "0.0.0.0 test" in my results.

Is there an easy way to do this? If it helps, the field name for the numbers is src and for the words is cs5. Any help is appreciated.

0 Karma

alacercogitatus
SplunkTrust

Dedup should be able to do this. If you post a little more about your end goal, there may be a more optimized approach. For example, do you want counts of how many times this happens?

your_search | dedup 2 src

http://docs.splunk.com/Documentation/Splunk/6.0/SearchReference/Dedup
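
For reference, dedup with a count keeps the first N events for each value of the listed fields, so dedup 2 src keeps up to two events per src regardless of cs5, while dedup src cs5 keeps the first event for each src/cs5 combination. The following is a minimal Python sketch of those two behaviours, run outside Splunk. The helper name dedup, the dict-based events, and the reordered sample rows are illustrative assumptions, not Splunk APIs or real data.

```python
from collections import defaultdict

def dedup(events, fields, keep=1):
    """Keep the first `keep` events for each combination of `fields`.

    Hypothetical helper mimicking the documented behaviour of
    Splunk's `dedup [N] field-list`. Events missing any listed
    field are dropped, as dedup does by default.
    """
    seen = defaultdict(int)
    kept = []
    for event in events:
        if any(f not in event for f in fields):
            continue
        key = tuple(event[f] for f in fields)
        if seen[key] < keep:
            seen[key] += 1
            kept.append(event)
    return kept

# Mock rows from the question, reordered so the difference is visible.
events = [
    {"src": "0.0.0.0", "cs5": "pass"},
    {"src": "0.0.0.0", "cs5": "pass"},
    {"src": "0.0.0.0", "cs5": "test"},
]

by_pair = dedup(events, ["src", "cs5"])      # like: dedup src cs5
first_two = dedup(events, ["src"], keep=2)   # like: dedup 2 src
```

In this sketch by_pair retains one "pass" and one "test" row, while first_two retains the two "pass" rows and drops "test", which is why the field-pair form is the closer match to the question.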

0 Karma

lukejadamec
Super Champion

To find the version, from Splunkweb in the upper right, click About.

0 Karma

CharterBT
Explorer

Not sure of the version, but it's not 6 yet. Thanks for the tips. I'll try them and let you know how it goes.

0 Karma

alacercogitatus
SplunkTrust

That's where stats count by cs5 src works a little faster: stats is done at the indexer, while dedup is done at the search head. According to the docs, dedup src cs5 should be doing the same thing. What version are you using?

CharterBT
Explorer

One other thing. I tried "dedup src, cs5", but it didn't retain any new "src" records after it found its first duplicate src value. I need the dedup to be a little smarter and only remove duplicate entries of the src/cs5 combination.

0 Karma

alacercogitatus
SplunkTrust

Then a better search is:

your_search | stats dc(cs5) as DistinctInfections by src

This gives you each individual source and how many different infections they have over the time range. If you want how many of each infection per src, do:

your_search | stats count by cs5 src
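
To make the two stats searches concrete, here is a rough Python sketch of what each aggregation computes, using the three mock rows from the question. It runs outside Splunk, and the variable names and dict-based events are illustrative assumptions.

```python
from collections import Counter, defaultdict

# Mock rows from the question: src is the computer, cs5 the malware type.
events = [
    {"src": "0.0.0.0", "cs5": "test"},
    {"src": "0.0.0.0", "cs5": "pass"},
    {"src": "0.0.0.0", "cs5": "pass"},
]

# Roughly: stats count by cs5 src
# One entry per cs5/src combination, with the number of matching events.
count_by_pair = Counter((e["cs5"], e["src"]) for e in events)

# Roughly: stats dc(cs5) as DistinctInfections by src
# One entry per src, with the number of distinct cs5 values seen.
infections = defaultdict(set)
for e in events:
    infections[e["src"]].add(e["cs5"])
distinct_infections = {src: len(names) for src, names in infections.items()}
```

Each key of count_by_pair is one src/cs5 combination, so the rows returned by the count search also form the deduplicated list of combinations; the count column can simply be ignored if it is not needed.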

CharterBT
Explorer

No, I don't need to know how many times it repeats.

Each number (src) is a computer, and each word (cs5) is a type of malware. Some computers have multiple infections, so I just need to remove the events where that computer/malware combination has already been identified. My search covers a month-long timeframe, so I don't need to count every time a combination shows up, just that it did at some point.

Does that help?

0 Karma