Here's an interesting problem. I need to write a query where Splunk removes an event when two specific field values match those of an event it has already found. For example, a mocked-up sample of my results shows this:
0.0.0.0       test
0.0.0.0       pass
0.0.0.0       pass
I'd like Splunk to only remove the second instance of "0.0.0.0 pass" while keeping the first instance as well as the "0.0.0.0 test" in my results.
Is there an easy way to do this? If it helps, the field name for the numbers is src and for the words is cs5. Any help is appreciated.
 
Dedup should be able to do this. If you post a little more about your end goal, there may be a more optimized approach. For example, do you want counts of how many times this happens?
your_search | dedup 2 src
http://docs.splunk.com/Documentation/Splunk/6.0/SearchReference/Dedup
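A note on the search above: dedup 2 src keeps the first two events for each src value regardless of cs5, so it only happens to fit the three-line sample. A minimal sketch of deduplicating on the src/cs5 pair instead, assuming both fields are extracted on every event (your_search is the same placeholder used above):

```spl
your_search
| dedup src cs5
| table src cs5
```

Listing several fields makes dedup keep the first event it sees for each distinct combination of values, so 0.0.0.0/test and the first 0.0.0.0/pass would both survive. The table command is optional and only trims the output to the two fields.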
To find the version, click About in the upper right of Splunk Web.
Not sure of the version, but it's not 6 yet. Thanks for the tips. I'll try them and let you know how it goes.
 
That's where stats count by cs5 src works a little faster: stats is done at the indexer, while dedup is done at the search head. According to the docs, dedup src cs5 should be doing the same thing. What version are you using?
One other thing. I tried "dedup src, cs5", but it didn't retain any new "src" records after it found its first duplicate src value. I need the dedup to be a little smarter and only remove duplicate entries of the src/cs5 combination.
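If the multi-field form of dedup does not behave as expected, one possible workaround is to build a single key from both fields and deduplicate on that. This is only a sketch: the field name pair and the "|" separator are made up for illustration, and it assumes src and cs5 are both present on every event.

```spl
your_search
| eval pair=src."|".cs5
| dedup pair
| fields - pair
```

As far as I know, an event missing either field ends up with no pair value, and dedup discards such events by default, so it is worth checking that cs5 is actually extracted everywhere before relying on this.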
 
Then a better search is: your_search | stats dc(cs5) as DistinctInfections by src. This gives you each individual source and how many different infections it has over the time range. If you want how many of each infection per src, use your_search | stats count by cs5 src.
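If the count itself is not needed, the stats form can also be used purely to list each combination once. A rough sketch, assuming the goal is one row per src/cs5 pair:

```spl
your_search
| stats count by src cs5
| fields src cs5
```

For the sample in the question this should return two rows, one for 0.0.0.0/test and one for 0.0.0.0/pass. Note that stats returns a summary table, not the original events, so any other fields on the events are dropped.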
No, I don't need to know how many times it repeats.
Each # value is a computer, and each word value is a type of malware. Some computers have multiple infections, so I just need to remove the instances where that computer/malware combination has already been identified. My search is covering a month-long timeframe, so I don't need to count every time it shows up, just that it did at some point.
Does that help?
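For a month-long window like this, a variant that records only when each combination first appeared may be enough. A sketch, where first_seen is a made-up field name and _time is the standard event timestamp:

```spl
your_search
| stats min(_time) as first_seen by src cs5
| convert ctime(first_seen)
```

This yields one row per computer/malware pair with the earliest time it showed up, without counting the repeats.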
