Splunk Search

How to find all events where at least one column has duplicate values?

retro-bloke
Explorer

Through a dbx query, I'm pulling out several columns, among which include account_email, true_ip, device_id, and request_id. I want to see only the events where any of these columns have duplicate values. For example, if 3 events have the account_email of test@gmail.com, then I'd want to see all 3 events, regardless of what's in the other columns. The same applies for true_ip, device_id, and request_id. How can I do this?

 

I tried something like this:

dbxquery=query_here | stats count by account_email, true_ip, account_telephone, device_id, request_id, session_id, digital_id | where count > 1

But this doesn't work because this would only extract occurrences where the combination of all those columns happened more than once. But as long as at least one of the columns have a duplicate occurrence, I'd want to see it.

Labels (4)
0 Karma
1 Solution

yeahnah
Motivator

Hi @retro-bloke 

This is a perfect case for eventstats command.  Here's an example of how to do it

| makeresults 
| eval _raw="account_email,true_ip,device_id,request_id
bob@foobar.com,1.1.1.1,abc,123
sid@foobar.com,1.1.1.2,def,456
bob@foobar.com,1.1.1.2,ghi,789
bob@foo.com,1.1.1.3,jkl,999
"
| multikv forceheader=1
``` ^^^ create dummy events ^^^ ```
| eventstats count AS dups:account_email BY account_email
| eventstats count AS dups:true_ip BY true_ip
| eventstats count AS dups:device_id BY device_id
| eventstats count AS dups:request_id BY request_id
| addtotals dups*
| where Total > 4
| table account_email,true_ip,device_id,request_id

 Hope that helps

View solution in original post

yeahnah
Motivator

Hi @retro-bloke 

This is a perfect case for eventstats command.  Here's an example of how to do it

| makeresults 
| eval _raw="account_email,true_ip,device_id,request_id
bob@foobar.com,1.1.1.1,abc,123
sid@foobar.com,1.1.1.2,def,456
bob@foobar.com,1.1.1.2,ghi,789
bob@foo.com,1.1.1.3,jkl,999
"
| multikv forceheader=1
``` ^^^ create dummy events ^^^ ```
| eventstats count AS dups:account_email BY account_email
| eventstats count AS dups:true_ip BY true_ip
| eventstats count AS dups:device_id BY device_id
| eventstats count AS dups:request_id BY request_id
| addtotals dups*
| where Total > 4
| table account_email,true_ip,device_id,request_id

 Hope that helps

retro-bloke
Explorer

Nailed it @yeahnah , thanks!!

 

Next challenge, is it possible to group the table of results by their common field?

0 Karma
Get Updates on the Splunk Community!

Announcing Scheduled Export GA for Dashboard Studio

We're excited to announce the general availability of Scheduled Export for Dashboard Studio. Starting in ...

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics GA in US-AWS!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...