How to delete duplicate logs?

prateedshetty
Path Finder

I've uploaded the same log twice (using the drag-and-drop option in Add Data), and now my queries return duplicate results. How do I delete one copy of the log?

jkat54
SplunkTrust
<your search for relevant events>
| eval eid=_cd
| search [<same search for relevant events> | streamstats count by _raw | search count>1 | eval eid=_cd | fields eid]
| delete

Stolen from here: https://answers.splunk.com/answers/69924/how-to-delete-duplicate-events.html
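
Because delete cannot be undone from the search bar, it may be worth doing a dry run first. The sketch below is the same search with the final delete swapped for a count, so it only reports how many events would be removed. The field name events_that_would_be_deleted is made up for illustration, and the sketch assumes both the outer search and the subsearch run over the same fixed time range.

```spl
<your search for relevant events>
| eval eid=_cd
| search [<same search for relevant events> | streamstats count by _raw | search count>1 | eval eid=_cd | fields eid]
| stats count AS events_that_would_be_deleted
```

If the number looks right (for a file uploaded twice, roughly half of the matching events), replace the last line with delete.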

mclane1
Path Finder

I use something like this:

<your search>
| streamstats count by _raw
| where count>1
| eval eid=_cd
| table eid
| streamstats count
| outputlookup delete_dupplicate_byeid.csv.gz

This selects the duplicate events and numbers them.

Afterwards, you can delete them in batches, increasing tmp by one on each run:

<your search> 
| eval eid=_cd
| search [| inputlookup delete_dupplicate_byeid.csv.gz 
          | eval tmp=0 |  where count>=tmp*10000 AND count<=tmp*10000+9999 
          | fields eid |  format]
| delete
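
To see how many runs the batching above needs, the lookup itself can be queried. This is a minimal sketch that assumes the lookup was written exactly as shown earlier, so that count numbers the duplicates 1, 2, 3, and so on; the field names total and runs are made up for illustration.

```spl
| inputlookup delete_dupplicate_byeid.csv.gz
| stats max(count) AS total
| eval runs=floor(total/10000)+1
```

For example, a total of 25,000 gives runs=3, meaning the delete search is run with tmp=0, tmp=1, and tmp=2. The batch size of 10,000 presumably reflects the default limit on the number of results a subsearch returns.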

 

woodcock
Esteemed Legend

Like this:

Your Search That Shows Duplicates AND NOTHING ELSE | dedup _raw | delete

You will probably have to grant your user the can_delete role to run the delete command; by default, not even the admin role has this capability.
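
The "AND NOTHING ELSE" part matters: dedup _raw keeps one event per distinct _raw value, so an event that appears only once would be passed to delete and lost entirely. Since the same file was uploaded twice, every event would be expected to appear exactly two times. The following is a sketch of a sanity check under that assumption, to be run before the delete:

```spl
Your Search That Shows Duplicates AND NOTHING ELSE
| stats count by _raw
| where count!=2
```

If this returns no rows, every event in the result set has exactly one extra copy. Any rows that do come back (a count of 1, or 3 and more) deserve a closer look first, for instance because the original log legitimately contains identical lines.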

kmccririe_splun
Splunk Employee

You can use the delete command. https://docs.splunk.com/Documentation/Splunk/6.5.2/SearchReference/Delete

If you can write a search that limits the data to one copy, then you can pipe it to the delete command so you won't see both.

This only hides the events from search results; it does not remove the data from disk on the indexer.

You can also clean an index: if one copy of the data is in a different index from the other, you can wipe all the data from one of those indexes. http://docs.splunk.com/Documentation/Splunk/6.5.2/Indexer/RemovedatafromSplunk#Remove_data_from_one_...

Yasaswy
Contributor

There is no log file to delete; your data has been indexed. To get rid of duplicate data in the index, you can run a search that identifies the duplicate events and pipe it to the delete command.

https://docs.splunk.com/Documentation/Splunk/6.5.2/SearchReference/Delete

bestSplunker
Contributor

This search is very slow.
