Splunk Search

How to delete duplicate logs?

prateedshetty
Path Finder

I've uploaded the same log twice (using the drag-and-drop option in Add Data), and now my queries return duplicate results. How do I delete one copy of the log?

1 Solution

jkat54
SplunkTrust
<your search for relevant events> | eval eid=_cd
| search [<same search for relevant events> | streamstats count by _raw | search count>1 | eval eid=_cd | fields eid]
| delete

Stolen from here: https://answers.splunk.com/answers/69924/how-to-delete-duplicate-events.html
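
Since removing data with delete is irreversible, it may be worth previewing what the search above would remove before running it. The following is a minimal sketch, not part of the original answer: it reuses the same streamstats logic but stops short of delete and only counts the events that would be affected. The field names dup_count and events_to_delete are illustrative.

```spl
<your search for relevant events>
| streamstats count AS dup_count BY _raw
| where dup_count > 1
| stats count AS events_to_delete
```

If exactly one extra copy of the log was uploaded, events_to_delete should be roughly half of the total event count for the same search.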


mclane1
Path Finder

I use something like this:

<your search>
| streamstats count by _raw
| where count>1
| eval eid=_cd
| table eid
| streamstats count
| outputlookup delete_dupplicate_byeid.csv.gz

This selects the duplicate events and numbers them.

After that, you can delete them in batches of 10,000, increasing tmp by one (0, 1, 2, ...) on each run:

<your search> 
| eval eid=_cd
| search [| inputlookup delete_dupplicate_byeid.csv.gz 
          | eval tmp=0 |  where count>=tmp*10000 AND count<=tmp*10000+9999 
          | fields eid |  format]
| delete
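
To see how many batches the lookup requires, something like the following could be run first. This is a sketch under the assumption that the lookup was written by the first search above, so its count column holds the running number of each duplicate; last_number and last_tmp are illustrative names. The batched delete would then be run once for each value of tmp from 0 to last_tmp.

```spl
| inputlookup delete_dupplicate_byeid.csv.gz
| stats max(count) AS last_number
| eval last_tmp = floor(last_number / 10000)
```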

 


woodcock
Esteemed Legend

Like this:

Your Search That Shows Duplicates AND NOTHING ELSE | dedup _raw | delete

You will probably have to extend your user's role/permissions to run the delete command (by default, only the can_delete role has this capability).

kmccririe_splun
Splunk Employee

You can use the delete command. https://docs.splunk.com/Documentation/Splunk/6.5.2/SearchReference/Delete

If you can write a search that returns only one copy of the data, you can pipe it to the delete command so that you no longer see both copies.

This only stops the events from appearing in searches; it does not remove the data from disk on the indexer.

You can also clean an index: if one copy of the data is in a different index from the other, you can wipe all of the data from one of those indexes. http://docs.splunk.com/Documentation/Splunk/6.5.2/Indexer/RemovedatafromSplunk#Remove_data_from_one_...
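
To check whether the two copies landed in different indexes (or under different sources), a breakdown like the one below may help. It is only a sketch: it groups the matching events by index, source, and sourcetype and shows when each group was indexed, using the internal _indextime field. The names first_indexed and last_indexed are illustrative.

```spl
<your search for relevant events>
| stats count min(_indextime) AS first_indexed max(_indextime) AS last_indexed BY index, source, sourcetype
| eval first_indexed = strftime(first_indexed, "%Y-%m-%d %H:%M:%S"), last_indexed = strftime(last_indexed, "%Y-%m-%d %H:%M:%S")
```

If both uploads appear in a single row, with index times spanning both uploads, the copies share an index and cleaning the whole index would remove both of them.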


Yasaswy
Contributor

There is no log file to delete; your data has already been indexed. To get rid of duplicate data in the index, you can run a search that identifies the duplicate events and pipe it to the delete command.

https://docs.splunk.com/Documentation/Splunk/6.5.2/SearchReference/Delete
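
A search to identify the duplicates could look like the sketch below, assuming that two events count as duplicates when their raw text is identical. The field name copies is illustrative. Note that the output is aggregated, so it is meant for inspection only and not for piping into delete.

```spl
<your search for relevant events>
| stats count AS copies BY _raw
| where copies > 1
| sort - copies
```

Running the same search again after the delete should return no rows if the cleanup worked.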



bestSplunker
Contributor

This search (from jkat54's answer) is very slow.
