Import data without duplicates

alucarddjin

I have a data set with gaps. I've been given a new set of data to fill the gaps, but the raw file contains some duplicates of what is already in Splunk, and I need a way to import only the non-duplicate data.

So far I've imported the new data into a separate index, used a query to filter out the items that are already in the main index, and then tried the collect command to write the remaining events into another index (a dummy one to start with, so I don't mess up my main index). However, when the data is copied, some of the date fields change format (they turn into epoch values) and the _time field isn't picked up correctly.

Current code:

    (index=main sourcetype="sourcetype1") OR (index=sourceindex)
    | eventstats count by deviceCustomDate1 fileName 
    | search count=1
    | collect index=destinationindex sourcetype=sourcetype1
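
A possible variation on the query above, as an untested sketch: `count=1` also keeps events that exist only in the main index, so those would be copied along with the new ones. Counting distinct indexes per key and keeping only events from the import index avoids that. The field name `index_count` is made up for illustration, and `destinationindex` is assumed to be the dummy target index.

```
(index=main sourcetype="sourcetype1") OR (index=sourceindex)
| eventstats dc(index) AS index_count BY deviceCustomDate1 fileName
| where index_count=1 AND index="sourceindex"
| collect index=destinationindex sourcetype=sourcetype1
```

This assumes deviceCustomDate1 and fileName are extracted in the same format in both indexes. Events that are duplicated within sourceindex itself would still all be copied.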

Then, if I look at the results for one of the records in both indexes, I get this:

  _time                 deviceCustomDate1          index
  2019-05-16 23:47:21   2019/05/16 22:47:21 UTC    sourceindex
  2019-05-17 00:03:29   1558046841000              destinationindex
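
For reference, the two deviceCustomDate1 values in the table above are the same instant: 1558046841000 is epoch time in milliseconds, which corresponds to 2019/05/16 22:47:21 UTC. A minimal sketch to check this in a search, assuming the field names `epoch_ms` and `as_text` (both made up for illustration):

```
| makeresults
| eval epoch_ms=1558046841000
| eval as_text=strftime(epoch_ms/1000, "%Y/%m/%d %H:%M:%S")
```

strftime expects seconds, hence the division by 1000, and it renders the result in the time zone of the user running the search, so the displayed hour may differ from the UTC value.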

Am I missing something? Is collect the right tool to use?

Thanks in advance.
