Import data without duplicates

alucarddjin
Path Finder

I have a data set with gaps in it. I've been given a new set of data to fill those gaps, but the raw file contains some events that are already in Splunk, so I need a way to import only the non-duplicate data.

So far I've imported the new data into a separate index, used a query to remove the items that are already in the main index, then tried the collect command to write the remaining events into another index (a dummy one to start with, so I don't mess up my main index). However, when the data is copied, some of the date values are converted to epoch and the _time field isn't picked up correctly.

Current code:

    (index=main sourcetype="sourcetype1") OR (index=sourceindex)
    | eventstats count by deviceCustomDate1 fileName 
    | search count=1
    | collect index=destinationindex sourcetype=sourcetype1
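To spell out the logic I'm relying on in that search, here is a minimal Python sketch of the same idea: count events per (deviceCustomDate1, fileName) key and keep only the keys seen exactly once. The sample events and the helper names (`event_key`, `non_duplicates`) are made up for illustration. The optional `new_index` filter is not in my search; I've added it here as an assumption about what I actually want, because a count of 1 on its own also keeps events that exist only in main.

```python
from collections import Counter

# Hypothetical sample events; the field values are invented for illustration.
events = [
    {"index": "main", "deviceCustomDate1": "2019/05/16 22:40:00 UTC", "fileName": "a.log"},
    {"index": "sourceindex", "deviceCustomDate1": "2019/05/16 22:40:00 UTC", "fileName": "a.log"},
    {"index": "sourceindex", "deviceCustomDate1": "2019/05/16 22:47:21 UTC", "fileName": "b.log"},
    {"index": "main", "deviceCustomDate1": "2019/05/16 21:00:00 UTC", "fileName": "c.log"},
]

KEY_FIELDS = ("deviceCustomDate1", "fileName")


def event_key(event):
    """Build the comparison key, mirroring 'count by deviceCustomDate1 fileName'."""
    return tuple(event[field] for field in KEY_FIELDS)


def non_duplicates(events, new_index=None):
    """Keep events whose key occurs exactly once across all events.

    If new_index is given, also drop events that are not from that index
    (an extra filter that the original search does not apply).
    """
    counts = Counter(event_key(e) for e in events)
    kept = [e for e in events if counts[event_key(e)] == 1]
    if new_index is not None:
        kept = [e for e in kept if e["index"] == new_index]
    return kept


# Without the index filter, the event that exists only in main is kept too.
print([e["fileName"] for e in non_duplicates(events)])
# -> ['b.log', 'c.log']

# With the filter, only the genuinely new event from the import index remains.
print([e["fileName"] for e in non_duplicates(events, new_index="sourceindex")])
# -> ['b.log']
```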

Then if I look at the results for one of the records in both indexes, I get this:

  _time                 deviceCustomDate1         index
  2019-05-16 23:47:21   2019/05/16 22:47:21 UTC   sourceindex
  2019-05-17 00:03:29   1558046841000             destinationindex
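As a sanity check, the epoch value in the second row does look like the same timestamp expressed in milliseconds since 1970-01-01 UTC. A small Python sketch of the conversion, where the helper name `epoch_ms_to_text` is my own and the output format is just chosen to match the original field:

```python
from datetime import datetime, timezone


def epoch_ms_to_text(epoch_ms):
    """Convert epoch milliseconds to the 'YYYY/MM/DD HH:MM:SS UTC' text form."""
    dt = datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc)
    return dt.strftime("%Y/%m/%d %H:%M:%S UTC")


print(epoch_ms_to_text(1558046841000))
# -> 2019/05/16 22:47:21 UTC
```

So the value itself survives the copy; it is only the representation of deviceCustomDate1 that changes, while _time ends up different from the original event.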

Am I missing something? Is collect the right tool for this?

Thanks in advance.
