Knowledge Management

Import data without duplicates

alucarddjin
Path Finder

I have a missing set of data. I've been given a new set of data to fill the gaps but there are some duplicates in the raw file to what is already in splunk and I need a way to import the non duplicate data.

So far I've managed to import the new data into a separate index and used a query to remove the items that are already in the main index then tired the collect command to put the values into another index (I've used a dummy one to start with so I don't mess up my main index). However when the data is copied it messes up some of the date formats (turns them to epoch) and doesn't pick up the _time field correctly.

Current code:

    (index=main sourcetype="sourcetype1") OR (index=sourceindex" ) 
    | eventstats count by deviceCustomDate1 fileName 
    | search count=1
    | collect index=sourceindex sourcetype=sourcetype1

Then if I look at the results for one of the record in both indexs I get this

  _time               deviceCustomDate1           index
  2019-05-16 23:47:21   2019/05/16 22:47:21 UTC     sourceindex
  2019-05-17 00:03:29   1558046841000                 destinationindex

Am I missing something? Is Collect the right tool to use?

Thanks in advance.

0 Karma
Get Updates on the Splunk Community!

Unlock Database Monitoring with Splunk Observability Cloud

  In today’s fast-paced digital landscape, even minor database slowdowns can disrupt user experiences and ...

Purpose in Action: How Splunk Is Helping Power an Inclusive Future for All

At Cisco, purpose isn’t a tagline—it’s a commitment. Cisco’s FY25 Purpose Report outlines how the company is ...

[Upcoming Webinar] Demo Day: Transforming IT Operations with Splunk

Join us for a live Demo Day at the Cisco Store on January 21st 10:00am - 11:00am PST In the fast-paced world ...