I have a gap in my data. I've been given a new set of data to fill it, but the raw file contains some duplicates of what is already in Splunk, and I need a way to import only the non-duplicate data.
So far I've imported the new data into a separate index and used a query to remove the items that are already in the main index. I then tried the collect command to put the remaining values into another index (I've used a dummy one to start with so I don't mess up my main index). However, when the data is copied, some of the date fields are changed (they turn into epoch values) and the _time field isn't picked up correctly.
Current code:
(index=main sourcetype="sourcetype1") OR (index=sourceindex)
| eventstats count by deviceCustomDate1 fileName
| search count=1
| collect index=destinationindex sourcetype=sourcetype1
Then, if I look at the results for one of the records in both indexes, I get this:
_time                deviceCustomDate1            index
2019-05-16 23:47:21  2019/05/16 22:47:21 UTC      sourceindex
2019-05-17 00:03:29  1558046841000                destinationindex
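For what it's worth, the value 1558046841000 looks like the same timestamp expressed as epoch milliseconds: 1558046841 seconds is 2019-05-16 22:47:21 UTC. So the date itself seems to survive the copy and only its representation changes. A minimal sketch of how I checked this, using makeresults and strftime (the field names epoch_ms and as_text are just placeholders I made up, and strftime renders in the search user's time zone, so the displayed hour may differ from UTC):

```spl
| makeresults
| eval epoch_ms=1558046841000
| eval as_text=strftime(epoch_ms/1000, "%Y/%m/%d %H:%M:%S %Z")
| table epoch_ms as_text
```

I assume the same eval could be applied to deviceCustomDate1 before the collect step, but I haven't confirmed that this is the right approach, and it wouldn't explain the _time difference.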
Am I missing something? Is Collect the right tool to use?
Thanks in advance.