Getting Data In

How to exclude duplicate data while onboarding data in the scenario below

vikram1583
Explorer

I have a Python script that runs daily and saves its output to a CSV file.

For example, if I run the script today, it gets the data from April 1st through today's date (04/21/2021). If I run it tomorrow, it gets the data from April 1st through tomorrow's date (04/22/2021). The output file has a different name on every run.

I want to onboard this data into Splunk without duplicate data.

How can we do that?

We have a field called start_time, which we use as the time field.

Example: start_time field value = 04/21/2021 10.30

Example: start_time field value = 04/22/2021 10.30

 

Thanks in advance

 


venkatasri
SplunkTrust

Hi,

Splunk has built-in protection against re-indexing data it has already read from a monitored file. Have you configured the monitor inputs yet? If so, please share your inputs.conf and some sample data files.

 


venkatasri
SplunkTrust

Hi @vikram1583 

What does the data look like in both files? Does it change every time the script runs?

Alternatively, index both files and remove duplicates at search time with Splunk's dedup command or the dc (distinct count) function, depending on your use case.

----------------------------------------------

An upvote would be appreciated if it helps!


vikram1583
Explorer

Hi @venkatasri, thanks for your response. It's not just about two files. I will run the script every day, and if I ingest those files every day, license usage will increase. So I only want to ingest the new data.


vikram1583
Explorer

The data for previous dates stays the same; each run only adds new data for the current date.
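
Since each run only adds rows for the current date, one possible approach is to have the script itself write only the rows it has not exported before, so that Splunk never sees the older rows again. The following is a minimal sketch, not a tested production solution. It assumes the rows are available as a list of dicts with a start_time key in the format shown above (MM/DD/YYYY HH.MM), and it uses a hypothetical checkpoint file to remember the newest start_time already exported. The names write_new_rows, parse_start_time, and the checkpoint path are illustrative only.

```python
import csv
from datetime import datetime
from pathlib import Path

# Assumed format, inferred from the example value "04/21/2021 10.30".
TIME_FORMAT = "%m/%d/%Y %H.%M"


def parse_start_time(value):
    """Parse a start_time string such as '04/21/2021 10.30'."""
    return datetime.strptime(value.strip(), TIME_FORMAT)


def write_new_rows(rows, out_path, checkpoint_path):
    """Write only rows newer than the checkpoint; return how many were written.

    rows: list of dicts that each contain a 'start_time' key.
    checkpoint_path: hypothetical text file holding the newest start_time
    exported so far (it does not exist before the first run).
    """
    checkpoint_path = Path(checkpoint_path)
    last_seen = None
    if checkpoint_path.exists():
        text = checkpoint_path.read_text().strip()
        if text:
            last_seen = parse_start_time(text)

    # Keep everything on the first run, otherwise only strictly newer rows.
    new_rows = [
        row for row in rows
        if last_seen is None or parse_start_time(row["start_time"]) > last_seen
    ]
    if not new_rows:
        return 0  # nothing new: write no file and leave the checkpoint alone

    with open(out_path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(new_rows[0].keys()))
        writer.writeheader()
        writer.writerows(new_rows)

    # Remember the newest start_time so the next run skips these rows.
    newest = max(parse_start_time(row["start_time"]) for row in new_rows)
    checkpoint_path.write_text(newest.strftime(TIME_FORMAT))
    return len(new_rows)
```

With this in place, the monitored directory would only ever receive files containing new rows. Note that the strict "newer than" comparison assumes no additional rows arrive later with a start_time equal to the checkpoint value; if that can happen, the filter would need a different key (for example a unique record ID, if the data has one).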
