Getting Data In

Exporting only dedup'd entries?

Davvvem
Engager

Hi All,

I've searched quite a lot but can't find a good method to get this workflow to work.

I've got a Python script in Splunk that returns JSON, and a dashboard that shows the results in a table.

The script runs daily and imports new entries each time.

I want to export the new table values as a .csv from Splunk and ensure I'm not exporting duplicates or entries that I've exported previously.

At the moment my thinking is that if I can tag entries with an import date, I can filter out previous days' imports.
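To make the import-date idea concrete, here is a minimal plain-Python sketch (not Splunk-specific). It assumes the daily script can stamp each entry before handing it on; the field name `import_date` and both helper names are hypothetical, chosen only for illustration.

```python
from datetime import date


def tag_with_import_date(entries, today=None):
    """Return copies of the entries with a hypothetical 'import_date' field added."""
    stamp = (today or date.today()).isoformat()
    return [{**entry, "import_date": stamp} for entry in entries]


def only_imported_on(entries, day):
    """Keep only the entries whose 'import_date' matches the given day."""
    stamp = day.isoformat()
    return [entry for entry in entries if entry.get("import_date") == stamp]
```

Note that filtering by date alone would not catch an entry that is imported again on a later day, so it probably still needs a dedup step on top.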

Is there any documentation, or are there suggestions, on how I can dedup new entries and then export only the new, unique ones?


skalliger
Motivator

Here's what I would suggest.

Run your query once with the outputcsv command to save your data in the form you want. Then modify your search so that it also reads that CSV file back in with inputcsv. Once both sets of data are in the results, run dedup on the relevant fields; after that, it is safe to run outputcsv again. You can exclude unnecessary fields as described in the outputcsv documentation.
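The logic behind that round trip can be sketched in plain Python. This is only an illustration of the read-back, dedup, write-out idea, not the SPL commands themselves: the function name `export_new_rows`, the file path, and the choice of key fields are all assumptions made for the example.

```python
import csv
import os


def export_new_rows(rows, csv_path, key_fields):
    """Append rows not already present in csv_path; return the rows that were added.

    Rows are compared on key_fields only (hypothetical choice, e.g. ["id"]).
    """
    seen = set()
    fieldnames = None
    file_exists = os.path.exists(csv_path) and os.path.getsize(csv_path) > 0

    # Read back what was exported before (the "inputcsv" step of the idea).
    if file_exists:
        with open(csv_path, newline="") as f:
            reader = csv.DictReader(f)
            fieldnames = reader.fieldnames
            for old in reader:
                seen.add(tuple(old.get(k) or "" for k in key_fields))

    # Drop anything already exported, and duplicates within this batch ("dedup").
    new_rows = []
    for row in rows:
        key = tuple(str(row.get(k, "")) for k in key_fields)
        if key not in seen:
            seen.add(key)
            new_rows.append(row)

    # Write only the new, unique rows back out (the "outputcsv" step).
    if new_rows:
        if fieldnames is None:
            fieldnames = list(new_rows[0])
        with open(csv_path, "a", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=fieldnames, extrasaction="ignore")
            if not file_exists:
                writer.writeheader()
            writer.writerows(new_rows)
    return new_rows
```

The returned list holds just the entries that have not been exported before, while the CSV file accumulates everything exported so far and serves as the reference for the next run.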

If you need further assistance, I'd need to see an example of your search.

Skall
