Reporting

Running outputcsv without append option

keerthana_k
Communicator

Hi

We are running an outputcsv command at hourly intervals through a Python script. We have not specified the append option in the query. I would like to know the expected behavior of Splunk: will the csv file be overwritten every hour? Will the headers alone be retained? Please clarify.

Thanks in Advance.

1 Solution

sideview
SplunkTrust

The given csv will be overwritten every time outputcsv runs, headers and all. The new headers will simply match the fields in the new results.

It is the same for the outputlookup command.

Also, in a lot of real-world use cases, using the append flag on the outputcsv command itself can produce a lot of duplicate rows. It can be better to do the appending yourself, with a little search language to remove the duplicates as appropriate.

Here is a simple example, where the csv has a primary key called 'user' and simply maps each user to a "group" field.

<search terms to get the "new" rows mapping users to groups>
| stats last(group) as group by user
| append [| inputcsv mycsv]
| stats first(group) as group by user
| outputcsv mycsv

As you can see, each time the search runs it will update the rows for users whose group has changed, but not duplicate them. Obviously it's a very simple example with only two fields, but with a little more attention to the stats commands you can use the same technique with wider csvs. Note that it's better to put the inputcsv command inside the append subsearch; if you put the actual search in the append instead, you increase the chances of hitting the append command's limits on execution time or number of rows.
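Since the search is already driven from a Python script, the same merge-and-dedup logic can be sketched outside Splunk as well. This is a hypothetical helper (merge_rows is not a Splunk API): it loads the existing csv, lets the new rows win per user, and rewrites the whole file, mirroring what the stats/append search above does.

```python
import csv
import os

def merge_rows(csv_path, new_rows):
    """Merge new user->group rows into an existing csv, keeping the
    newest group per user. Hypothetical sketch of the dedup technique
    in the search above, not a Splunk API."""
    merged = {}
    # Load the existing rows first (analogous to | append [| inputcsv mycsv]).
    if os.path.exists(csv_path):
        with open(csv_path, newline="") as f:
            for row in csv.DictReader(f):
                merged[row["user"]] = row["group"]
    # New rows overwrite any stale mapping (analogous to stats first(group)
    # keeping the new results, which come first in the search pipeline).
    for row in new_rows:
        merged[row["user"]] = row["group"]
    # Rewrite the whole file, headers and all (analogous to outputcsv).
    with open(csv_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["user", "group"])
        writer.writeheader()
        for user, group in merged.items():
            writer.writerow({"user": user, "group": group})
```

Run hourly, this keeps exactly one row per user with that user's latest group, no matter how many times the script fires.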


