Running outputcsv without append option

keerthana_k
Communicator

Hi

We run an outputcsv command at hourly intervals through a Python script. We have not specified the append option in the query. What is the expected behavior of Splunk in this case? Will the csv file be overwritten every hour? Will the headers alone be retained? Please clarify.

Thanks in Advance.

1 Solution

sideview
SplunkTrust

The given csv will be overwritten every time outputcsv runs, headers and all. The new headers will simply match the fields in the new results.

It is the same for the outputlookup command.

Also, in a lot of real-world use cases, using the append flag on the outputcsv command itself can leave the csv with a lot of duplicate rows. As a result it can be better to do the appending separately, along with a little search language to remove the duplicates as appropriate.
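
To make the difference concrete, here is a toy Python model of the two behaviors described above. It is only a sketch and not how Splunk is implemented: a csv is modelled as a list of row dicts, and the function name output_rows and the sample users are made up for illustration.

```python
# Toy model only: a "csv" is a list of row dicts. This is NOT Splunk code.

def output_rows(existing, new_rows, append=False):
    """Return the csv contents after one run, per the behavior described above."""
    if append:
        # Appending keeps the old rows and adds the new ones after them.
        return existing + new_rows
    # Without append, the old contents are discarded entirely.
    return list(new_rows)

# Hypothetical results from two consecutive hourly runs.
hour1 = [{"user": "alice", "group": "admins"}, {"user": "bob", "group": "ops"}]
hour2 = [{"user": "alice", "group": "admins"}, {"user": "carol", "group": "ops"}]

overwritten = output_rows(output_rows([], hour1), hour2)
appended = output_rows(output_rows([], hour1, append=True), hour2, append=True)

print(overwritten)  # only the rows from the second run
print(appended)     # rows from both runs, with alice listed twice
```

In this model the overwritten file holds only the second run's rows, while the appended file keeps growing and repeats any user that shows up in more than one run.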

Here is a simple example, where the csv has a primary key called 'user', and the csv is just mapping each user to a "group" field.

<search terms to get the "new" rows mapping users to groups> | stats last(group) as group by user | append [| inputcsv mycsv] | stats first(group) as group by user | outputcsv mycsv

As you can see, each time the search runs it will update the users whose group has changed, but not duplicate them. This works because the new rows come before the rows appended from the csv, so the final stats first(group) keeps the new value for any user that appears in both. Obviously it's a very simple example with only two fields, but with a little more attention to the stats commands you can use the same technique on csvs with more fields. Note that it's better to put the inputcsv command in the append; if you put the actual search in the append you may increase the chances of hitting limits in append concerning execution time or number of rows.
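
If it helps to see the merge logic outside of the search language, here is a rough Python sketch of the same idea. It is an illustration under assumptions, not Splunk's implementation: the function name merge_keep_first and the sample rows are hypothetical, and sorting by the key is done here only to give a stable output order.

```python
def merge_keep_first(new_rows, csv_rows, key="user"):
    """Combine new rows with existing csv rows, keeping the first row seen per key.

    New rows are placed first, so they win over older rows for the same key.
    """
    merged = {}
    for row in new_rows + csv_rows:
        # setdefault only stores the row if this key has not been seen yet.
        merged.setdefault(row[key], row)
    # Sort by key so the output order is predictable.
    return [merged[k] for k in sorted(merged)]

# Hypothetical existing csv contents and freshly found rows.
existing_csv = [{"user": "alice", "group": "admins"}, {"user": "bob", "group": "ops"}]
new_rows = [{"user": "bob", "group": "security"}, {"user": "carol", "group": "ops"}]

updated_csv = merge_keep_first(new_rows, existing_csv)
for row in updated_csv:
    print(row["user"], row["group"])
```

Here bob's group is updated to the new value, carol is added, alice is left untouched, and no user appears twice.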
