Getting Data In

How to analyse the latest version of a growing CSV?

HeinzWaescher
Motivator

Hi,

I want to import a growing .csv file every week, so there will be duplicate events. In the report I only want to analyse the latest version of the CSV, i.e. the latest dataset.
My first thought is to filter on the latest _indextime:

my base search
| eventstats max(_indextime) AS max_indextime
| where _indextime=max_indextime

But I'm not sure whether all events from one import will always have the same _indextime. Or can the _indextime vary within an import for large CSV files?

Thanks in advance
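
On the question of whether _indextime is identical across one import: index time is assigned as each event is indexed, so a large file can plausibly span more than one second, and an exact match on max(_indextime) may then keep only part of the import. A minimal sketch of a more tolerant filter follows, assuming each weekly import completes within a single calendar day. The field names import_day and latest_import_day are made up for illustration.

```
my base search
| eval import_day=relative_time(_indextime, "@d")
| eventstats max(import_day) AS latest_import_day
| where import_day=latest_import_day
```

Snapping to the day with "@d" is only one choice. Any bucket size works as long as two imports never land in the same bucket and a single import never straddles a bucket boundary (an import running across midnight would be split here).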

1 Solution

gcusello
SplunkTrust

Hi HeinzWaescher,
let me make sure I understand:
you import events from a CSV at regular intervals (e.g. once a day) into an index, and then you need to use only the latest imported version. Is this correct?
You could:

  • create an empty lookup called e.g. my_lookup.csv;
  • index the new version of the CSV using the current time as the event timestamp;
  • then schedule (e.g. one hour later) a search like the following to populate the lookup, which you then use in your subsequent searches: index=my_index sourcetype=my_sourcetype earliest=-2h latest=now | table field1, field2, .... | outputlookup my_lookup.csv

In this way you have only the latest information you need.
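
The scheduled search from the last step could be laid out as below. This is a sketch using the placeholder names from the post (my_index, my_sourcetype, my_lookup.csv, field1, field2). The two-hour window assumes the import has finished before the search runs and that events are timestamped with the current time at indexing, as the second step describes.

```
index=my_index sourcetype=my_sourcetype earliest=-2h latest=now
| table field1, field2
| outputlookup my_lookup.csv
```

By default outputlookup replaces the lookup's contents, which is what keeps only the latest import. Reports can then read the lookup instead of the index, for example:

```
| inputlookup my_lookup.csv
| stats count BY field1
```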

Bye.
Giuseppe


niketn
Legend

Can you share header and event for your CSV file? Also when the CSV file grows over time, does the filename(source) change?

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
