How to analyse the latest version of a growing CSV?

HeinzWaescher
Motivator

Hi,

I want to import a growing .csv file every week, so there will be duplicate events. In the report I only want to analyse the latest version of the CSV, i.e. the latest dataset.
My first thought is to filter on the latest _indextime:

my base search
| eventstats max(_indextime) AS max_indextime
| where _indextime=max_indextime

But I'm not sure whether the imported events will always have the same indextime per import. Or can the indextime vary for large csv files?
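
If the index time does vary within one import, a coarser comparison might still work. A minimal sketch, assuming each import finishes within a single calendar day and consecutive imports are at least a day apart (import_day and latest_import_day are field names made up for illustration):

```spl
my base search
| eval import_day=relative_time(_indextime, "@d")
| eventstats max(import_day) AS latest_import_day
| where import_day=latest_import_day
```

Here relative_time snaps each event's _indextime back to midnight of that day, so all events indexed on the same day compare as equal. An import that runs across midnight would still be split, so this only helps under the stated assumptions.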

Thanks in advance


gcusello
SplunkTrust

Hi HeinzWaescher,
Let me make sure I understand: you import events from a CSV into an index at regular intervals (e.g. once a day) and then need to use only the latest imported version. Is that correct?
You could:

  • create an empty lookup called e.g. my_lookup.csv;
  • index the new version of the CSV using the current time;
  • then schedule a search (e.g. one hour later) like the following to populate the lookup, which you can then use in subsequent searches: index=my_index sourcetype=my_sourcetype earliest=-2h latest=now | table field1, field2, ... | outputlookup my_lookup.csv

In this way you have only the latest information you need.
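
Written out one command per line, the scheduled search could look like the sketch below. The index, sourcetype, field and lookup names are the placeholders from the steps above, and the two-hour window assumes the events were given the import time as their timestamp, since earliest and latest filter on _time:

```spl
index=my_index sourcetype=my_sourcetype earliest=-2h latest=now
| table field1, field2
| outputlookup my_lookup.csv
```

Reports can then read only the latest dataset from the lookup:

```spl
| inputlookup my_lookup.csv
```

Because outputlookup overwrites the lookup file by default, each scheduled run replaces the rows from the previous import instead of adding to them.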

Bye.
Giuseppe


niketn
Legend

Can you share the header and a sample event from your CSV file? Also, when the CSV file grows over time, does the filename (source) change?

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
