Getting Data In

How analyse the latest version of a growing csv?

HeinzWaescher
Motivator

Hi,

I want to import a growing .csv every week, so there will be duplicate events. In the report I only want to analyse the latest version of the csv/the latest dataset.
My first thought is to filter the latest indextime

my base search
| eventstats max(_indextime) AS max_indextime
| where _indextime=max_indextime

But I'm not sure whether the imported events will always have the same indextime per import. Or can the indextime vary for large csv files?

Thanks in advance

0 Karma
1 Solution

gcusello
SplunkTrust
SplunkTrust

Hi HeinzWaescher,
let me better understand:
you import events from a csv every period (e.g. one day) in an index and then you need to use the latest imported version, is this correct'?
You could:

  • create an empty lookup called e.g. my_lookup.csv;
  • index the new version of csv using the current time;
  • then schedule (e.g. after one hour) a search like the following to populate the lookup to use in the following searches: index=my_index sourcetype=my_sourcetype earlieat=-2h latest=now | table field1, field2, .... | outputlookup my_lookup.csv

In this way you have only the latest information you need.

Bye.
Giuseppe

View solution in original post

0 Karma

niketn
Legend

Can you share header and event for your CSV file? Also when the CSV file grows over time, does the filename(source) change?

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
0 Karma

gcusello
SplunkTrust
SplunkTrust

Hi HeinzWaescher,
let me better understand:
you import events from a csv every period (e.g. one day) in an index and then you need to use the latest imported version, is this correct'?
You could:

  • create an empty lookup called e.g. my_lookup.csv;
  • index the new version of csv using the current time;
  • then schedule (e.g. after one hour) a search like the following to populate the lookup to use in the following searches: index=my_index sourcetype=my_sourcetype earlieat=-2h latest=now | table field1, field2, .... | outputlookup my_lookup.csv

In this way you have only the latest information you need.

Bye.
Giuseppe

0 Karma
Get Updates on the Splunk Community!

Announcing Scheduled Export GA for Dashboard Studio

We're excited to announce the general availability of Scheduled Export for Dashboard Studio. Starting in ...

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics GA in US-AWS!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...