How to analyse the latest version of a growing CSV?

HeinzWaescher
Motivator

Hi,

I want to import a growing .csv every week, so there will be duplicate events. In the report I only want to analyse the latest version of the CSV, i.e. the latest dataset.
My first thought is to filter on the latest indextime:

my base search
| eventstats max(_indextime) AS max_indextime
| where _indextime=max_indextime

But I'm not sure whether the imported events will always have the same indextime per import. Or can the indextime vary for large csv files?
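
One way to check this on real data is to measure how far apart the index times within a single import are. _indextime is stored in whole epoch seconds, and a large file can take longer than one second to index, so the events of one import are not guaranteed to share a single value. The sketch below assumes at most one import per day that does not straddle midnight, so grouping by the day of _indextime separates the imports. index=my_index and sourcetype=my_sourcetype are placeholders, and import_day, first_indexed, last_indexed and spread_seconds are names made up for this illustration.

```spl
index=my_index sourcetype=my_sourcetype
| eval import_day=strftime(_indextime, "%Y-%m-%d")
| stats count min(_indextime) AS first_indexed max(_indextime) AS last_indexed BY import_day
| eval spread_seconds = last_indexed - first_indexed
```

If spread_seconds is greater than 0 for any import, an exact match on the maximum indextime would keep only part of the latest import. A possible workaround is a tolerance window. This assumes each import finishes within one hour and that imports are more than one hour apart:

```spl
index=my_index sourcetype=my_sourcetype
| eventstats max(_indextime) AS max_indextime
| where _indextime >= max_indextime - 3600
```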

Thanks in advance

gcusello
SplunkTrust

Hi HeinzWaescher,
let me make sure I understand:
you import events from a CSV into an index every period (e.g. one day), and then you need to use only the latest imported version. Is this correct?
You could:

  • create an empty lookup called e.g. my_lookup.csv;
  • index the new version of the CSV using the current time;
  • then schedule a search (e.g. one hour later) like the following to populate the lookup, which you can then use in your subsequent searches:
    index=my_index sourcetype=my_sourcetype earliest=-2h latest=now | table field1, field2, .... | outputlookup my_lookup.csv

In this way you have only the latest information you need.
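
Laid out one command per line, the scheduled search from the last bullet might look like the sketch below. The index, sourcetype, lookup and field names are the placeholders used above, and the field list has to be replaced with the real columns of the CSV. The two-hour window only works if the events are timestamped with the current time at indexing (the second bullet), because earliest and latest filter on _time, not on _indextime. By default outputlookup overwrites the lookup file, so each run replaces the previous import's contents.

```spl
index=my_index sourcetype=my_sourcetype earliest=-2h latest=now
| table field1, field2
| outputlookup my_lookup.csv
```

Reports can then read the latest dataset directly from the lookup instead of searching the index:

```spl
| inputlookup my_lookup.csv
```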

Bye.
Giuseppe

niketn
Legend

Can you share the header and a sample event from your CSV file? Also, when the CSV file grows over time, does the filename (source) change?

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"