Getting Data In

Job-based log file processing

levinro
Engager

Newbie here - Just evaluating Splunk.

I set up my source to watch a directory and assigned my source type by filtering on file name pattern. This all works great; it parsed out errors and messages nicely with no effort.

Now I want to take it a step further. When each file arrives and is closed (the discrete job finished), I want to evaluate things like run time (time between last & first time stamp in the log file) and record counts (a string pulled from a specific message in the log file). Then I want to create trending reports on these values on a daily/weekly/monthly basis.


jtacy
Builder

Using metadata is probably the fastest way to get the start/end timestamps for each source. You can then get the difference in seconds pretty easily. You'll need to have some kind of search that returns the record count for each file and that might involve using rex to pull that field out. That field extraction should happen in the subsearch part of the example below:

| metadata type=sources index=yourindex | eval duration=lastTime-firstTime | join source [search index=yourindex recordcount=* | top 1 recordcount by source] | table source, duration, recordcount
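For the record count, the subsearch would need a rex matched to whatever your log message actually looks like. As a rough sketch, if the message were something like "Processed 12345 records" (yours will differ), the subsearch might become:

[search index=yourindex "Processed" | rex "Processed (?<recordcount>\d+) records" | top 1 recordcount by source]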

If you're going to have a lot of events, optimizing that subsearch will be crucial when you start getting into those weekly/monthly groupings. There will be some stuff to add to turn this into a chart, but I'd probably start with the field extraction and getting the duration and record count into a table and go from there. Good luck!
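Once the duration and record count are coming back cleanly, one rough way to get a daily trend might be to feed the durations through timechart, something like the sketch below (using firstTime as the bucket key is just one option; swap span=1w or span=1mon for weekly or monthly groupings):

| metadata type=sources index=yourindex | eval duration=lastTime-firstTime | eval _time=firstTime | timechart span=1d avg(duration)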


lukejadamec
Super Champion

Individual log files are generally treated as a Source; I'm not sure if you can do a transaction on a Source.
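If it does work, a rough sketch might look something like this (transaction produces the duration and eventcount fields on its own):

index=yourindex | transaction source | table source, duration, eventcount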
