Best way to monitor and index millions of files in Splunk

raja21
Explorer

Hi developers, I am trying to analyse some logs by extracting them in JSON format and feeding them to Splunk.
I have millions of these logs, each resulting in a JSON file of 4-5 KB.
How can I monitor these files effectively so that Splunk picks up each file?

Thanks.


ddrillic
Ultra Champion

A major issue can be the ulimit for open files. Please read the great post by @yannk: "how to tune ulimit on my server?"
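As a quick way to see what the open-files limit currently is, here is a minimal Python sketch using the standard `resource` module (Unix only). Note the assumption: it reports the limits of the process that runs the script, so it only approximates what Splunk sees if you run it as the same user and in the same kind of session that starts Splunk.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) limit on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


soft, hard = open_file_limits()
# A value equal to resource.RLIM_INFINITY means "unlimited".
print(f"open files: soft={soft} hard={hard}")
```

If the soft limit is low compared to the number of files being monitored at once, that is a hint to look at the tuning post mentioned above.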


FrankVl
Ultra Champion

I see 2 main options:

  1. Put a Universal Forwarder on the system that is storing these logs and create a monitor input for the respective folder.
  2. If you're using some kind of script to extract those logs, you could modify that script to send the JSON data in an HTTP POST request to a Splunk Heavy Forwarder / Indexer set up as an HTTP Event Collector: http://docs.splunk.com/Documentation/Splunk/latest/Data/AboutHEC

I don't have experience with such huge numbers of files myself, but unless you get some specific recommendations here, I'd suggest just giving it a try (ideally in a test setup, of course) and seeing what issues you run into. You can always post back here to get help resolving those issues.
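To illustrate option 2, here is a rough sketch of what the extraction script could do for a single JSON file, assuming the script is written in Python. The host name, token, and the choice of the `_json` sourcetype are placeholders and assumptions, not values from this thread; the `/services/collector/event` endpoint and the `Authorization: Splunk <token>` header are the standard HEC conventions.

```python
import json
import urllib.request

# Hypothetical values: replace with your own HEC host and token.
HEC_URL = "https://splunk.example.com:8088/services/collector/event"
HEC_TOKEN = "00000000-0000-0000-0000-000000000000"


def build_hec_request(record, url=HEC_URL, token=HEC_TOKEN):
    """Wrap one parsed JSON record in the HEC event envelope."""
    # "_json" is an assumed sourcetype; use whatever fits your data.
    body = json.dumps({"event": record, "sourcetype": "_json"}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Splunk {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_file(path):
    """Read one JSON file and POST it to HEC (performs network I/O)."""
    with open(path, encoding="utf-8") as fh:
        record = json.load(fh)
    with urllib.request.urlopen(build_hec_request(record), timeout=10) as resp:
        return resp.status
```

Calling `send_file("some_log.json")` would send one event per file; the file name here is only an example.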


raja21
Explorer

Hi @FrankVl, I tried the HTTP Event Collector method and found it useful.

Now the issue is that I have to run a curl command for each file. I get millions of files to process every day, so would running curl that many times add too much overhead?

I also had the idea of merging all the JSON records into one file, separated by [EOF], sending that file to Splunk, and breaking events on [EOF].
But it is not getting ingested into Splunk, because [EOF] is not valid JSON.

Any other solutions?


FrankVl
Ultra Champion

I don't think curl should add too much overhead, but you should be able to see for yourself whether it causes problematic load.

As for your other idea: I don't completely follow what you tried and what is failing.
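If per-file requests do turn out to be too expensive, one possible alternative is to batch: HEC accepts several event objects stacked back to back in a single request body, so no [EOF]-style separator is needed. Below is a minimal Python sketch of building such bodies. The function names and the batch size of 500 are assumptions for illustration only, and each resulting body would still need to be POSTed to the HEC event endpoint with your token.

```python
import json
from pathlib import Path


def load_records(directory):
    """Parse every *.json file in a directory, one record per file."""
    for path in sorted(Path(directory).glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            yield json.load(fh)


def batch_payloads(records, max_events=500):
    """Yield request bodies, each holding up to max_events stacked HEC events."""
    chunk = []
    for record in records:
        # Each record gets its own {"event": ...} envelope.
        chunk.append(json.dumps({"event": record}))
        if len(chunk) == max_events:
            yield "".join(chunk).encode("utf-8")
            chunk = []
    if chunk:
        # Flush the final, possibly smaller, batch.
        yield "".join(chunk).encode("utf-8")
```

For example, `batch_payloads(load_records("/path/to/json"), max_events=500)` would turn a million small files into roughly two thousand requests instead of a million; the directory path is a placeholder.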
