Getting Data In

Is there a way to prioritize inputs?

pdominicb
Explorer

I am about to have a few UFs monitoring some extremely high-volume logs. These high-volume logs are less critical than some of the low-volume logs we're already monitoring. It's acceptable for the new high-volume logs to be delayed, but we need the current critical ones in as close to real time as possible.

We're already looking at setting maxKBps = 0 or increasing the number of concurrent pipelines, but we have concerns about resource consumption. We'd rather not add extra CPUs just for logging.

So, I am wondering if there is any way to set some inputs to a higher priority than others. A few ideas I had are:

  1. Use TCPOUT routing and set maxKBps per destination. But maxKBps is global, so that won't work.
  2. Raise concurrent pipelines on the UF and prioritize each pipeline somehow. For example, one pipeline is guaranteed 80% of the load, while another pipeline is only allowed 20% of the load. Then specify the pipeline to use per input. But there doesn't seem to be a way to say one pipeline is prioritized over another. 
  3. Install two UFs on the servers. Port conflicts... seems horrible. 
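
For reference, idea 1 would look roughly like the sketch below. The log paths, group names, and host names are hypothetical; `_TCP_ROUTING` and the `[tcpout:<group>]` stanzas are standard settings. It only separates destinations, since throughput is still governed by the single global limit.

```ini
# inputs.conf (sketch; paths and group names are made up for illustration)
[monitor:///var/log/critical/app.log]
_TCP_ROUTING = critical_indexers

[monitor:///var/log/bulk/highvolume.log]
_TCP_ROUTING = bulk_indexers

# outputs.conf (sketch; host names are placeholders)
[tcpout:critical_indexers]
server = idx-critical.example.com:9997

[tcpout:bulk_indexers]
server = idx-bulk.example.com:9997
```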

Any ideas here?


jo3ccitovvm
New Member
This is such a common problem with high-volume logging — we’ve dealt with the exact same thing!
 
What’s worked for us is using two separate inputs.conf stanzas with different queue and pipeline settings. We assign the critical logs to a pipeline with higher priority and larger queueSize, while the high-volume non-critical logs go to a separate pipeline with lower concurrency and a smaller queue. It keeps the critical stuff moving in near-real-time without starving the system.
 
Also, using maxKBps on the high-volume inputs to cap their throughput really helps prevent them from overwhelming the forwarder.

PickleRick
SplunkTrust

Sorry to say this, but I have no idea what LLM you pulled this one from.

Splunk has no way of "assigning logs to a specific pipeline". There are no per-pipeline settings if you have more than one pipeline, and there is no maxKBps setting at the input level. There is just one global throughput setting, maxKBps, set in limits.conf.
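
To illustrate, the throughput cap is a single setting in the `[thruput]` stanza of limits.conf. A minimal sketch (the value shown is only an example):

```ini
# limits.conf on the forwarder
[thruput]
# One limit for the whole forwarder; there is no per-input variant.
# 0 means unlimited.
maxKBps = 0
```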

 

PickleRick
SplunkTrust

Generally speaking, no. There is no way to prioritize inputs. And yes, this can sometimes have an impact on UFs. I once had a strange setup with a UF checking a huge number of files on network shares. Every time the UF was restarted, it needed about an hour to catch up with the state of all the monitored files. As far as I remember, it even delayed ingestion of the forwarder's own internal events. That was very wrong and luckily has been fixed since. But it shows that you can't prioritize inputs against each other.

 

isoutamo
SplunkTrust

If I have understood correctly how the Splunk UF handles this, and you have only those two high-volume logs, then you can (probably) try to prioritise by adding pipelines. I'm not 100% sure about this, but my understanding is that normally one file is read to the end and only then does the UF switch to another. If that is true, the high-volume files could each occupy a dedicated reader while the other files share the additional ones. But remember that you cannot add too many pipelines per node!

As I said, I haven't been able to test this in real life myself. There have been cases where I added pipelines, e.g. on a HF (DBX), to get it working correctly, and I expect the same would work on a UF too.

I don't believe that this is an "approved by Splunk" solution 😉
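
For anyone who wants to try this, the pipeline count is set in server.conf. A minimal sketch, assuming two pipelines are enough. The setting only changes how many pipeline sets exist; it offers no way to pin an input to a pipeline or to weight one pipeline over another.

```ini
# server.conf on the forwarder
[general]
# Default is 1. Each extra pipeline set uses additional CPU and memory.
parallelIngestionPipelines = 2
```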


PickleRick
SplunkTrust

I haven't touched it in ages, but I was pretty sure the inputs were independent of the processing queues.


isoutamo
SplunkTrust

Yes, those are independent. But since one file is actively read until all its events are handled, one big active file doesn't block the traffic: other pipelines are available to process other files instead of waiting until that big high-volume file has been read.


PickleRick
SplunkTrust

I'm not sure it works like that. If inputs were tied to particular queues, scaling processing pipelines would have no effect on a single input. And as far as I remember, I had multi-pipeline UFs with lagging inputs, and the extra pipelines didn't help much.


livehybrid
SplunkTrust

Hi @pdominicb 

The only thing that comes to my mind is the maxKBps setting in limits.conf, which you've mentioned too. Yes, this is global, so I think the only way to control the limit per input is to run two UFs on the same server. This is possible, but you would need to change the clashing ports. That shouldn't be a big deal, as the UF only listens on port 8089 (mgmt) plus any input ports you have configured, so you could set your second UF installation to listen on port 8090 (for example).
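
A rough sketch of the second instance's settings, assuming it is installed under its own directory. The path and the throughput number below are hypothetical; `mgmtHostPort` and `maxKBps` are the standard setting names.

```ini
# /opt/splunkforwarder_bulk/etc/system/local/web.conf  (second UF, hypothetical path)
[settings]
# Move the management port off the default 8089 used by the first UF.
mgmtHostPort = 127.0.0.1:8090

# /opt/splunkforwarder_bulk/etc/system/local/limits.conf  (second UF)
[thruput]
# Cap only the high-volume instance; example value, tune as needed.
maxKBps = 512
```

The first UF, which carries the critical logs, could then set `maxKBps = 0` in its own limits.conf.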
