Getting Data In

Routing events is creating a bottleneck for ingestion. How do I resolve this issue?

bsrikanthreddy5
Path Finder

On our HF we have routing rules in transforms.conf that take a long time to evaluate and are creating a bottleneck for us.
We currently have the following number of routing rules:
~2000 entries for index routing
~200 entries for sourcetype routing

Can you please suggest ways to route the events faster and more efficiently?


Sample from transforms.conf:

[route_sentinel_to_index]
INGEST_EVAL = index:=case(\
match(_raw, "\"TENANT\":\"xxxxxx-b589-c11a968d4876\""), "nacoak_mil", \
.
.
.<1997 entries>
.
.
match(_raw, "\"EVENT_TIME\":\"\d{13}\""), "unknown_events", \
true(), "unknownsentinel")


[apply_sourcetype_to_sentinel]
INGEST_EVAL = sourcetype:=case(\
match(_raw, "\"SYSTEM\":\"xxxx-b3a7-xxxxxx\""), "cs:fhir:prod:audit:json", \
match(_raw, "\"SYSTEM\":\"xxxxxxx-d424c20xxxx\""), "cs:railsapp_server:ambulatory:audit:json", \
.
.<198 entries>
.
true(), "cs:sentinel:unknown:audit:json")


richgalloway
SplunkTrust

Why such large case statements?  Can the assignment of index and sourcetype be moved to inputs.conf?

If not, then can you add more HFs and partition the input among them?

---
If this reply helps you, Karma would be appreciated.

bsrikanthreddy5
Path Finder

It was written this way originally to handle a few case statements, but the list keeps growing.
We have set up a batch input to read the files on the UF, and each event is evaluated against the case statement so that it is ingested into the right index with the right sourcetype.
Do you mean moving this to inputs.conf on the UF? Wouldn't that make for a huge inputs.conf?

We have 5 UFs and 2 HFs. When I saw the issue I added 2 more HFs, so now we have 5 UFs and 4 HFs, and even then it is not ingesting fast enough.


richgalloway
SplunkTrust

Having UFs specify the index and sourcetype of each input is standard practice.  Will it make for huge inputs.conf files?  Maybe.  I don't know what your current inputs.conf looks like, but there's little harm in having large ones.  The UF will be monitoring the same files, anyway, only now a lot of work will be shifted off the HFs.
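As a rough illustration of that practice, each input on the UF can carry its own index and sourcetype. This is only a sketch: the per-tenant directories are hypothetical (they assume the files could be separated by tenant before the UF reads them), and the index and sourcetype values are simply reused from the transforms.conf sample earlier in the thread.

```
# inputs.conf on the UF (sketch; directory names are hypothetical)
[batch:///data/tenant_a/process]
index = nacoak_mil
sourcetype = cs:fhir:prod:audit:json
move_policy = sinkhole
whitelist = \.rd$

[batch:///data/tenant_b/process]
# tenant_b_index is a placeholder, not an index from this thread
index = tenant_b_index
sourcetype = cs:railsapp_server:ambulatory:audit:json
move_policy = sinkhole
whitelist = \.rd$
```

With a layout like this the HF would not need to inspect _raw at all, because the metadata is already set when the event leaves the UF.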


bsrikanthreddy5
Path Finder

Below is the inputs.conf on the UF:
[batch:///flume/rollingfiles/process]
_TCP_ROUTING = prod-a_hf
crcSalt = <SOURCE>
disabled = false
index = unknownsentinel
move_policy = sinkhole
recursive = false
sourcetype = cs:sentinel:unknown:audit:json
whitelist = \.rd$

The UF will not do any transformation of events, so they have to go through the HF / indexer, which would be the same issue we are seeing now, right? Please let me know if I am missing anything.

The way it is set up now is:
UF ---> HF ---> Indexer cluster

UF --> batch input to read the files, each containing multiple events
HF --> routes each event, based on tenant ID, to a specific index and sourcetype
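For the UF ---> HF hop in this setup, adding HFs only helps if each UF actually spreads its events across them. A minimal sketch of how that might be configured follows. The host names are hypothetical, the output group name is taken from the _TCP_ROUTING setting in the inputs.conf above, and the event breaker assumes one event per line, which the thread does not confirm.

```
# outputs.conf on the UF (host names are hypothetical)
[tcpout:prod-a_hf]
server = hf1.example.com:9997, hf2.example.com:9997, hf3.example.com:9997, hf4.example.com:9997
autoLBFrequency = 30

# props.conf on the UF: lets the UF switch HFs at event boundaries
# instead of staying on one HF until the end of a large file.
# Assumes newline-delimited events.
[cs:sentinel:unknown:audit:json]
EVENT_BREAKER_ENABLE = true
EVENT_BREAKER = ([\r\n]+)
```

Without an event breaker, a UF reading a large batch file may keep sending to a single HF for the whole file, so the extra HFs may see little of the load.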

 


richgalloway
SplunkTrust

Hmm...  So all monitored files are in one place and it's up to the HF to figure out where everything goes?  That's a non-scalable solution.

Is there a faster way to decide where the data goes?  Perhaps by examining the host or source rather than the contents?

If that won't work and you don't want to add more HFs (maybe 4 isn't enough for the job) then consider installing Cribl.
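To illustrate routing on source rather than contents: if the tenant could be identified from the file path, a metadata-based transform would avoid scanning _raw. The sketch below assumes a hypothetical layout in which the tenant name appears in the source path; the stanza name, path fragment, and index name are all placeholders.

```
# transforms.conf on the HF (sketch; stanza, path fragment and index are hypothetical)
[route_tenant_a_by_source]
SOURCE_KEY = MetaData:Source
REGEX = /tenant_a/
DEST_KEY = _MetaData:Index
FORMAT = tenant_a_index
```

The regex here runs against the short source string instead of the full event body, which is where the saving would come from.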


bsrikanthreddy5
Path Finder

Yes, all the data is in one place, and the HF does the routing.
We are trying to see whether moving the routing to the indexers helps, and we are also looking for a scalable solution.
For now, the only way to route is to inspect the contents. I will add a couple more HFs and see if that helps. Thank you!
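If the contents must be inspected, one possible way to avoid evaluating up to ~2000 regular expressions per event is to extract the tenant ID once and resolve it through a CSV lookup at ingest time. The sketch below rests on several assumptions: each event is a JSON object with a top-level TENANT key, the Splunk version in use supports the json_extract() and lookup() eval functions in INGEST_EVAL, and tenant_to_index.csv is a hypothetical file placed where ingest-time lookups are read from. It also omits the original EVENT_TIME rule that routes to unknown_events, and the fallback behaviour for unmatched tenants should be verified before use.

```
# transforms.conf on the HF (sketch under the assumptions stated above)
[route_sentinel_to_index]
INGEST_EVAL = index:=coalesce(json_extract(lookup("tenant_to_index.csv", json_object("tenant", json_extract(_raw, "TENANT")), json_array("index")), "index"), "unknownsentinel")

# tenant_to_index.csv (hypothetical lookup file, one row per tenant)
# tenant,index
# xxxxxx-b589-c11a968d4876,nacoak_mil
```

The same pattern could be applied to the sourcetype rule by keying a second lookup on the SYSTEM value.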
