Monitoring Splunk

Filtering logs at HF to get a single event of each source

hectorvp
Communicator

Hello Splunkers,

I need to filter logs at the HF so that only a single log from each source on every host is sent to indexer A, once a day. All of the logs are forwarded to indexer B, which is the customer's indexer, so we won't have access to indexer B.

Indexer A is the one we own; we need to use it to validate every day whether any log from a certain application is showing up. This is part of the log-generation validation our customers have asked us to perform.

richgalloway
SplunkTrust

I don't know how to skim a single event from each source once each day.

Another way to monitor data sources is to watch the metrics reported to the _internal index (to which you should have access if you're to properly monitor your customer's Splunk).

---
If this reply helps you, an upvote would be appreciated.

hectorvp
Communicator

Hi @richgalloway ,

Yes, we are fetching the UF's internal logs.

But suppose that in Windows application logs, multiple sources are logging, for example app A, app B, and app C.

Can we validate from the internal index that app A actually sent a log to the customer's indexer?

I need to validate this at a per-day level, i.e. that at least a single event was sent from app A in a day, and if not, fire an alert.

Will this be possible from metrics.log?

I read that it reports information only for the top 10 source types.

richgalloway
SplunkTrust

Here is a search that uses _internal to find sources that were seen in the last 7 days but not today.  Perhaps it will help.

index=_internal component=metrics group=per_source_thruput earliest=-7d@d latest=-1d 
| stats count as old_count by series
| append [ search index=_internal component=metrics group=per_source_thruput earliest=@d 
  | stats count as new_count by series]
| stats values(*) as * by series 
| fillnull value=0 new_count 
| where new_count=0
---
If this reply helps you, an upvote would be appreciated.

hectorvp
Communicator

Thanks @richgalloway, this will help.

But I just need your thoughts on one aspect:

group=per_source_thruput

will give me the top 10 busiest sources every 30 seconds.

Now suppose I have more than 10 sources per host, for example 22 sources; then it won't report throughput for all 22 sources.

  • In that case I need to increase maxseries in limits.conf to 25; I mean, maxseries should always be greater than the number of sources.

That way the validation of event sources should be more reliable.

Am I right in this thinking?
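For reference, a minimal limits.conf sketch of the change described above (the value 25 is just the example margin from this thread; [metrics] / maxseries is the stanza and setting that control how many series metrics.log tracks per thruput group). This would go on the instance whose metrics.log is being read:

[metrics]
# Track up to 25 series per thruput group instead of the default 10,
# so all ~22 sources per host appear in per_source_thruput.
maxseries = 25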

richgalloway
SplunkTrust

I think the sampling done by the per_source_thruput group will even out over the course of a day.  If that doesn't work well enough, try this alternative, which uses license-usage events instead.

index=_internal component=LicenseUsage earliest=-7d@d latest=-1d 
| stats count as old_count by s
| append [ search index=_internal component=LicenseUsage earliest=@d 
  | stats count as new_count by s]
| stats values(*) as * by s 
| fillnull value=0 new_count 
| where new_count=0
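To turn a search like this into the daily alert described earlier in the thread, it can be saved as a scheduled search that fires whenever any row comes back. A minimal savedsearches.conf sketch (the stanza name and schedule below are placeholders, not from the thread):

[Missing Source Alert]
# Run near the end of each day; adjust the cron schedule as needed.
enableSched = 1
cron_schedule = 55 23 * * *
# Fire when the search returns more than 0 results (i.e. a source went quiet).
counttype = number of events
relation = greater than
quantity = 0
search = index=_internal component=LicenseUsage earliest=-7d@d latest=-1d | stats count as old_count by s | append [ search index=_internal component=LicenseUsage earliest=@d | stats count as new_count by s] | stats values(*) as * by s | fillnull value=0 new_count | where new_count=0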
---
If this reply helps you, an upvote would be appreciated.

hectorvp
Communicator

Thanks @inventsekar 
