How can I determine the total amount of data received and indexed by UFs?

Glasses2
Communicator

Hi,
I am working on a query to determine the hourly (or daily) totals of all indexed data (in GB) coming from UFs.

In our deployment, UFs send directly to the Indexer Cluster.  

The issue I am having with the following query is that the volume is not realistic, and I am probably misunderstanding the _internal metrics log. Perhaps the kb field is not the correct field to sum as data throughput?

index=_internal source=*metrics.log group=tcpin_connections fwdType=uf 
| eval GB = kb/(1024*1024) 
| stats sum(GB) as GB

Any advice appreciated.
Thank you

richgalloway
SplunkTrust

The Metrics log is a sample of events, not an audit log.

---
If this reply helps you, Karma would be appreciated.

isoutamo
SplunkTrust

Here are some comments about metrics.log:

By default, metrics.log reports the top 10 results for each type.

See more at https://docs.splunk.com/Documentation/Splunk/latest/Troubleshooting/Aboutmetricslog

As you can see, metrics.log doesn't contain all metrics, just "samples" of them. As @richgalloway already said, you must use license_usage or calculate the volume from _raw.

Glasses2
Communicator

Thank you for your reply. Do you have a query that would answer my question?
I am not finding the key logs containing UF data throughput or ingest information.

richgalloway
SplunkTrust

The most accurate method would be to add up the size of _raw for each UF (host), but that would have terrible performance.
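
For reference, a rough sketch of the _raw approach over a single hour might look like the following. It assumes the host field on each event identifies the sending UF, which may not hold for relayed data. Note also that len() counts characters rather than bytes, so the result is only an approximation. Expect it to be slow on any real time range.

```spl
index=* earliest=-1h@h latest=@h
| eval raw_bytes = len(_raw)
| stats sum(raw_bytes) as bytes by host
| eval GB = round(bytes/1024/1024/1024, 3)
| fields - bytes
```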

Try using the license_usage log.  The h field is the host (UF) sending the data.

index=_internal source=*license_usage.log
| stats sum(b) as bytes by h
| eval KB = bytes/1024
| rename h as UF
| table UF KB
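
Since the original question asked for hourly or daily totals in GB, the same log could be bucketed with timechart. This is a minimal sketch, assuming the b (bytes) field on type=Usage events in license_usage.log. Change span=1h to span=1d for daily totals. It does not yet exclude non-UF hosts.

```spl
index=_internal source=*license_usage.log type=Usage
| timechart span=1h sum(b) as bytes
| eval GB = round(bytes/1024/1024/1024, 3)
| fields - bytes
```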

Glasses2
Communicator

Thank you for the reply. I also looked at this log, but it requires curating an exact list of the UFs because I have some pollution, e.g. h values for HFs, SC4S, etc. The license_usage log may be the best route if I can put together a lookup of just UFs.

Glasses2
Communicator

index=_internal source=*license_usage.log earliest=-1d@d latest=now 
    [search index=_internal source=*metrics.log group=tcpin_connections fwdType=uf earliest=-1d@d latest=now 
    | stats count by hostname 
    | rename hostname as h 
    | fields h] 
| stats sum(b) as total_usage_bytes by h 
| eval total_usage_gb = round(total_usage_bytes/1024/1024/1024, 2) 
| fields - total_usage_bytes
| addcoltotals label="Total" labelfield="h" total_usage_gb

I think this is what I wanted, unless someone thinks it's inaccurate?
Please advise.
Thank you

richgalloway
SplunkTrust

If you're confident the sampling done for metrics.log will catch all of your UFs then the search looks good.


Glasses2
Communicator

The numbers do not match: DS Forwarder Management shows 1275, dc(h) from metrics gives 1287, and the total stats count from the final query is 1166, so it's not accurate. I will need to create a lookup of UFs.
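
A sketch of what the lookup-based version might look like, assuming a hypothetical lookup file named hypothetical_uf_hosts.csv with a single column h that lists only the UF host names (the file name and its contents are placeholders, not something that exists yet):

```spl
index=_internal source=*license_usage.log type=Usage earliest=-1d@d latest=now
    [| inputlookup hypothetical_uf_hosts.csv | fields h]
| stats sum(b) as total_usage_bytes by h
| eval total_usage_gb = round(total_usage_bytes/1024/1024/1024, 2)
| fields - total_usage_bytes
| addcoltotals label="Total" labelfield="h" total_usage_gb
```

The subsearch expands to a list of h values, so the host names in the lookup would need to match the h values in license_usage.log (for example, short name versus FQDN).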

Thank you for your support.
