Getting Data In

How much data does a particular sourcetype ingest daily?

gurinderbhatti
Path Finder

I am trying to modify the search below:

index=_internal metrics kb series!=_* "group=per_host_thruput" daysago=5 | eval indexed_mb = kb / 1024 | timechart fixedrange=t span=1d sum(indexed_mb) by series | rename sum(indexed_mb) as totalmb

This gives me the top sourcetypes and how much data they consume daily.

But what if I have particular indexes (index=gold and index=silver) and a particular sourcetype or sourcetypes (security, windows, etc.)?

How do I figure out how much data these particular sourcetypes from both of these indexes are ingesting daily? And what if I wanted these numbers over 7 days, along with the min/max/avg? Can someone write up a quick query? Thanks in advance.

renems
Communicator

The SoS app has a nice view on this. You can find it under SoS app -> Indexing -> Index Performance. Make sure you set the view to "sourcetype". If you don't want to go through the trouble of installing the SoS app (although it's really useful), here's the search that gives the same results:

source="*metrics.log" group=per_sourcetype_thruput | timechart minspan=30s per_second(kb) by series useother=false limit=15

The search above shows all sourcetypes across all indexes.

martin_mueller
SplunkTrust

Grab the SoS app from http://apps.splunk.com/app/748/ - that should quench most of your indexing volume reporting thirst.

martin_mueller
SplunkTrust

Ah yes, the license_usage.log events may do what you need. Try this:

index=_internal source=*license_usage.log type=usage (st="WinEventLog:Security" OR st="WinEventLog:Application" OR st="WinEventLog:System") (idx=index1 OR idx=index2 OR idx=index3) | eval GB = b/1024/1024/1024 | eval st_idx = st.": ".idx | timechart span=1d sum(GB) as "Total GB used" by st_idx
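
For the min/max/avg part of the question, a rough sketch along the same lines: total the bytes per day first, then summarize those daily totals. The index names (index1, index2, index3) are the same placeholders as above, and the 7-day window and the output field names (daily_GB, min_GB, max_GB, avg_GB) are assumptions for illustration. The license_usage.log events are written on the license master, so the search has to be able to see that instance's _internal index.

```spl
index=_internal source=*license_usage.log type=usage earliest=-7d@d latest=@d
    (st="WinEventLog:Security" OR st="WinEventLog:Application" OR st="WinEventLog:System")
    (idx=index1 OR idx=index2 OR idx=index3)
| eval GB = b/1024/1024/1024
| bin _time span=1d
| stats sum(GB) as daily_GB by _time st idx
| stats min(daily_GB) as min_GB max(daily_GB) as max_GB avg(daily_GB) as avg_GB by st idx
```

The first stats produces one row per day, sourcetype, and index; the second collapses those rows into one summary line per sourcetype/index pair. Days with no events for a pair are not counted as zero, so the average covers only days that actually had data.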

gurinderbhatti
Path Finder

I was told not to use metrics.log because it is not accurate and truncates data after the top 20 hosts. I think adding the limit option should resolve that.
Using license_usage gives two columns for each WinEventLog type (with different values). Why? Also, how do I tell it which indexes to query? It seems to take all the indexes into account.
Thanks in advance.
index=_internal source=*license_usage.log type=usage st="WinEventLog:Security" OR st="WinEventLog:Application" OR st="WinEventLog:System" | eval GB = b/1024/1024/1024 | rename st AS sourcetype | timechart span=1d sum(GB) AS "Total GB used" by sourcetype

martin_mueller
SplunkTrust

I see... I don't think the regular metrics logging lets you split by both index and sourcetype, only one or the other. You could try and compute that like this:

(index=1 OR index=2 OR index=3) (sourcetype=A OR sourcetype=B OR sourcetype=C) | eval length = len(_raw) | stats sum(length) as byte by sourcetype index | eval mb = byte / 1048576

That'll run for a while though.
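
A daily variant of the same idea, as a rough sketch. The index and sourcetype names are the placeholders from the original question (gold, silver, security, windows), and the 7-day window is an assumption. Note that len(_raw) counts characters rather than bytes, so the result only approximates indexed or licensed volume.

```spl
(index=gold OR index=silver) (sourcetype=security OR sourcetype=windows) earliest=-7d@d latest=@d
| eval chars = len(_raw)
| bin _time span=1d
| stats sum(chars) as chars by _time index sourcetype
| eval mb = round(chars / 1048576, 2)
```

Because this reads every raw event in the time range, it is far more expensive than the _internal-based searches and is probably best run over a short window or as a scheduled search.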

gurinderbhatti
Path Finder

Hi Martin, thank you very much. I'm almost there!
So the series=A/B/C query works. The only issue is that we have 20+ Windows indexes, and they all collect data for these specific sourcetypes (series=WinEventLog:System/Security/Application).
However, I only want this data from index=1, index=2, and index=4.
index=_internal source="*metrics.log" index=1 group=per_sourcetype_thruput series=WinEventLog:System OR series=WinEventLog:Security OR series=WinEventLog:Application| timechart limit=0 minspan=1h per_hour(kb) by series
doesn't seem to work.

martin_mueller
SplunkTrust

Add limit=0 to the timechart to see more than ten sourcetypes, or add series=A OR series=B OR series=C to the base search to just see your three interesting sourcetypes.

The indexes used don't matter; this shows throughput per sourcetype across all indexes.
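
Put together, the second option could look like the sketch below. It assumes the three Windows sourcetypes named elsewhere in this thread, and it sums kb per day (converted to MB) instead of charting a per-second rate. Like any per_sourcetype_thruput search, it cannot be restricted by index.

```spl
index=_internal source="*metrics.log" group=per_sourcetype_thruput
    (series="WinEventLog:System" OR series="WinEventLog:Security" OR series="WinEventLog:Application")
| eval mb = kb / 1024
| timechart span=1d limit=0 sum(mb) by series
```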

gurinderbhatti
Path Finder

Hi Martin, I do have access to _internal.
Thanks, but that only gives me the top 10 sourcetypes. What if I want sourcetypes A, B, and C, which exist in, let's say, 3 different indexes X, Y, and Z?
Basically, I want to get the size of the Windows security/system/event logs on a daily basis. These are all being captured as different sourcetypes from a few different indexes.

martin_mueller
SplunkTrust

Do you have permission to read the _internal index? If you do then it'd be interesting to find out why your SoS charts are broken.

As for getting that manually, you can see throughput per sourcetype using this query taken from SoS:

index=_internal source="*metrics.log" group=per_sourcetype_thruput | timechart minspan=30s per_second(kb) by series

gurinderbhatti
Path Finder

Thanks Martin,
I'm trying to find exactly where to get this info. It seems the charts are broken in my SoS app.
Is there any way I can integrate that logic into the search above?
