All,
I am working on a project to "predict" how much Splunk license I will need in order to onboard a new customer. We usually ingest the same kinds of information for all customers; the main difference is the number of entries in the logs.
The problem I am having is that I cannot trust the _internal metrics.log on my indexers. It looks like it does not contain all the information. For example, if I run:
index=ssn host=*xey* earliest="01/31/2019:1:29:00" latest="01/31/2019:2:29:00" | stats count by host
host                              count
lasxeypr01dem01.las.ssnsgs.net      107
lasxeypr01slv01.las.ssnsgs.net      120
lasxeypr01vmw01.las.ssnsgs.net    28865
lasxeypr01vmw02.las.ssnsgs.net    12242
At the same time:
index="_internal" source="*metrics.log" group="per_host_thruput" earliest="01/31/2019:1:29:00" latest="01/31/2019:2:29:00" series=*xey* | chart sum(kb) by series | sort - sum(kb)
No results found.
Some data is there:
index="_internal" source="*metrics.log" group="per_host_thruput" earliest="01/31/2019:1:29:00" latest="01/31/2019:2:29:00" | stats count by host
host                              count
arnvtnpr01spl01.arn.ssnsgs.net      117
iadphite01spl01.iad.ssnsgs.net      116
janentpr01spl01.jan.ssnsgs.net      116
lascocpr01mys01.las.ssnsgs.net      116
lascocpr01mys02.las.ssnsgs.net      117
lascocpr01mys03.las.ssnsgs.net      116
lashrmpr01kaf05                     117
lashrmpr01wor02                     116
lasssnpr01spl01.las.ssnsgs.net     1160
lasssnpr01spl02.las.ssnsgs.net     1160
lasssnpr01spl03.las.ssnsgs.net     1160
lasssnpr01spl04.las.ssnsgs.net      679
lasssnpr01spl05.las.ssnsgs.net      170
lasssnpr01spl06.las.ssnsgs.net      116
lasssnpr01spl07.las.ssnsgs.net      116
lasssnpr01spl08.las.ssnsgs.net      213
lasssnspl01app01.las.ssnsgs.net     188
lcxfplpr02spl01.fpl.ssnsgs.net     1160
litentpr02spl01.lit.ssnsgs.net      117
okcogepr02spl01.okc.ssnsgs.net      116
pdxpcfte01spl01.pdx.ssnsgs.net      152
phlphipr01spl01.phl.ssnsgs.net      117
sanssnpoc02slv01.san.ssnsgs.net     116
sanssnpr01spl01.san.ssnsgs.net     1160
sanssnpr01spl02.san.ssnsgs.net     1160
sanssnpr01spl03.san.ssnsgs.net     1160
sanssnpr01spl04.san.ssnsgs.net      125
sanssnpr01spl05.san.ssnsgs.net      160
sanssnpr01spl06.san.ssnsgs.net      195
sanssnpr01spl10                    1160
I know for sure I have data ingested for these hosts.
So, how can I get the exact amount of data that is indexed? Is there some rotation on the _internal index that I am missing?
Thank you,
Gerson
Hi @GersonGarcia
The metrics.log file can squash or summarise the per-source, per-sourcetype, and per-host metrics if there are too many distinct values, which is why your series=*xey* search returns nothing. If you need exact figures and you don't mind a slow query, you can do this:
<search> | eval len = len(_raw) | stats sum(len) as bytes
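As background, the number of series metrics.log reports per interval is controlled in limits.conf on the indexers. The stanza below is a sketch from memory of the [metrics] settings, so verify the names and defaults against the documentation for your Splunk version before relying on it:

# limits.conf on the indexers - verify for your version
[metrics]
# per_x_thruput groups report only the top N series per interval (default is 10)
maxseries = 100

Raising this makes metrics.log report more per-host series per interval, at the cost of a larger _internal index, but it still only reduces (rather than eliminates) squashing.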
@chrisyoungerjds I believe I can find it in a different log:
index=_internal sourcetype=splunkd group=tcpout_connections host=*xey* earliest="01/31/2019:1:29:00" latest="01/31/2019:2:29:00" | chart sum(kb) by host | sort - sum(kb)
host                              sum(kb)
lasxeypr01vmw01.las.ssnsgs.net   19532.55
lasxeypr01vmw02.las.ssnsgs.net   17314.92
lasxeypr01nan01.las.ssnsgs.net    1520.58
lasxeypr01sla01.las.ssnsgs.net    1393.90
lasxeypr01gpl01.las.ssnsgs.net    1360.50
lasxeypr01dem01.las.ssnsgs.net    1283.92
lasxeypr01vmw03.las.ssnsgs.net    1269.57
sanxeyte01dem01.san.ssnsgs.net    1233.25
The problem here is that this measures what the forwarders send, so any transformation applied before indexing will not be reflected in these numbers...
Hmm, the problem is that it takes forever to complete this search for all hosts over the past day:
host=xey earliest=-1d@d latest=@d | eval len = len(_raw) | stats sum(len) as bytes by index host
index  host                              bytes
main   lasxeypr01slv01.las.ssnsgs.net        24430
os     lasxeypr01dem01.las.ssnsgs.net     10702044
os     lasxeypr01gpl01.las.ssnsgs.net     11615854
os     lasxeypr01nan01.las.ssnsgs.net     19561100
os     lasxeypr01sla01.las.ssnsgs.net     14134946
os     lasxeypr01vmw01.las.ssnsgs.net    111012962
os     lasxeypr01vmw02.las.ssnsgs.net     56708985
os     lasxeypr01vmw03.las.ssnsgs.net      9954705
os     sanxeyte01dem01.san.ssnsgs.net      9743627
ssn    lasxeypr01dem01.las.ssnsgs.net       569558
ssn    lasxeypr01slv01.las.ssnsgs.net      3102610
ssn    lasxeypr01vmw01.las.ssnsgs.net    135302275
ssn    lasxeypr01vmw02.las.ssnsgs.net     51478532
This search has completed and has returned 13 results by scanning 1,724,992 events in 86.817 seconds
Yes, that is the downside. The only real alternative I can offer is estimation. First, run this query over a smaller time range to find out how large events typically are:
host=xey earliest=-1h@h latest=@h | eval len = len(_raw) | stats avg(len) as avg_bytes by index host
Then you can run a very fast tstats command to get the count of events per index and host:
| tstats count where index=* host=xey earliest=-1d@d latest=@d by index host
and then multiply the two numbers together to approximate how much data each host consumed.
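The two steps could also be combined into a single search. This is only a sketch; the join-based approach and the index=* scope are my assumptions, so adjust the filters and time ranges to your environment:

| tstats count where index=* host=xey earliest=-1d@d latest=@d by index host
| join type=left index host
    [ search index=* host=xey earliest=-1h@h latest=@h
      | eval len = len(_raw)
      | stats avg(len) as avg_bytes by index host ]
| eval approx_bytes = count * avg_bytes
| sort - approx_bytes

The subsearch samples only one hour to keep things fast, so the accuracy of the estimate depends on how stable the average event size is over the full day.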
Yeah, I guess I could, but the problem is that log size depends on many factors, and it is never the same on any two hosts...
Thank you for your help.