How to get volume by indexer?

daniel333
Builder

All,

Is there a better way to get data volume by indexer than this search, run from the search head without access to the internal indexes?

index=* 
| fields _raw, volume, splunk_server
| eval volume=len(_raw) 
| stats sum(volume) by splunk_server

horsefez
Motivator

@daniel333

Unfortunately, I don't think it is possible to get an exact picture of how much data is on your indexers if you don't have access to the _internal index.

However, I tried to come up with a rough estimate based on your search.

index=* 
| fields _raw, volume, splunk_server
| eval volume=len(_raw) 
| stats sum(volume) AS volume by splunk_server 
| eval total_size_in_GB=round((volume/(1024*1024*1024)),4), total_size_on_disk_in_GB=round(total_size_in_GB*0.5,4)

I added a total_size_in_GB field by treating "volume" as a byte count: len(_raw) counts characters, and most standard characters need 8 bits, or 1 byte, to store. (If you have a lot of multi-byte text in there, such as Chinese characters, this is a whole other story.)

Then I basically divide it by 1024^3 which gives me the size in GB.

I also added the total_size_on_disk_in_GB field, which multiplies the total_size_in_GB field by 0.5.
Why 0.5, you might ask?
There is a sizing calculator out there which uses a default value of (Raw Compression Factor (0.15) + Metadata Size Factor (0.35)) = 0.15 + 0.35 = 0.5

https://splunk-sizing.appspot.com/

So the estimated size of the data on your disk is the originally calculated raw size multiplied by 0.5.
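
To sanity-check the arithmetic outside Splunk, here is a minimal Python sketch of the same estimate. It assumes the summed len(_raw) value approximates a byte count (about one byte per character for mostly ASCII data), and it uses the sizing calculator's default factors quoted above. The function and parameter names are made up for illustration.

```python
GIB = 1024 ** 3  # bytes per GB, using the same 1024^3 divisor as the search


def estimate_sizes(total_chars, compression_factor=0.15, metadata_factor=0.35):
    """Return (raw_gb, on_disk_gb) for a summed len(_raw) value.

    Assumption: one character is roughly one byte, so total_chars is
    treated directly as a byte count.
    """
    raw_gb = total_chars / GIB
    on_disk_gb = raw_gb * (compression_factor + metadata_factor)
    return round(raw_gb, 4), round(on_disk_gb, 4)


# Example: a splunk_server whose events sum to 5 * 1024^3 characters
raw_gb, on_disk_gb = estimate_sizes(5 * GIB)
print(raw_gb, on_disk_gb)
```

If your data compresses better or worse than the defaults, pass your own factors instead of 0.15 and 0.35.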

Another approach is to run the following search over the last 24 hours:
| tstats count WHERE index=*
This counts the number of events on your system.

The Splunk Sizing Calculator can give you an estimate of the amount of storage your indexers need to store all your data. It also has an option to input a "count of events per second".

https://splunk-sizing.appspot.com/#st=eps

| tstats count WHERE index=* | eval events_per_second=count/(3600*24)

You are then able to input the events_per_second value into the sizing calculator and calculate the size of your data that goes in and out per day.

With that information you are easily able to calculate the total size of your data. (Still... this is only a rough estimate, keep that in mind)
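
For reference, the events-per-second arithmetic can be sketched in Python as below. This is only an illustration: the average event size is a value you would have to supply or estimate yourself (it is not produced by the tstats search), and the function names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 3600


def events_per_second(event_count, window_seconds=SECONDS_PER_DAY):
    """Average event rate over the search window (default: 24 hours)."""
    return event_count / window_seconds


def daily_raw_gb(eps, avg_event_bytes):
    """Rough raw volume per day in GB for a given rate and average event size.

    Assumption: avg_event_bytes is an estimate supplied by the user.
    """
    return eps * avg_event_bytes * SECONDS_PER_DAY / 1024 ** 3


# Example: 8,640,000 events in 24 hours, assumed 1024 bytes per event
eps = events_per_second(8_640_000)
print(eps, round(daily_raw_gb(eps, 1024), 2))
```

The resulting raw figure can then be scaled by the 0.5 factor discussed earlier to approximate the size on disk.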
