How to get volume by indexer?

daniel333

All,

Is there a better way to get data volume by indexer than this search, run from the search head without access to the internal indexes?

index=* 
| fields _raw, volume, splunk_server
| eval volume=len(_raw) 
| stats sum(volume) by splunk_server

horsefez

@daniel333

Unfortunately, I don't think it is possible to get an exact picture of how much data is on your indexers if you don't have access to the _internal index.

However, I tried to come up with a rough estimate based on your search.

index=* 
| fields _raw, volume, splunk_server
| eval volume=len(_raw) 
| stats sum(volume) AS volume by splunk_server 
| eval total_size_in_GB=round(volume/(1024*1024*1024),4), total_size_on_disk_in_GB=round(total_size_in_GB*0.5,4)

I added a total_size_in_GB field. len(_raw) counts characters, and most standard characters take 8 bits (1 byte) to store, so "volume" is already roughly a size in bytes and must not be multiplied by 8 (that would give bits, not bytes). (If you have a lot of multi-byte text in there, such as Chinese characters, this is a whole other story.)

Then I divide it by 1024^3, which gives me the size in GB.

I also added the total_size_on_disk_in_GB field, which multiplies total_size_in_GB by 0.5.
Why 0.5, you might ask?
There is a sizing calculator out there that uses default values of 0.15 for the Raw Compression Factor and 0.35 for the Metadata Size Factor: 0.15 + 0.35 = 0.5

https://splunk-sizing.appspot.com/

So the estimated size of the data on your disk is the originally calculated size multiplied by 0.5.
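To sanity-check the arithmetic outside of Splunk, here is a minimal Python sketch of the same calculation. The function name, the indexer names, and the character totals are all hypothetical, and the one-byte-per-character default is an assumption that only holds for mostly ASCII data.

```python
BYTES_PER_GB = 1024 ** 3

# Default factors quoted from the sizing calculator:
# raw compression (0.15) + metadata (0.35).
DISK_FACTOR = 0.15 + 0.35


def estimate_sizes(total_chars, bytes_per_char=1.0, disk_factor=DISK_FACTOR):
    """Return (raw_gb, on_disk_gb) for a summed len(_raw) value.

    bytes_per_char=1.0 is an assumption; multi-byte text needs a larger value.
    """
    raw_gb = total_chars * bytes_per_char / BYTES_PER_GB
    on_disk_gb = raw_gb * disk_factor
    return round(raw_gb, 4), round(on_disk_gb, 4)


# Hypothetical per-indexer totals, as "stats sum(volume) by splunk_server" would return.
volumes = {"idx01": 5 * 1024 ** 3, "idx02": 2 * 1024 ** 3}

for server, chars in volumes.items():
    raw_gb, disk_gb = estimate_sizes(chars)
    print(f"{server}: raw={raw_gb} GB, on_disk={disk_gb} GB")
```

If your data contains a lot of multi-byte characters, raising bytes_per_char is the only knob to turn here; the result stays a rough estimate either way.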

Another approach is to run
| tstats count WHERE index=* (last 24 hours)
to count the number of events on your system.

The Splunk Sizing Calculator can give you an estimate of the amount of storage your indexers need to store all your data. It also has an option to enter a "count of events per second".

https://splunk-sizing.appspot.com/#st=eps

| tstats count WHERE index=* | eval events_per_second=count/(3600*24)

Dividing by 3600*24 assumes the search ran over the last 24 hours. You can then enter the events_per_second value into the sizing calculator to estimate how much data comes in per day.

With that information you can calculate the total size of your data. (Still, this is only a rough estimate; keep that in mind.)
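As a rough illustration of that last step, the Python sketch below turns a 24-hour event count into events per second and then into a daily raw volume. The event count and the average event size are made-up values, and the function names are hypothetical; this is not how the sizing calculator itself is implemented.

```python
SECONDS_PER_DAY = 3600 * 24
BYTES_PER_GB = 1024 ** 3


def events_per_second(event_count, window_seconds=SECONDS_PER_DAY):
    """Average event rate over the search window (default: 24 hours)."""
    return event_count / window_seconds


def daily_raw_gb(eps, avg_event_bytes):
    """Raw data per day in GB, given an assumed average event size in bytes."""
    return eps * avg_event_bytes * SECONDS_PER_DAY / BYTES_PER_GB


# Hypothetical tstats result for the last 24 hours.
count = 86_400_000
eps = events_per_second(count)

# avg_event_bytes=500 is an assumption; measure it on a sample of your own data.
print(f"events_per_second={eps}")
print(f"daily_raw_gb={daily_raw_gb(eps, avg_event_bytes=500):.2f}")
```

The average event size dominates the result, so it is worth taking it from a sample of your real events, for example from avg(len(_raw)).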
