Internal Index volume

hectorvp
Communicator

Just for the sake of knowledge, how much _internal data is generated?

For example, what if my daily indexing volume is 6 TB?

Would it be 15% of 6 TB?

I know it doesn't count against my license...


isoutamo
SplunkTrust
It's hard to give an exact number, as there are a lot of things that need to be counted in, e.g. how many UFs, HFs and other inputs you have, what kind of distributed environment you have, how many users there are and how actively they use it, etc.
You should look at what is normal for your own environment with the MC, or use some queries for that.
r. Ismo

hectorvp
Communicator

Hi @isoutamo ,

We currently have 500 UFs and no HFs, and the scope is to collect only the UFs' internal logs.

We are using a single box as indexer, search head and DS, because the main purpose is to forward logs to 3rd party destination servers.

So we are supposed to store only internal logs, and we need to plan how much disk space will be required. Currently we are going with 1 TB, but I guess this capacity planning has to be revisited. We need a retention policy of at least 60 days.

Any suggestions on what the approach could be? We don't have access to the 3rd party servers to check the volume with queries.


hectorvp
Communicator

Hi @isoutamo ,

I did the following to make a rough estimate:

Average size of an internal log event = 300 bytes

Average events per second into the internal index from 1 UF = 10

Total internal data per day from 1 UF, in MB = 10 * 300 * 60 * 60 * 24 / (1000 * 1000) = approx. 260 MB

So for 1 day with 500 UFs = 260 * 500 = 130,000 MB = 130 GB per day

With a compression ratio of 50%, total data on disk for 1 day = 65 GB

So capacity for 60 days of retention = 65 * 60 = 3,900 GB = approx. 4 TB
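
The same arithmetic can be sketched as a small Python script, which makes it easy to re-run with different inputs. All the defaults are the rough assumptions from this post (300 bytes per event, 10 events per second per UF, 50% compression), not measured values, and the function name is made up just for illustration.

```python
# Rough sizing sketch for the _internal index.
# Every default below is an assumption taken from the estimate in this post,
# not a measured value; replace them with numbers from your own environment.

SECONDS_PER_DAY = 60 * 60 * 24


def estimate_internal_storage_gb(
    avg_event_bytes=300,        # assumed average size of one internal event
    events_per_sec_per_uf=10,   # assumed event rate from a single UF
    forwarders=500,             # number of UFs sending internal logs
    compression_ratio=0.5,      # assumed on-disk size / raw size
    retention_days=60,          # required retention
):
    """Return the estimated disk space in GB (decimal, 1 GB = 1000 MB)."""
    raw_mb_per_uf_per_day = (
        events_per_sec_per_uf * avg_event_bytes * SECONDS_PER_DAY / 1_000_000
    )
    raw_gb_per_day = raw_mb_per_uf_per_day * forwarders / 1000
    disk_gb_per_day = raw_gb_per_day * compression_ratio
    return disk_gb_per_day * retention_days


if __name__ == "__main__":
    total_gb = estimate_internal_storage_gb()
    print(f"Estimated disk for retention: {total_gb:.0f} GB "
          f"({total_gb / 1000:.1f} TB)")
```

Without the rounding to 260 MB, the result comes out slightly under 3.9 TB, so the approx. 4 TB figure holds for these inputs.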

Am I going in the right direction? Or am I missing any factors?
