Hi
We have a Splunk Cloud instance, and we would like to do predictive storage analysis for future requirements. As part of that, I was trying to find out what accurate options are available. While looking at it, I noticed that our daily ingestion is around 450-500 GB, but when I check, the searchable storage (DDAS) has only increased by around 60 GB compared to the previous day.
Could you please let me know whether I'm missing anything in these calculations.
Secondly, is there a way to do predictive SVC & storage analysis (DDAS & DDAA) for future requirements?
Basically - how do we get the information below? As a service provider, it is critical for us to know this on a daily basis.
1. How much data is ingested into Splunk on a daily basis. (We have a query for that.)
2. How much data is getting stored in active searchable storage on a daily basis (assuming the ingested data should be reflected there).
3. How much data has been moved from active searchable (online storage) to active archive (offline storage) on a daily basis.
4. How much data has been purged/deleted from active archive (offline storage) daily.
All of these questions can be answered in the Cloud Monitoring Console, and you should start there instead of trying to write your own bespoke SPL.
License > Storage Overview is a great place to start. There are also DDAS and DDAA searches there.
Specifically for archive data, please review the docs here:
https://docs.splunk.com/Documentation/SplunkCloud/9.1.2308/Admin/DataArchiver
You might want to review the size and growth of your archived indexes to better understand how much of your entitlement you are consuming. This can help you predict usage and expenses for your archived data.
No. I suggest following up with your account team so they can see what values you are comparing and ensure they are accurate. (Or post the actual values you are looking at.)
All your storage forecasting needs to be is: how much raw data you ingest per day, multiplied by how many days you want to keep it in searchable + archive, and does that fit within your DDAS and DDAA entitlements?
The Splunk Cloud service takes care of the rest. There is no compression math at all, unlike the on-prem days.
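As a rough illustration, here is a minimal sketch of that math as a standalone SPL calculation. The daily ingest and retention figures below are just assumptions for the example; substitute your own average daily ingest and the searchable/archive retention from your subscription:

| makeresults
| eval daily_ingest_gb=475 ```assumed average daily raw ingest in GB```
| eval searchable_days=90 ```assumed searchable (DDAS) retention in days```
| eval archive_days=275 ```assumed additional archive (DDAA) retention in days```
| eval ddas_needed_gb=daily_ingest_gb*searchable_days
| eval ddaa_needed_gb=daily_ingest_gb*archive_days
| table daily_ingest_gb, searchable_days, archive_days, ddas_needed_gb, ddaa_needed_gb

If the resulting ddas_needed_gb and ddaa_needed_gb fit within your DDAS and DDAA entitlements, you are sized correctly; if not, that gap is your forecasted shortfall.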
The Cloud Monitoring Console should provide a great start on analyzing your storage needs.
https://docs.splunk.com/Documentation/SplunkCloud/9.1.2308/Admin/MonitoringLicenseUsage#Monitor_the_...
The key concept you must be familiar with is the Splunk bucket lifecycle: buckets are the smallest unit of storage in Splunk, and the lifecycle greatly impacts how and when your buckets move from active searchable to active archive.
I wouldn't overcomplicate it with compression. While Splunk does compress data, your entitlements are based on raw data ingested, so just closely analyze your daily ingest in your biggest indexes and poke around with the `dbinspect` command and the Monitoring Console to make sure your bucket health is good.
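For example, a simple starting point (assuming you are able to run `dbinspect` across your indexes in your Splunk Cloud stack; narrow index= to your biggest indexes if needed) to summarize bucket counts and sizes by index and state:

| dbinspect index=*
| stats count AS buckets, sum(sizeOnDiskMB) AS total_size_mb, sum(eventCount) AS events BY index, state
| sort index, state

This shows, per index, how many buckets you have in each state and their size on disk, which is a useful sanity check alongside what the Monitoring Console reports.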
Data onboarding and data quality are key to making sure your timestamps don't pollute your buckets with events dated far in the past or future, because a bucket can only migrate to archive when ALL EVENTS in the bucket meet the time/size criteria. https://docs.splunk.com/Documentation/SplunkCloud/9.1.2308/Admin/MonitoringHealth#Health_indicator_i...
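To spot buckets whose event time range is unusually wide (a sign of bad timestamps delaying archiving), a sketch along these lines can help; the index name and the 30-day threshold are just placeholders to adjust for your environment:

| dbinspect index=your_biggest_index ```hypothetical index name - replace with one of your own```
| eval span_days=round((endEpoch-startEpoch)/86400,1)
| where span_days>30 ```assumed threshold: buckets spanning more than ~30 days of event time are worth a closer look```
| table bucketId, state, startEpoch, endEpoch, span_days, eventCount, sizeOnDiskMB
| sort - span_days

Buckets that show up here are the ones most likely to sit in searchable storage longer than you expect, because their oldest or newest events keep the whole bucket from meeting the archive criteria.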
Also, even going back and reading the Splunk Enterprise docs on "smartstore" will give you some good background, or work with your account team to go through it and ensure you have a good handle on it.
Thank you, let me go through the documentation.
Hi @tv00638481,
Since DDAS is archive storage, Splunk Cloud keeps only compressed raw data. Compression depends on the data content, but it is estimated at around 15% of the raw data size; in your case, 60 GB is in the normal range for 400-500 GB of ingestion (e.g., 450 GB × 0.15 ≈ 67 GB).
You can make your calculations based on the above.
I think you mean DDAA (Dynamic Data Active Archive) is archive storage (cold storage), and no, we don't only store compressed raw data. The whole bucket goes to archive storage, and it is then copied back into the object store upon restore.
DDAS (Dynamic Data Active Searchable) is basically "smartstore".
Customers using DDAS and DDAA only need to care about the raw size, not compression. Compression doesn't come into play with Splunk Cloud entitlements, so the old on-prem math doesn't apply.
Now, depending on what data he's actually looking at, he may or may not be seeing compressed buckets, but more often it comes down to understanding when buckets will actually roll due to the timestamps in the bucket.
Overall, in Splunk Cloud, all you care about is raw data size; you are not sizing disk in the cloud, you are sizing your subscription.
https://docs.splunk.com/Documentation/SplunkCloud/9.1.2308/Service/SplunkCloudservice
See the Storage section.