Is there a way to accurately measure data model acceleration disk usage via Splunk?

gjanders
SplunkTrust

My end-goal is to be able to measure the current data model acceleration size, preferably per-indexer but an overall measurement could work. I would like to do this without having to run du commands on each Linux indexer.

To achieve this goal I initially borrowed a query from FireBrigade which effectively does:

| rest /services/admin/summarization/tstats:DM_nmon_NMON_Config/details

This works for some data model accelerations but not all; for example, the query above does not work.

I also tried:

| rest "/services/admin/introspection--disk-objects--summaries?count=-1" | stats sum(total_size) by name | addcoltotals

And

| rest "/services/admin/introspection--disk-objects--indexes?count=-1" | stats sum(datamodel_summary_size) by name | addcoltotals

Of these, the | rest "/services/admin/introspection--disk-objects--summaries?count=-1" query is the closest: my total sizing for the nmon-related data models adds up to 710733 across the 5 indexers.

However, on the filesystem I find 815314MB of data, so this is only semi-accurate. I'm guessing it measures the size of the data without any OS overhead, and is therefore off by quite a bit.
Measuring the sizing per-indexer shows a similar result: it's a little off on each indexer, and therefore the total is not right.
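
To put that gap in perspective, here is a minimal Python sketch of the arithmetic. It assumes the REST total is in the same unit (MB) as the filesystem figure, which the post does not state explicitly, and the function name `shortfall` is purely illustrative.

```python
def shortfall(reported_mb, on_disk_mb):
    """Return (absolute gap in MB, gap as a percentage of the on-disk size)."""
    gap = on_disk_mb - reported_mb
    return gap, 100.0 * gap / on_disk_mb


# Figures quoted in the post; treating the REST total as MB is an assumption.
gap_mb, gap_pct = shortfall(reported_mb=710733, on_disk_mb=815314)
print(f"REST total is {gap_mb} MB ({gap_pct:.1f}%) below the filesystem figure")
```

Running the same calculation per indexer would show whether the shortfall is roughly constant, as the post suggests, or concentrated on particular hosts.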

However, the bigger issue is that the REST endpoint:
/services/admin/introspection--disk-objects--summaries?count=-1

only works for some of the data models, not all of them, so I cannot use it to measure all data models at the indexer level either.

According to:
http://docs.splunk.com/Documentation/Splunk/6.5.2/RESTREF/RESTknowledgeExamples

the following should work:
https://localhost:8089/services/admin/summarization/?by_tstats=1

However, it returns no data at all, so I cannot use that as a solution either!

Any idea how I can get an accurate measure of the data model acceleration sizing without measuring on disk?

I thought of using the _splunk_summaries volume via https://localhost:8089/services/admin/introspection--disk-objects--volumes but that doesn't show total_size so it doesn't help here...

Any ideas welcome, I'm running Splunk 6.5.2

1 Solution

gjanders
SplunkTrust

Several of these accuracy issues are fixed in Splunk 6.6.x; 6.5.x has some issues with the accuracy of reported data model acceleration sizes.

As per "About upgrading to 6.6 READ THIS FIRST":

Data model acceleration sizes on disk might appear to increase

If you have created and accelerated a custom data model, the size that Splunk software reports it as being on disk has increased.

When you upgrade, data model acceleration summary sizes can appear to increase by a factor of up to two to one. This apparent increase in disk usage is the result of a refactoring of how Splunk software calculates data model acceleration summary disk usage. The calculation that Splunk software performs in version 6.6 is more accurate than in previous versions.   

Note: I'm using a code sample because the block quote is very hard to read across multiple lines; the above is taken directly from the linked documentation.

In 6.5.x I'm going to check data model acceleration sizing at the OS level; however, I can accurately use dbinspect (within Splunk) or introspection data for index sizing.

gjanders
SplunkTrust

For anyone using a newer Splunk instance like Splunk 7 (and I think this will work in 6.5/6.6) I have now used the following successfully:

    index=_introspection component=summaries "data.name"=*Web
    | stats latest(data.total_size) AS size by data.search_head_guid, data.related_indexes_count, data.related_indexes, host
    | stats sum(size) AS size

The above queries the Web data model. I also tried:

 | `datamodel("Splunk_Audit", "Datamodel_Acceleration")` | `drop_dm_object_name("Datamodel_Acceleration")` 

To summarise, the query this eventually runs is closer to:

| rest /services/admin/summarization by_tstats=t splunk_server=local count=0 
| eval datamodel=replace('summary.id',(("DM_" . 'eai:acl.app') . "_"),"")
| search datamodel=Web
| stats sum(summary.size)

This comes up with a number that is much smaller than what I see on the disk of the actual indexers; it just didn't add up. I also tried:

| rest "/services/admin/introspection--disk-objects--summaries?count=-1"
| search name=*Web
| stats sum(total_size)

However, this didn't work well when multiple search head GUIDs were involved, as it appeared to return only one. I'm now using the _introspection index query, which appears to be the most accurate.
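
For readers who want to see what the _introspection search is doing, here is a hedged Python sketch of the same "latest value per group, then sum" logic. It is an illustration only, not how Splunk implements it: the `Sample` record, GUIDs, host names, and sizes are invented, and the grouping key is simplified to host, search head GUID, and summary name.

```python
from collections import namedtuple

# Hypothetical stand-in for one component=summaries introspection event.
Sample = namedtuple("Sample", "time host search_head_guid name total_size")


def total_latest_size(samples):
    """Keep the newest total_size per (host, search_head_guid, name), then sum."""
    latest = {}
    for s in samples:
        key = (s.host, s.search_head_guid, s.name)
        if key not in latest or s.time > latest[key].time:
            latest[key] = s
    return sum(s.total_size for s in latest.values())


samples = [
    Sample(100, "idx1", "guid-A", "DM_Splunk_SA_CIM_Web", 40.0),
    Sample(200, "idx1", "guid-A", "DM_Splunk_SA_CIM_Web", 45.0),  # newer sample wins
    Sample(200, "idx1", "guid-B", "DM_Splunk_SA_CIM_Web", 10.0),  # second search head
    Sample(150, "idx2", "guid-A", "DM_Splunk_SA_CIM_Web", 30.0),
]
print(total_latest_size(samples))
```

Keeping the search head GUID in the key is what stops summaries built by different search heads from overwriting each other, which is the failure described for the REST query.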

the_wolverine
Champion

One of Gareth's answers worked for me:

| rest "/services/admin/introspection--disk-objects--summaries?count=-1" | stats sum(total_size) by name | addcoltotals

sowings
Splunk Employee

It's been a while since I looked at this, but I do intend to include it in a Fire Brigade dashboard at some point. My recollection is that the summaries=true flag to the /cluster/master/buckets endpoint (on the CM or (D)MC only, of course) will show even "non-primary" copies of the summarized data for a model. I'll try to circle back to this later this week when I can restart my lab VMs to test.

EDIT: I fired up my VMs and had a quick look at the output. It only indicates the state of the bucket's summarized data. You'd have to cross reference this with output from the data model's /details endpoint to be able to come up with a final size. Not an easy search, certainly. It may be easier to simply count the number of peers with the summary bucket from the master's point of view, and then cross reference that list with the sizes, doing the math as appropriate.
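
The cross-referencing idea could be sketched roughly as below. This is a hypothetical Python illustration, not a tested Splunk search: both input mappings are invented stand-ins for data you would have to pull from the cluster master and from the /details endpoint, and it assumes every copy of a bucket's summary is the same size.

```python
def replicated_summary_size(peers_with_summary, summary_size_mb):
    """Multiply each bucket's summary size by the number of peers holding a copy."""
    total = 0.0
    for bucket_id, peers in peers_with_summary.items():
        size = summary_size_mb.get(bucket_id)
        if size is None:
            continue  # no size known for this bucket; skip rather than guess
        total += size * len(set(peers))
    return total


# Invented example data: bucket id -> peers that hold its summary.
peers_with_summary = {
    "web~12~GUID1": ["idx1", "idx2"],
    "web~13~GUID1": ["idx3"],
}
# Invented example data: bucket id -> size of one summary copy in MB.
summary_size_mb = {"web~12~GUID1": 5.0, "web~13~GUID1": 2.5}

print(replicated_summary_size(peers_with_summary, summary_size_mb))
```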

I don't yet have a search worked up to do what you want.

sowings
Splunk Employee

You're right, I had the flag wrong. I've edited the original post.

gjanders
SplunkTrust

| rest "/services/cluster/master/buckets?summaries=true&count=0" splunk_server=local | stats sum(bucket_size) by index

This returns a single index name, _audit, when run from the cluster master.

gjanders
SplunkTrust

FYI for now I'm doing:

# Total size in MB of everything under the Splunk data directory, minus excluded paths
x=$(du -ms /opt/splunk/var/lib/splunk/* | grep -vE "\.dat|\/<various directories I don't want to measure>" | grep -oE "^[0-9]+" | paste -sd+ | bc)
# Convert MB to GB (integer division) and store the result for reporting
echo $((x / 1024)) > /home/splunk/var/datamodelsizing.txt

I then read this file and report on it across the indexers. Because of the way I've structured my Splunk indexes config, I can now report my data model acceleration disk usage accurately.
However I'd still like a Splunk-like solution to the problem...
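
As a variation on the du approach, here is a minimal Python sketch that reports per-index usage instead of a single total. It assumes the default layout in which each index directory contains a `datamodel_summary` subdirectory; if your indexes.conf sets a different tstatsHomePath, the directory name and root will differ. It sums apparent file sizes, so it will not exactly match du, which counts allocated blocks.

```python
import os


def summary_usage_mb(root, dirname="datamodel_summary"):
    """Sum file sizes (MB) found under directories named `dirname`, per index."""
    usage = {}
    for index_name in sorted(os.listdir(root)):
        index_path = os.path.join(root, index_name)
        if not os.path.isdir(index_path):
            continue
        total = 0
        for current, _dirs, files in os.walk(index_path):
            # Only count files at or below a datamodel_summary directory.
            parts = os.path.relpath(current, index_path).split(os.sep)
            if dirname not in parts:
                continue
            for name in files:
                path = os.path.join(current, name)
                if not os.path.islink(path):
                    total += os.path.getsize(path)
        usage[index_name] = total / (1024 * 1024)
    return usage


# Root path taken from the du command above; adjust for your install.
root = "/opt/splunk/var/lib/splunk"
if os.path.isdir(root):
    for index_name, mb in summary_usage_mb(root).items():
        print(f"{index_name}\t{mb:.1f}")
```

The output could be written to a file and indexed in the same way as the du result.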

jwelch_splunk
Splunk Employee

| `datamodel("Splunk_Audit", "Datamodel_Acceleration")`
| `drop_dm_object_name("Datamodel_Acceleration")`
| join type=outer last_sid [| rest splunk_server=* count=0 /services/search/jobs reportSearch=summarize* | rename sid as last_sid | fields last_sid,runDuration]
| eval size(MB)=round(size/1048576,1)
| eval retention(days)=if(retention==0,"unlimited",retention/86400)
| eval complete(%)=round(complete*100,1)
| eval runDuration(s)=round(runDuration,1)
| sort 100 + datamodel
| fieldformat earliest=strftime(earliest, "%m/%d/%Y %H:%M:%S")
| fieldformat latest=strftime(latest, "%m/%d/%Y %H:%M:%S")
| fields datamodel,app,cron,retention(days),earliest,latest,is_inprogress,complete(%),size(MB),runDuration(s),last_error

Does this work?

rjthibod
Champion

I think you need to clarify where the datamodel "Splunk_Audit" comes from. It is not included by default in an install of 6.5.2.

jwelch_splunk
Splunk Employee

Shoot, yeah, you're right; it comes with Splunk_SA_CIM, I believe. Okay, let me keep trying.

What do you see if you go to Settings/Datamodels?

Are your DMs there? If so, this should show the total size on disk, but I need to find out how we get that.

gjanders
SplunkTrust

That actually runs:

| rest /services/admin/summarization by_tstats=t splunk_server=local count=0
| eval datamodel=replace('summary.id',"DM_".'eai:acl.app'."_","")
| join type=left datamodel [| rest /services/data/models splunk_server=local count=0 | table title acceleration.cron_schedule eai:digest | rename title as datamodel | rename acceleration.cron_schedule AS cron]
| table datamodel eai:acl.app summary.access_time summary.is_inprogress summary.size summary.latest_time summary.complete summary.buckets_size summary.buckets cron summary.last_error summary.time_range summary.id summary.mod_time eai:digest summary.earliest_time summary.last_sid summary.access_count
| rename summary.id AS summary_id, summary.time_range AS retention, summary.earliest_time as earliest, summary.latest_time as latest, eai:digest as digest
| rename summary.* AS *, eai:acl.* AS *
| sort datamodel
| rename access_count AS Datamodel_Acceleration.access_count access_time AS Datamodel_Acceleration.access_time app AS Datamodel_Acceleration.app buckets AS Datamodel_Acceleration.buckets buckets_size AS Datamodel_Acceleration.buckets_size cron AS Datamodel_Acceleration.cron complete AS Datamodel_Acceleration.complete datamodel AS Datamodel_Acceleration.datamodel digest AS Datamodel_Acceleration.digest earliest AS Datamodel_Acceleration.earliest is_inprogress AS Datamodel_Acceleration.is_inprogress last_error AS Datamodel_Acceleration.last_error last_sid AS Datamodel_Acceleration.last_sid latest AS Datamodel_Acceleration.latest mod_time AS Datamodel_Acceleration.mod_time retention AS Datamodel_Acceleration.retention size AS Datamodel_Acceleration.size summary_id AS Datamodel_Acceleration.summary_id
| rename "Datamodel_Acceleration.*" as *
| join type=outer last_sid [| rest splunk_server=* count=0 /services/search/jobs reportSearch=summarize* | rename sid as last_sid | fields last_sid,runDuration]
| eval size(MB)=round(size/1048576,1)
| eval retention(days)=if(retention==0,"unlimited",retention/86400)
| eval complete(%)=round(complete*100,1)
| eval runDuration(s)=round(runDuration,1)
| sort 100 + datamodel
| fields datamodel,app,cron,retention(days),earliest,latest,is_inprogress,complete(%),size(MB),runDuration(s),last_error

The main part is:

|rest /services/admin/summarization by_tstats=t splunk_server=local count=0

This totals 63721.8MB of data for the 3 nmon data models. I was thinking perhaps this ignores the replicated data models; on disk on 1 indexer I see 175664MB of data, so it's not even close 😞

jwelch_splunk
Splunk Employee

So are you saying you have indexer clustering, with summary_replication = true in server.conf on your CM, and believe that this search is not returning the totals for the replicated summaries?

Okie

gjanders
SplunkTrust

Correct and I'm not finding a way to get an accurate value from any of the measures...or a value for every data model...
