What is the best approach for data models, and how can I preload data for known points of interest?

robertlynch2020
Motivator

Hi

I need advice on the best approach for data models and how to preload data for known points of interest.

I have developed a Splunk_ID system in Splunk that remembers the start time, end time, and host name in a lookup table.
So, for example, a user can store last week's activity as Splunk_ID=123. This makes it quicker to get back to your data.
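(For reference, the lookup just holds rows written with something along these lines; the lookup name splunk_id_lookup and the field names are illustrative:)

| makeresults
| eval splunk_id=123, start=relative_time(now(),"-7d"), stop=now(), host="my_host"
| outputlookup append=true splunk_id_lookup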

However, this can take around 30 seconds to load: 200 million lines from a complex tstats query feeding the base search of a dashboard.
How can I get Splunk to have this data ready in memory?

Can I get a saved search to run ahead of time and feed its results into a base search?
Should I add a "root search" dataset to the data model to get the query ready in memory?
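For example, would something like this work as the dashboard's base search? (A sketch only, assuming the heavy tstats query below is saved as a scheduled report with the hypothetical name MXTIMING_WARM_123, owned by my user in the search app:)

| loadjob savedsearch="robertlynch2020:search:MXTIMING_WARM_123"

That way the dashboard would read the most recent cached job artifact instead of re-running the tstats on every load.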

What I would like to do: I have over 1,000 SPLUNK_IDs, and I would like Splunk to keep the last 100 in memory, ready for use (one possible warming scheme is sketched after the base search below).

Below is the base search. It feeds about 15 drop-downs and visualizations.

| tstats summariesonly=$summariesonly_token$
    max(MXTIMING.Elapsed) AS Elapsed
    max(MXTIMING.CPU) AS CPU
    max(MXTIMING.CPU_PER) AS CPU_PER
    values(MXTIMING.RDB_COM1) AS RDB_COM
    values(MXTIMING.RDB_COM_PER1) AS RDB_COM_PER
    max(MXTIMING.Memory) AS Memory
    max(MXTIMING.Elapsed_C) AS Elapsed_C
    values(source) AS source_MXTIMING
    avg(MXTIMING.Elapsed) AS average
    count(MXTIMING.Elapsed) AS count
    stdev(MXTIMING.Elapsed) AS stdev
    median(MXTIMING.Elapsed) AS median
    exactperc95(MXTIMING.Elapsed) AS perc95
    exactperc99.5(MXTIMING.Elapsed) AS perc99.5
    min(MXTIMING.Elapsed) AS min
    earliest(_time) AS start
    latest(_time) AS stop
    FROM datamodel=$MXTIMING_DATASET$
    WHERE host=$host_token$ AND MXTIMING.Elapsed > $MAX_TIME$
    GROUPBY _time MXTIMING.Machine_Name MXTIMING.Context+Command MXTIMING.NPID MXTIMING.Date MXTIMING.Time MXTIMING.MXTIMING_TYPE_DM source MXTIMING.UserName2 MXTIMING.source_path MXTIMING.Command3 MXTIMING.Context3 span=1s
| rename MXTIMING.Context+Command as Context+Command 
| rename MXTIMING.NPID as NPID 
| rename MXTIMING.MXTIMING_TYPE_DM as TYPE 
| rename MXTIMING.Date as Date 
| rename MXTIMING.Time as Time 
| rename MXTIMING.Machine_Name as Machine_Name 
| rename MXTIMING.UserName2 as UserName
| rename MXTIMING.source_path  as source_path
| eval Date=strftime(strptime(Date,"%Y%m%d"),"%d/%m/%Y") 
| eval Time = Date." ".Time
| eval FULL_EVENT=Elapsed_C 
| eval FULL_EVENT=replace(FULL_EVENT,"\d+\.\d+","FULL_EVENT") 
| join type=$join_type_token$ Machine_Name NPID 
[| tstats summariesonly=$summariesonly_token$ count(SERVICE.NPID) AS count2 values(source) AS source_SERVICES
    FROM datamodel=SERVICE_V5
    WHERE (host=$host_token$ earliest=$service_earliest_time$ latest=$service_latest_time$) AND SERVICE.NICKNAME IN ($NICKNAME_TOKEN$)
    GROUPBY SERVICE.Machine_Name SERVICE.NICKNAME SERVICE.NPID 
| rename SERVICE.NPID AS NPID 
| rename SERVICE.NICKNAME AS NICKNAME 
| rename SERVICE.Machine_Name as Machine_Name 
| table NICKNAME NPID source_SERVICES Machine_Name ] 
| lookup MXTIMING_lookup_Base Context_Command AS "Context+Command" Type as "TYPE" OUTPUT Tags CC_Description Threshold Alert  
| appendpipe
[ | where isnull(Threshold)
| rename TYPE AS BACKUP_TYPE
| eval TYPE="*"     
| lookup MXTIMING_lookup_Base Context_Command AS "Context+Command" Type as "TYPE" OUTPUT Tags CC_Description Threshold Alert
| rename BACKUP_TYPE AS TYPE]
| dedup Time, NPID, Context+Command
| where Elapsed > Threshold OR isnull('Threshold')
| fillnull Tags 
| eval Tags=if(Tags=0,"PLEASE_ADD_TAG",Tags) 
| makemv Tags delim=","
| eval Tags=split(Tags,",") 
| search Tags IN ($TAG_TOKEN$) 
| sort Elapsed
| eval source_SERVICES_count=mvcount(split(source_SERVICES, " "))
| eval NICKNAME=if(source_SERVICES_count > 1, "MULTIPLE_OPTIONS_FOUND", NICKNAME)
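One warming scheme I am considering (a sketch only; the summary index summary_mxtiming and the splunk_id marker are made up): have a scheduled copy of the search above append its pre-aggregated rows to a summary index, tagged with the SPLUNK_ID:

... | collect index=summary_mxtiming marker="splunk_id=123"

The dashboard base search for a stored ID would then be a cheap scan of the summary:

index=summary_mxtiming splunk_id=123

One scheduled search per recent SPLUNK_ID (or a single search driven by the last 100 rows of the lookup) would keep those rows fresh.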

Thanks in advance
Robert Lynch


woodcock
Esteemed Legend

It very much looks to me like you are dealing with numbers, and that absolutely screams metrics store. I am not sure if you have kept up on this, but the benefits are well worth the pain of converting everything:

https://docs.splunk.com/Documentation/Splunk/latest/Metrics/GetStarted
https://www.slideshare.net/Splunk/splunklive-munich-2018-integrating-metrics-and-logs
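For example, once the MXTIMING numbers are in a metrics index, the heavy aggregations collapse into something like this (a rough sketch; the metric name mxtiming.elapsed and the index mxtiming_metrics are made up for illustration):

| mstats max(mxtiming.elapsed) AS Elapsed avg(mxtiming.elapsed) AS average WHERE index=mxtiming_metrics AND host=$host_token$ BY Machine_Name NPID span=1s

mstats reads pre-indexed numeric data points instead of scanning raw events, which is where the speed comes from.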


robertlynch2020
Motivator

Hi

Thanks, I will look into this.

One of my issues is that I sometimes need the actual values (strings), not just numbers (stats), to come back, so I am not 100% sure I will be able to use it; I get the feeling metrics are numbers-only. I will try it out and get back.
I will mark the answer as accepted for the moment and report back.
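From what I can see in the docs, strings can still come back as dimensions in the BY clause even though the measures themselves must be numeric, so maybe something like this would cover part of it (a sketch, reusing the made-up names from above):

| mstats max(mxtiming.elapsed) AS Elapsed WHERE index=mxtiming_metrics BY Machine_Name UserName span=1s

i.e. the numeric field becomes the measure, and string fields like UserName ride along as dimensions.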

Rob
