
What is the best approach for data models, and how do I preload data for known points of interest?

robertlynch2020
Motivator

Hi

I need advice on the best approach for data models and how to preload data for known points of interest.

I have developed a Splunk_ID system in Splunk that remembers the start time, end time, and host name in a lookup table.
So, for example, a user can store last week's activity as Splunk_ID=123. This makes it quicker to get back to your data.
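
A simplified sketch of the idea (the lookup file name and field names here are illustrative, not the real ones):

| makeresults
| eval SPLUNK_ID=123, host="myhost01", earliest_time=relative_time(now(), "-7d@d"), latest_time=now()
| outputlookup append=true splunk_id_lookup.csv

A dashboard can then read the stored window back with | inputlookup splunk_id_lookup.csv | search SPLUNK_ID=123 and feed the times and host into the base search tokens.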

However, loading 200 million lines through this complex tstats query into a dashboard's base search can take 30 seconds.
How can I get Splunk to have this ready in memory?

Can I get a saved search to run and feed its results into a base search?
Should I add a "root search dataset" to the data model to get the query ready in memory?

What I would like to do: I have over 1,000 SPLUNK_IDs, and I would like Splunk to keep the last 100 in memory, ready for use.
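
For instance, could I schedule a saved search (call it MXTIMING_WARM_CACHE; the name is just for illustration) that runs the big tstats for the most recent SPLUNK_IDs, and then have the dashboard's base search pick up its cached artifact with loadjob, something like:

| loadjob savedsearch="robert:my_app:MXTIMING_WARM_CACHE"
| search SPLUNK_ID=$splunk_id_token$

That would avoid re-running the 30-second query on every dashboard load, as long as the scheduled search has run recently enough.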

Below is the base search. It feeds about 15 drop-downs and visualizations.

| tstats summariesonly=$summariesonly_token$ max(MXTIMING.Elapsed) AS Elapsed max(MXTIMING.CPU) AS CPU max(MXTIMING.CPU_PER) AS CPU_PER values(MXTIMING.RDB_COM1) AS RDB_COM values(MXTIMING.RDB_COM_PER1) AS RDB_COM_PER max(MXTIMING.Memory) AS Memory max(MXTIMING.Elapsed_C) AS Elapsed_C values(source) AS source_MXTIMING avg(MXTIMING.Elapsed) AS average count(MXTIMING.Elapsed) AS count stdev(MXTIMING.Elapsed) AS stdev median(MXTIMING.Elapsed) AS median exactperc95(MXTIMING.Elapsed) AS perc95 exactperc99.5(MXTIMING.Elapsed) AS perc99.5 min(MXTIMING.Elapsed) AS min earliest(_time) AS start latest(_time) AS stop
FROM datamodel=$MXTIMING_DATASET$
WHERE host=$host_token$ AND MXTIMING.Elapsed > $MAX_TIME$
GROUPBY _time MXTIMING.Machine_Name MXTIMING.Context+Command MXTIMING.NPID MXTIMING.Date MXTIMING.Time MXTIMING.MXTIMING_TYPE_DM source MXTIMING.UserName2 MXTIMING.source_path MXTIMING.Command3 MXTIMING.Context3 span=1s
| rename MXTIMING.Context+Command as Context+Command 
| rename MXTIMING.NPID as NPID 
| rename MXTIMING.MXTIMING_TYPE_DM as TYPE 
| rename MXTIMING.Date as Date 
| rename MXTIMING.Time as Time 
| rename MXTIMING.Machine_Name as Machine_Name 
| rename MXTIMING.UserName2 as UserName
| rename MXTIMING.source_path  as source_path
| eval Date=strftime(strptime(Date,"%Y%m%d"),"%d/%m/%Y") 
| eval Time = Date." ".Time
| eval FULL_EVENT=Elapsed_C 
| eval FULL_EVENT=replace(FULL_EVENT,"\d+.\d+","FULL_EVENT") 
| join Machine_Name NPID type=$join_type_token$ 
[| tstats summariesonly=$summariesonly_token$ count(SERVICE.NPID) AS count2 values(source) AS source_SERVICES FROM datamodel=SERVICE_V5 WHERE ( host=$host_token$ earliest=$service_earliest_time$ latest=$service_latest_time$) AND SERVICE.NICKNAME IN ($NICKNAME_TOKEN$)
GROUPBY SERVICE.Machine_Name SERVICE.NICKNAME SERVICE.NPID 
| rename SERVICE.NPID AS NPID 
| rename SERVICE.NICKNAME AS NICKNAME 
| rename SERVICE.Machine_Name as Machine_Name 
| table NICKNAME NPID source_SERVICES Machine_Name ] 
| lookup MXTIMING_lookup_Base Context_Command AS "Context+Command" Type as "TYPE" OUTPUT Tags CC_Description Threshold Alert  
| appendpipe
[ | where isnull(Threshold)
| rename TYPE AS BACKUP_TYPE
| eval TYPE="*"     
| lookup MXTIMING_lookup_Base Context_Command AS "Context+Command" Type as "TYPE" OUTPUT Tags CC_Description Threshold Alert
| rename BACKUP_TYPE AS TYPE]
| dedup Time, NPID,Context+Command
| where Elapsed > Threshold OR isnull('Threshold')
| fillnull Tags 
| eval Tags=if(Tags=0,"PLEASE_ADD_TAG",Tags) 
| makemv Tags delim=","
| eval Tags=split(Tags,",") 
| search Tags IN ($TAG_TOKEN$) 
| sort Elapsed | eval source_SERVICES_count=mvcount(split(source_SERVICES, " ")) | eval NICKNAME=if(source_SERVICES_count > 1, "MULTIPLE_OPTIONS_FOUND",NICKNAME)

Thanks in advance
Robert Lynch


woodcock
Esteemed Legend

It very much looks to me like you are dealing with numbers, and that absolutely screams metrics store. I am not sure if you have kept up on this, but the benefits are well worth the pain of converting everything:

https://docs.splunk.com/Documentation/Splunk/latest/Metrics/GetStarted
https://www.slideshare.net/Splunk/splunklive-munich-2018-integrating-metrics-and-logs
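
To give a feel for it, here is a minimal sketch of what part of your tstats could become with mstats, assuming the timings were re-ingested into a metrics index (the index name, metric names, and dimension names below are made up for illustration):

| mstats max(mxtiming.elapsed) AS Elapsed avg(mxtiming.elapsed) AS average max(mxtiming.cpu) AS CPU WHERE index=mxtiming_metrics AND host=$host_token$ span=1s BY Machine_Name NPID

Queries like this read pre-summarized numeric data points directly, so they typically come back much faster than scanning raw events.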


robertlynch2020
Motivator

Hi

Thanks, I will look into this.

One of my issues is that I sometimes need string values to come back, not just numbers (stats), and I get the feeling metrics are numeric-only, so I am not 100% sure I will be able to use it. I will try it out, mark the answer as accepted for the moment, and report back.
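
From what I have read, string values can at least ride along as dimensions on each metric data point, so a hypothetical query like the one below might still give me back fields like Machine_Name and the context/command strings:

| mstats max(mxtiming.elapsed) AS Elapsed WHERE index=mxtiming_metrics span=1s BY Machine_Name Context_Command
| where Elapsed > 10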

Rob
