Join time from a CSV file, and an index summary

adale25
Engager

How do I join the time field with a different field name from a CSV lookup file, with the time field specified in an index summary?


Vijeta
Influencer

The DateLogged value from your lookup is a string, so you will have to convert it to epoch format using strptime before it can line up with _time from the summary index (_time is already stored internally as epoch time, so it doesn't need converting). Try the code below:

index=summary *avg_Total_Capacity*
| rename avg_Total_Capacity AS Total_GB
| rename avg_Used_GB AS Used_GB
| append
    [| inputlookup ServerName_G
     | fields - Disk
     | eval _time=strptime(DateLogged,"%m/%d/%Y")
     | rename DiskFreeSpaceMB as Free_GB
     | eval Free_GB=(Free_GB/1024)
     | rename DiskSizeMB as Total_GB
     | eval Total_GB=(Free_GB/PercentFree)*100
     | eval Used_GB=(Total_GB-Free_GB)]
| sort - _time
| apply "RFRCapacity_v2"
| timechart span=1d avg(Predict_UsedGB_Future) as Predict_UsedGB_Future
| eval residual = 'Used_GB' - 'predicted(Used_GB)'
| eval baseline=0
| eval Predict_UsedGB_Future='Predict_UsedGB_Future'

somesoni2
Revered Legend

You can rename the time field in the join subsearch so it matches the time field name in your summary index search.
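
For example, a minimal sketch of that pattern, using the lookup and field names from this thread (the day-granularity bin and the strptime format are assumptions based on the sample data posted later):

index=summary *avg_Total_Capacity*
| bin _time span=1d
| join type=left _time
    [| inputlookup ServerName_G
     | eval _time=strptime(DateLogged,"%m/%d/%Y")
     | fields _time DiskFreeSpaceMB DiskSizeMB PercentFree]

The subsearch converts the lookup's DateLogged string to epoch and exposes it as _time, so both sides of the join share the same key name and granularity.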

Joins are expensive and can often be replaced by alternative commands. Could you provide more details on what your current search is and what you're trying to accomplish, possibly with sample data/output?


adale25
Engager

Hey somesoni2,
I'm trying to build a predictive model using RandomForest in Splunk. I'm having issues with the inputlookup subsearch, so I'm trying to change it to use lookup instead and join the DateLogged field from the CSV with _time from the index summary, to see if the model will then work as expected and push the prediction into the future. Currently it's not doing that when I run the apply step on the model. If you have any suggestions for me, I'm glad to hear them! Thanks!

Fit ->

index=summary *avg_Total_Capacity*
| rename avg_Total_Capacity AS Total_GB
| rename avg_Used_GB AS Used_GB
| append
    [| inputlookup ServerName_G
     | fields - Disk
     | convert timeformat="%m/%d/%Y" mktime(DateLogged) as _time
     | rename DiskFreeSpaceMB as Free_GB
     | eval Free_GB=(Free_GB/1024)
     | rename DiskSizeMB as Total_GB
     | eval Total_GB=(Free_GB/PercentFree)*100
     | eval Used_GB=(Total_GB-Free_GB)]
| sort - _time
| timechart span=1d avg(Total_GB) AS Total_GB avg(Used_GB) AS Used_GB
| eval Total_GB=round(Total_GB,2)
| eval Used_GB=round(Used_GB,2)
| reverse
| streamstats window=60 current=f first(Used_GB) as Predict_UsedGB_Future
| reverse
| fit RandomForestRegressor "Predict_UsedGB_Future" from "_time" into "RFRCapacity_v2"
| eval DesiredCapacity_70PercUsed=(Predict_UsedGB_Future/.7)
| eval Space_To_Add_GB=(DesiredCapacity_70PercUsed-Total_GB)
| eval Predicted_PercUsed=(Predict_UsedGB_Future/Total_GB)*100

Apply ->

index=summary *avg_Total_Capacity*
| rename avg_Total_Capacity AS Total_GB
| rename avg_Used_GB AS Used_GB
| append
    [| inputlookup ServerName_G
     | fields - Disk
     | convert timeformat="%m/%d/%Y" mktime(DateLogged) as _time
     | rename DiskFreeSpaceMB as Free_GB
     | eval Free_GB=(Free_GB/1024)
     | rename DiskSizeMB as Total_GB
     | eval Total_GB=(Free_GB/PercentFree)*100
     | eval Used_GB=(Total_GB-Free_GB)]
| sort - _time
| apply "RFRCapacity_v2"
| timechart span=1d avg(Predict_UsedGB_Future) as Predict_UsedGB_Future
| eval residual = 'Used_GB' - 'predicted(Used_GB)'
| eval baseline=0
| eval Predict_UsedGB_Future='Predict_UsedGB_Future'

Vijeta
Influencer

What is the time/date format in your lookup table?


adale25
Engager

Format for the CSV file ->

    DateLogged
    5/9/2016
    5/9/2016
    5/10/2016
    5/10/2016
    5/11/2016

Format from the summary index ->

    _time
    2018-11-02
    2018-07-22
    2018-07-20
    2018-07-13
    2018-07-09
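
For reference, converting each of those into epoch time for matching would look something like this (lookup_epoch and summary_day are placeholder field names, not from the searches above):

| eval lookup_epoch=strptime(DateLogged,"%m/%d/%Y")
| eval summary_day=strftime(_time,"%Y-%m-%d")

strptime parses the lookup's date string into epoch seconds, while strftime goes the other way, rendering the summary index's epoch _time in the same %Y-%m-%d display shown above.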