The data on how the CustIDs develop over time is read from a lookup (new_custID.csv), which is maintained by the following search:
| search index=myindex sourcetype=mysourcetype | stats earliest(_time) as first_seen latest(_time) as last_seen by CustID | inputlookup append=t new_custID.csv | stats min(first_seen) as first_seen max(last_seen) as last_seen by CustID | outputlookup new_custID.csv
My goal is to build dashboards that show the predicted amount of data, depending on the connection projects, so that I get an overview of the amount of data ingested into Splunk (licensing costs). I would like to link these new findings to my existing query:
| search index=myindex sourcetype=mysourcetype | eval esize=len(_raw) | stats sum(esize) as Volume_of_Data
This raises the following questions:
1) How can I automatically recognize from the lookup which specific CustIDs are in operation or have been newly added after a certain period of time (for example, evaluated monthly)? (Extrapolation)
2) A selection list should let the user pick one of the points in time recorded in the lookup above.
After a point in time has been selected, the predicted amount of data for that point in time should be displayed.
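For question 1, a minimal sketch of how the lookup could be summarized per month. It assumes that new_custID.csv holds first_seen as epoch time (as written by earliest(_time)); the field names month, new_custids and total_custids are only illustrative and do not exist in the lookup:

```spl
| inputlookup new_custID.csv
| eval month=strftime(first_seen, "%Y-%m")
| stats dc(CustID) as new_custids by month
| sort 0 month
| streamstats sum(new_custids) as total_custids
```

new_custids would then be the number of CustIDs first seen in each month, and total_custids the running total, which could serve as the basis for an extrapolation.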
If it follows a linear trend, then this should be really easy to do. Check out my CONF talk, which is related to this and shows how to build a model in the MLTK. I can't help with your use case until I see a line chart of the growth.
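As a starting point, something like the following might produce that line chart and a rough forecast. This is only a sketch: it uses the built-in predict command rather than an MLTK model, and the monthly span, the LLT algorithm, the 6-period horizon and the field name predicted_volume are all assumptions to adjust:

```spl
index=myindex sourcetype=mysourcetype
| eval esize=len(_raw)
| timechart span=1mon sum(esize) as Volume_of_Data
| predict Volume_of_Data as predicted_volume algorithm=LLT future_timespan=6
```

Rendered as a line chart, this shows the monthly volume next to the forecast, which should make it clear whether the growth really is close to linear.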