Yes, you can think of a data model as a way of creating a database-like schema on top of your existing Splunk data. To build a data model, you'll first need to do all of the normal things you do when you add data to Splunk, like configuring line breaking and extracting fields. Once those steps are complete, you can build the data model. The data model itself is just a JSON document that defines the searches and fields that make up the model.
With a data model, you can use the Pivot UI to interactively build charts and dashboards without writing Splunk queries, so your users don't have to know the Splunk query language. This is one reason to create a data model: to provide a well-defined data set that your users can analyze without knowing much about writing Splunk queries.
Another reason to create a data model is to enable acceleration and improve search performance. If you frequently report on the same fields within the same data set, you can create a data model and enable data model acceleration. Splunk will then run search jobs in the background to build a separate summary of the data covered by your data model, and searches against the model can read from that summary instead of the raw events. Data model acceleration has some advantages over other methods of accelerating searches, such as summary indexing or report acceleration. You can review the docs to find more information about enabling acceleration. It's optional; data models aren't accelerated by default.
If you haven't gone through it yet, try out the pivot tutorial in Splunk. And if you have some specific questions about data models, feel free to ask.
Thank you very much for answering and for your help!
So if my query starts with: tstats summaries=T from datamodel=blah then it is performing the search from accelerated data within the datamodel only?
Also, if you were trying to replicate the search in a different search processing language, is there an equivalent to datamodels? What I mean by that is: filtering is standard to all search languages but I've yet to encounter datamodels, or maybe it's because I'm not understanding them right. I see them as sort of a filter, like: datamodel=blah filters to only look in the set of data contained within the datamodel "blah". So you won't see all your data, just the data within the datamodel "blah"? And how does the data get into the datamodel "blah"? Do I have the description for datamodel right?
Thank you for your help
I think you are referring to the summariesonly=T flag. That flag tells tstats to only use accelerated data to perform the search.
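To make that concrete, here is a minimal sketch of the search with the corrected flag. The model name blah is the placeholder from your question, and count is only an example aggregate, so substitute whatever you actually report on.

```spl
| tstats summariesonly=true count from datamodel=blah
```

If you change the flag to summariesonly=false, which is the default, tstats also searches the events that have not been summarized yet. That is slower, but it covers data the acceleration jobs have not caught up with.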
I think SQL views are a good comparison. Data models are like a view in the sense that a view abstracts away the underlying tables and columns in a SQL database. In Splunk, a data model abstracts away the underlying Splunk queries and field extractions that make up the data model. And as with data models, you can accelerate a view. In Splunk, you enable data model acceleration. In SQL, you accelerate a view by creating indexes. Keep in mind that this is a very loose comparison, but the concepts are similar.
Sorry but I don't have any experience with SQL. How I understand them is I see them as sort of a filter, like: datamodel=blah filters to only look in the set of data contained within the datamodel "blah". So you won't see all your data, just the data within the datamodel "blah"?
Is that an accurate description? Also, if that is accurate, how does the data get into the datamodel "blah"?
Thank you again!
Yes, that's accurate. There are other ways of creating predefined filters in Splunk, like macros and event types. So a data model does do filtering, but it does a lot more than that as well. But yes, when you search a data model, you only get back data that matches the model.
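The model itself is just a JSON document that defines the searches and fields that make up the model. Data doesn't get loaded into the model; the searches defined in the model select the matching events from your existing indexes whenever the model is searched. You define the model from within the Splunk UI. The links I added above walk you through defining a model, so take a look at those if you want the details.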
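If you want to see this for yourself, the datamodel command is one way to inspect a model from the search bar. This is a rough sketch: blah is the model name from your question, and root_events is a made-up dataset name, so use a dataset that actually exists in your model. Run with just the model name, the command returns the JSON definition:

```spl
| datamodel blah
```

Adding a dataset name and the search keyword runs the search that the dataset defines, so you only get back events that match it:

```spl
| datamodel blah root_events search
| head 10
```

Comparing the second search against a plain search of the same index is a quick way to confirm that the model is acting as a filter over data you already have.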