How would you explain the concept of a Splunk Data Model to, say, your mother?
While thinking of this question, I thought of the popular Reddit forum called ELI5 (Explain Like I'm 5).
I am asking because I am looking for a good way to explain the concept to an audience with multiple levels of technical understanding.
So, how would you explain Splunk Data Models?
let me have a go 🙂
Creating neat, informative summaries out of huge lists of raw data is a common challenge. Fortunately, many software programs today do it with ease, and one such representation is the pivot table. You might have used pivot tables in Microsoft Excel. Pivot tables quickly condense long lists of data into a neat summary for the end user, hiding all the formulae and complex calculations underneath. Moreover, the end user can drag, drop, and rearrange such pivot tables to customise their reports.
In today's systems, data is often too complex for a straightforward pivot summary. Before creating a pivot, you need to extract and enrich the raw data to make it useful, and write all the formulae and complex calculations. This stage is called data modelling, and the output is called a data model. A data model can contain several datasets, and each pivot is built on one of them. A data model can also have child elements, preserving any hierarchy you require. Splunk additionally lets you speed up searches against a data model using a feature called data model acceleration, which is tremendously useful for organisations with huge amounts of data.
In summary: the end user needs drag-and-drop reporting over a very complex set of data, and that in turn requires data models.
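For the more technical people in the audience, the raw data, then data model, then pivot flow described above can be sketched in a few lines of Python. This is only an analogy, not how Splunk implements it: the log lines, the field names (`status`, `uri`, `outcome`) and the helpers `to_fields` and `pivot_count` are all made up for illustration.

```python
import re
from collections import Counter

# Hypothetical raw events, standing in for lines in a log file.
RAW_EVENTS = [
    "2014-10-01 10:00:01 GET /cart status=200 bytes=512",
    "2014-10-01 10:00:02 GET /cart status=500 bytes=0",
    "2014-10-01 10:00:03 POST /checkout status=200 bytes=1024",
    "2014-10-01 10:00:04 GET /home status=200 bytes=2048",
]

# The "data model" in this analogy: one place that knows how to turn
# a raw line into named fields, so report builders never see the regex.
EVENT_PATTERN = re.compile(
    r"(?P<date>\S+) (?P<time>\S+) (?P<method>\S+) (?P<uri>\S+) "
    r"status=(?P<status>\d+) bytes=(?P<bytes>\d+)"
)


def to_fields(line):
    """Extract named fields from one raw line and enrich them."""
    match = EVENT_PATTERN.match(line)
    if match is None:
        return None  # the line does not fit the model; skip it
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    fields["bytes"] = int(fields["bytes"])
    # Enrichment: a calculated field derived from an extracted one.
    fields["outcome"] = "error" if fields["status"] >= 500 else "ok"
    return fields


def pivot_count(events, row_field):
    """The "pivot": count events, split into rows by one chosen field."""
    return Counter(event[row_field] for event in events)


modelled = [f for f in map(to_fields, RAW_EVENTS) if f is not None]
by_outcome = pivot_count(modelled, "outcome")
by_uri = pivot_count(modelled, "uri")
print(dict(by_outcome))
print(dict(by_uri))
```

The point of the split is that the person calling `pivot_count` only picks a field name, which is roughly what dragging a field into a pivot does, while all the extraction logic lives in one place.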
I just found a really awesome presentation on this from Conf 2014
https://conf.splunk.com/session/2014/conf2014_DavidClawson_Splunk_WhatsNew.mp4
I like it!
Without Data Model
Programmers: Write their own programs to analyze the data.
Splunker: Writes Splunk searches to populate fancy reports and dashboards.
Managers: Don't bother to learn Splunk search, but would like to impress the boss with fancy report charts, so they usually ask a Splunker for help.
Directors/CEO: Just want to see the result (the report). They don't care how it is generated.
With Data Model
1. Managers ask the Splunker for an easy way to create fancy charts.
2. The Splunker creates a Splunk Data Model, which defines all the field names from the raw data.
=> Managers stop asking for reports or Splunk searches.
3. Managers use the Pivot feature (which becomes available through the Data Model) and just drag and drop, and/or select functions and fields from drop-down boxes, to create fancy charts. No need to learn Splunk search commands.
=> Managers can impress their bosses.
4. Directors/CEO feel that reports are arriving much quicker than before.
A bonus of the Data Model is search speed: with the Data Model Acceleration feature, searches can run up to 100 times faster.
Note: Programmers are still happy analyzing data their own way 🙂
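If someone asks why acceleration makes things faster, the general idea is "summarise once, then answer reports from the summary instead of re-reading every raw event". Here is a rough Python sketch of that idea. It is an analogy only: the events, the field names and the `AcceleratedSummary` class are invented for illustration and say nothing about how Splunk stores its summaries.

```python
from collections import Counter

# Hypothetical events that have already been through the data model.
EVENTS = [
    {"host": "web01", "outcome": "ok"},
    {"host": "web01", "outcome": "error"},
    {"host": "web02", "outcome": "ok"},
    {"host": "web02", "outcome": "ok"},
]


def count_by_scanning(events, field):
    """Unaccelerated: walk every event each time a report is requested."""
    return Counter(event[field] for event in events)


class AcceleratedSummary:
    """Accelerated (analogy): pay the scanning cost once, up front."""

    def __init__(self, events, fields):
        # Pre-compute the counts for every field the reports will split by.
        self._counts = {
            field: Counter(event[field] for event in events)
            for field in fields
        }

    def count_by(self, field):
        # A report is now a lookup, no matter how many events there were.
        return self._counts[field]


summary = AcceleratedSummary(EVENTS, ["host", "outcome"])
slow = count_by_scanning(EVENTS, "outcome")
fast = summary.count_by("outcome")
print(slow == fast)
```

Both paths give the same answer; the difference is only when the work of reading the events is done.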
Splunk defines a data model as:
-- A hierarchically structured, search-time mapping of semantic knowledge about one or more datasets that encodes the domain knowledge necessary to generate specialized searches of those datasets. Splunk Enterprise uses these specialized searches to generate reports and charts for pivot users.
Right, thanks.
But I can't really use that explanation for a new user, or a student, or my mother.
Is there a simpler way of explaining it?