
How do you maintain your "Data Dictionary" of what you're Splunk-ing?


This isn't so much a technical question about Splunk, but an organizational and process one.

We have a lot of users from a lot of different groups and internal business units using Splunk. They each use it for different reasons, and load in different types of data. So far our biggest users have been specialized groups.

As things continue to grow, we're finding a lot of scenarios where data sharing would be helpful. Centralized support groups are finding it useful to be able to view application log data, performance data, etc. from business servers and services.

I'd like to hear how others maintain a "Data Dictionary" of sorts for what they have in Splunk.

  • How do you keep track of what your users are using Splunk for?
  • When centralized support groups need insight into specialized data, how do you determine if the data is even in Splunk, and if so, where to start looking?

The very ideas behind your question betray a DB-style mindset that is at odds with the fundamental "flow" of Splunk, which in essence is: Splunk everything, and everyone will figure out what is important on their own. The "schematized" mindset says "we need to organize things so that people work efficiently" but the Splunk mindset says "let's just get it all in there and start asking questions".

The reason I say this is that, although I have been the Splunk admin at many large companies, we have never bothered to maintain such a thing. It has always been enough to make GOOD decisions on how to partition the datasets by index and sourcetype. If a good job is done by the Splunk admin and SMEs when the data is onboarded, then a simple |metadata .. OR index=* | fieldsummary and an examination of sourcetypes, hosts and fields generally allows people to figure out whether what they need is in Splunk or not.

The only time we do anything like what you are describing is when we are dealing with a compliance use case; those are very tricky and generally do not provide any broader usefulness to the organization. We have never kept track of what users are doing, but eventually we have performance issues and need to figure out what is hogging resources, so this is done on an ad-hoc, as-needed basis. When support needs insight, they almost always know what data source they need to examine, so we rely on them to define the need; it is then very easy to say whether it is in Splunk or not.
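For anyone who has not used them, the metadata and fieldsummary searches mentioned above might look roughly like the following. This is only a sketch: the `type=sourcetypes` choice, the 15-minute window, the `maxvals=5` limit and the output columns are illustrative assumptions, not the exact searches from the answer. The first search lists every sourcetype in the non-internal indexes along with its event count and when it was last seen:

```
| metadata type=sourcetypes index=*
| eval lastSeen=strftime(lastTime, "%Y-%m-%d %H:%M:%S")
| sort - totalCount
| table sourcetype totalCount lastSeen
```

The second summarizes which fields exist in a recent slice of data, which is usually enough to tell whether the thing you need is there:

```
index=* earliest=-15m
| fieldsummary maxvals=5
| table field count distinct_count values
```

Searching `index=*` can be expensive on a large deployment, so in practice you would probably narrow the second search to a specific index or sourcetype once the first one has pointed you in the right direction.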

Let me tell you a story that perhaps will show you what I mean about mindsets. When I first started working with Splunk, we had a very specific request for a dashboard. I onboarded the data, built the field extractions, created the dashboards and did a demo for the customer, who LOVED IT .... but it was just a little bit different from what he needed (even though it was exactly what he had asked for). So I reworked the dashboard and HE LOVED IT .... but he now realized that he needed it broken out by department/region, and then it would be perfect. Except it wasn't. I learned VERY quickly that, as the Splunk admin, I will onboard data and build field extractions, but for end users I only create search strings and nothing else. I create a very nice lake full of good fish and I let my users do their own fishing. If they need to know how to use a particular lure, I will educate them, but they do their own fishing. I have never regretted this approach.

One more thing. I don't think this answer and my approach will work unless you have a very global/open approach to your Splunk data. As much as is possible, everybody should be able to access everything. This provides natural opportunities for synergy and safety/security through non-obscurity/transparency. If you cannot Splunk like this, then you will definitely need somebody who knows about and can access everything so that privileges and roles can be modified to access appropriate data as needed. This kind of defeats what I think Splunk really is so I try to avoid such situations.
