Hello Splunk Community,
I am very new to Splunk and was given the following task and could really use some help:
To gather all data that Splunk is collecting and put it in a visually readable format for executives.
I have tried many things to accomplish this, such as using Enterprise Security > Audit > Index Audit and Forwarder Audit, creating custom classic dashboards, and using Dashboard Studio to play around with the data. Nothing seems to give me what I need.
I have also tried the following:
| tstats values(source) as sources, values(sourcetype) as sourcetype where index=* by host
| lookup dnslookup clienthost as host OUTPUT clientip as src_ip
This method is very resource intensive. It gives me the information I need, but the source and sourcetype lists are incredibly long, which makes the table hard for executives to read. Is there another way to do this?
As others already stated, it's a rather vague requirement.
"Gather all data and present it in a readable format" at first glance reads to me as "print all raw events Splunk is receiving", which is kinda impossible for a human to read and a bit pointless too.
If you want an aggregate that gives insight into what _kinds_ of data Splunk is getting and _where from_, you'll have to be a bit creative. As you already noticed, if you simply do an overall tstats split by source, sourcetype and host, you'll get a load of results, but they won't make much sense. You need to do some "half-manual" filtering, like aggregating file sources by path or even aggregating overall by sourcetype.
How much of it you have to do will vary depending on your actual data.
In some cases you can simply do some tweaking with SPL, maybe matching some sources against a regex, maybe just aggregating all sources or all hosts by sourcetype... In smaller cases you might get away with exporting the results to CSV and doing a bit of manual tweaking in Excel to get reasonable results.
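As a rough sketch of the "aggregating file sources by path" idea, the search below collapses the file name at the end of each source path, so that rotated or per-day files roll up into one row per directory. It assumes Unix-style paths with forward slashes. The field names source_group, event_count and distinct_sources are only illustrative, and the regex would need adjusting for Windows paths or for sources that are not files.

```spl
| tstats count where index=* by index, sourcetype, source
| eval source_group=replace(source, "/[^/]+$", "/*")
| stats sum(count) AS event_count dc(source) AS distinct_sources by index, sourcetype, source_group
| sort - event_count
```

Sources without a slash (network inputs, for example) pass through unchanged, so they still show up as their own rows.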
Hi @Cyber_Shinigami ,
in Splunk, the main question is: what do you want to display?
Do you want a list of sourcetypes or a list of hosts?
I suppose (but it's only my idea) that an executive is mainly interested in the main kinds of data indexed, so I'd display some grouped information, like the number of distinct hosts and the event count for the top sourcetypes:
| tstats dc(host) AS host_count count where index=* by sourcetype
| sort -count
| head 10
You could also add a lookup that translates the sourcetypes into more comprehensible descriptions, e.g. cp_log -> "CheckPoint Logs" or fgt_logs -> "Fortinet Logs".
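A minimal sketch of that lookup idea, assuming you have uploaded a CSV lookup file called sourcetype_descriptions.csv with two columns, sourcetype and description (the file name and column names are hypothetical, you would create them yourself). Sourcetypes that have no entry in the lookup fall back to their raw name:

```spl
| tstats dc(host) AS host_count count AS event_count where index=* by sourcetype
| lookup sourcetype_descriptions.csv sourcetype OUTPUT description
| eval description=coalesce(description, sourcetype)
| sort - event_count
| head 10
| table description host_count event_count
```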
Ciao.
Giuseppe
I question the requirement on a few levels.
First, "gather all data" is a huge task. Presumably, your Splunk environment has ingested multiple terabytes of data over time. Gathering it all is impractical.
Second, "visually readable format". It's not only somewhat redundant, but also very vague. How should the data be presented? A text dump of every event ever received by Splunk would comply with the requirement, but probably would not be well received by executives.
Third, this sounds like a typical management directive where those asking don't know what they want.
Push back and ask for more information. What problem are they trying to solve? Do executives really care about (or even understand) indexes and sourcetypes? They probably don't and are more interested in high-level metrics like storage cost trends or number of incidents detected.
I totally understand where you are coming from and what you are saying.
Alas, I think management is currently trying to understand what Splunk is collecting so that we can better understand what Splunk might be missing (for example, when someone stands up a server and doesn't tell anyone). To test the SPL queries I've been attempting, I have limited them to shorter time ranges (such as the last 30 minutes or 24 hours).
That is why I have been focused on organizing the data by Host, Sourcetype, Source, and Index, so that I can capture everything, though I understand the resource intensity associated with it. Additionally, I created a Dashboard Studio dashboard that shows each of the data points listed above in its own tab. It still shows everything, just not in a single view or table.
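For the "server nobody told us about" concern, one possible approach is to compare the hosts Splunk has seen against an inventory list. The sketch below assumes a lookup file named asset_inventory.csv with a host column, which is hypothetical and would have to come from your CMDB or similar. The flag fields in_splunk and in_inventory are illustrative names. It returns hosts that appear on only one side within the selected time range:

```spl
| tstats count where index=* by host
| eval host=lower(host), in_splunk=1
| append [| inputlookup asset_inventory.csv | eval host=lower(host), in_inventory=1 | fields host in_inventory]
| stats max(in_splunk) AS in_splunk max(in_inventory) AS in_inventory by host
| fillnull value=0 in_splunk in_inventory
| where in_splunk=0 OR in_inventory=0
```

Rows with in_splunk=0 are inventory hosts that sent no events, and rows with in_inventory=0 are hosts sending data that the inventory doesn't know about. Host naming differences (FQDN vs. short name) would likely need normalizing first.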