My apologies for such a noob question. I literally got dropped into a Splunk environment and I know little to nothing about it.
I have an index (foo as an example) and I'm told it's based on Oracle audit logs. However, the index was built for us by the Admin, and all I get is blank looks when I ask what exactly is IN the index.
So my question is...how can I interrogate the index to find out what is in it?
I ran across these commands:
| metadata type=sourcetypes index="foo"
| metadata type=hosts index="foo"
This is a start, so now I have some sourcetype "keywords" (is that right?) and I can see some hosts. But I suspect that's just the tip of the iceberg, as it were, given the index itself is pretty darn big.
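For reference, a minimal sketch of making the metadata output easier to read. It assumes the index really is named foo and relies on the totalCount, firstTime and lastTime columns that the metadata command returns (the times come back as epoch seconds, so they are converted here):

```spl
| metadata type=sourcetypes index=foo
| eval firstTime=strftime(firstTime, "%Y-%m-%d %H:%M:%S"), lastTime=strftime(lastTime, "%Y-%m-%d %H:%M:%S")
| table sourcetype totalCount firstTime lastTime
| sort - totalCount
```

This should list each sourcetype with its event count and the time range it covers, largest first.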
I'm an Oracle guy and if I wanted to get familiar w/ an Oracle structure I would start w/ looking at the table structures, note the fields in all the tables, get a diagram if one was available. I don't have that option here. I don't have the rights to "manage" the index or even create my own.
So I have an index and no real clue as to what is in it...
I would say the first thing to look at is what the different sourcetypes in the index are:
index=foo | stats count by sourcetype
That will give you some idea of what is being ingested into the index you have.
If the sourcetype names indicate what kind of logs they hold, you can then look at the sources:
index=foo | stats count by sourcetype,source
This would give you an idea of what is in the index.
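Once the sourcetypes are known, looking at a few raw events from each one can help too. A hedged sketch, assuming the index is named foo and that the last 24 hours contain data for every sourcetype (the time range and the sample size of 3 are arbitrary choices):

```spl
index=foo earliest=-24h
| dedup 3 sourcetype
| table _time host sourcetype source _raw
```

Here dedup 3 sourcetype keeps at most three events per sourcetype, so the result stays small even if the index is large.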
TY 4 that...when I run that first command it returns just north of 2.5 million events and 17 statistics. So I see bandwidth, cpu, df, df_metric, exec, interfaces, iostat, lsof, netstat, openPorts, package, protocol, ps, top, uptime, vmstat, and who.
For all of these, the sourcetype = source with one exception. Exec is broken out to 3 .sh files in a splunkforwarder folder structure.
I do not know if this is correct or not. For instance, I discovered there is a Fields link within Settings and I can get to Field Aliases, trim the list to "oracle" and I see stuff reporting from Oracle Audit, Oracle Database, Oracle Listener, Oracle Instance, Oracle Session, Oracle SysPerf, etc...
My understanding is the Splunk Index (this is a file?) is used by Splunk in searching for Keywords (are these fields?). Thus, if the index contains ONLY the source / sourcetype information, then I'm gold and I simply need to define what those 17 stats are actually from/ for. However, I also know that cannot be true as I can search on a Host=<something> which is not in that list.
I do hope that makes sense.
I agree with @sjringo but offer this faster query to find the information
| tstats count where index=foo by sourcetype
Splunk doesn't store data in tables so there's no equivalent to a SQL table dump. You can use the fieldsummary command to see what fields are in the index along with their values.
index = foo | fieldsummary
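On a large index it may be more practical to summarize one sourcetype at a time. A minimal sketch, assuming the index is named foo; the sourcetype vmstat is just one of those listed earlier in the thread, and the 24-hour window and maxvals=5 are arbitrary choices to keep the search cheap:

```spl
index=foo sourcetype=vmstat earliest=-24h
| fieldsummary maxvals=5
| table field count distinct_count values
```

The count column shows how many events contain each field, and values shows up to five sample values per field.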
This is much more helpful. Running:
index=<name> | fieldsummary
Gives me 2.4 million+ events and 261 Statistics. I presume then the 261 would be the sum total of disparate fields available to any of my queries. Should that be true then I need only investigate each one to see what the heck they are and figure out if they are of any use.
Not a small task, but I know more now than I did 30 minutes ago. 😕
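To make that task smaller, the fields can be grouped by the sourcetype they appear in. A rough sketch, assuming the index is named foo and that a 24-hour sample is representative (distinct_values is just a name chosen here for the intermediate column):

```spl
index=foo earliest=-24h
| stats dc(*) as * by sourcetype
| untable sourcetype field distinct_values
| where distinct_values > 0
| stats values(field) as fields by sourcetype
```

The idea is that stats dc(*) counts distinct values of every field per sourcetype, untable turns that wide result into one row per sourcetype and field, and the final stats lists only the fields that actually have values for each sourcetype.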