Splunk Search

Search usage of Indexes

TheEggi98
Path Finder

Hi there

im currently at a search to get the usage of Indexes, so i have an overview which indexes gets used in searches and which indexes doesnt so i can speak with the usecase owner if the data is still needed and why it doesnt get used.

Thats the current state of the search:

 

| rest "/services/data/indexes"
| table title totalEventCount frozenTimePeriodInSecs
| dedup title
| append 
    [search index=_audit sourcetype="audittrail" search_id="*" action=search earliest=-24h latest=now

``` Regex Extraction ```
| rex field=search max_match=0 "index\=\s*\"?(?<used_index>\S+)\"?"
| rex field=search max_match=0 "\`(?<used_macro>\S+)\`"
| rex field=search max_match=0 "eventtype\=\s*(?<used_evttype>\S+)"

``` Eventtype resolving ```
| mvexpand used_evttype
| join type=left used_evttype 
    [| rest "/services/saved/eventtypes"
| table title search
    | stats values(search) as search by title
| rename search as resolved_eventtype, title as used_evttype]
| rex field=resolved_eventtype max_match=0 "eventtype\=\s*(?<nested_eventtype>\S+)"
| mvexpand nested_eventtype
| join type=left nested_eventtype 
    [| rest "/services/saved/eventtypes"
| table title search
| stats values(search) as search by title
| rename search as resolved_nested_eventtype, title as nested_eventtype]

``` Macro resolving ```
| mvexpand used_macro
| join type=left used_macro 
    [| rest "/servicesNS/-/-/admin/macros" count=0
| table title definition
| stats values(definition) as definition by title
| rename definition as resolved_macro, title as used_macro]
| rex field=resolved_macro max_match=0 "\`(?<nested_macro>[^\`]+)\`"
| mvexpand nested_macro
| join type=left nested_macro 
    [| rest "/servicesNS/-/-/admin/macros" count=0
| table title definition
| stats values(definition) as definition by title
| rename definition as resolved_nested_macro, title as nested_macro]
| where like(resolved_nested_macro,"%index=%") OR isnull(resolved_nested_macro)

``` merge resolved stuff into one field ```
| foreach used* nested* 
    [eval datasrc=mvdedup(if(<<FIELD>>!="",mvappend(datasrc, "<<FIELD>>"),datasrc))]
| eval datasrc=mvfilter(!match(datasrc, "usedData"))
| eval usedData = mvappend(used_index, if(!isnull(resolved_nested_eventtype),resolved_nested_eventtype, resolved_eventtype), if(!isnull(resolved_nested_macro),resolved_nested_macro, resolved_macro))
| eval usedData = mvdedup(usedData)
| table app user action info search_id usedData datasrc
| mvexpand usedData
| eval usedData=replace(usedData, "\)","")
| where !like(usedData, "`%`") AND !isnull(usedData)
| rex field=usedData "index\=\s*\"?(?<usedData>[^\s\"]+)\"?"
| eval usedData=replace(usedData, "\"","")
| eval usedData=replace(usedData,"'","")
| stats count by usedData
    ]

 

 The search first gets the indexes via | rest with its eventcount and retentiontime.
Then audittrail data gets appended and used Indexes, Macros and Eventtypes gets extracted from the searchstring and resolved (since some apps uses nested eventtypes/macros in my environment they get resolved twice). Still needs some sanitizing of the extracted used-indexes.

that gives me a table like this (limited the table to splunkinternal indexes as example)

titletotalEventCountfrozenTimePeriodInSecscountusedData
_audit771404957188697600  
_configtracker7172592000  
_dsappevent2405184000  
_dsclient2325184000  
_dsphonehome843820604800  
_internal703916945315552000  
_introspection391007281209600  
_telemetry5599063072000  
_thefishbucket02419200  
   22309_*
   1039_audit
   2_configtracker
   1340_dsappevent
   1017_dsclient
   1_dsclient]
   709_dsphonehome
   2089_internal
   117_introspection
   2_metrics
   2_metrics_rollup
   2_telemetry
   2_thefishbucket


But i didnt managed to merge the rows together so that i have count=1039 for _audit plus the 22309 from searches that uses all internal indexes  in one row for each index.

Labels (1)
Tags (3)
0 Karma
1 Solution

bowesmana
SplunkTrust
SplunkTrust

You can generally get there with SPL, e.g. here's a bit of a hack, which has a stab at it based on your data example

| makeresults format=csv data="title,totalEventCount,frozenTimePeriodInSecs,count,usedData
_audit,771404957,188697600, , 
_configtracker,717,2592000, , 
_dsappevent,240,5184000, , 
_dsclient,232,5184000, , 
_dsphonehome,843820,604800, , 
_internal,7039169453,15552000, , 
_introspection,39100728,1209600, , 
_telemetry,55990,63072000, , 
_thefishbucket,0,2419200, , 
 , , ,22309,_*
 , , ,1039,_audit
 , , ,2,_configtracker
 , , ,1340,_dsappevent
 , , ,1017,_dsclient
 , , ,1,_dsclient]
 , , ,709,_dsphonehome
 , , ,2089,_internal
 , , ,117,_introspection
 , , ,2,_metrics
 , , ,2,_metrics_rollup
 , , ,2,_telemetry
 , , ,2,_thefishbucket"
| eval title=coalesce(title, usedData)
| fields - usedData
| stats values(*) as * by title
| eventstats values(eval(if(match(title, "\*"), title."##".title."##".count, null()))) as wildcard_indexes
| eval wildcard_indexes=mvmap(wildcard_indexes, replace(wildcard_indexes, "\*(.*##)?", ".*\1"))
| eval count=count+sum(mvmap(wildcard_indexes, if(match(title, mvindex(split(wildcard_indexes, "##"), 0)) AND title!=mvindex(split(wildcard_indexes, "##"), 1), mvindex(split(wildcard_indexes, "##"), 2), 0)))
| fields - wildcard_indexes

View solution in original post

PickleRick
SplunkTrust
SplunkTrust

You can't replace docs and management with tools.

[ | makeresults annotate=f
| eval t1="ind", t2="ex", t3=t1.t2
| eval {t3}="_internal"
| table *
| fields - t1 t2 t3 _time ]
| stats count by index

 

0 Karma

TheEggi98
Path Finder

What do you mean with that?
i didnt meant to ask my question in a way that i would want to replace docs and management with tools.

0 Karma

PickleRick
SplunkTrust
SplunkTrust

That was just a friendly reminder that while "tools" like yours can find some typical cases there might be a lot of them which you might miss with them. As long as you are aware of it and you're using it only as means of a quick help, that's fine and dandy. But there are often questions around here "how to find all XXX defined/used by ...". For which the usual answer is - there is no 100% sure way to do so.

0 Karma

bowesmana
SplunkTrust
SplunkTrust

You can do this at the end

| eval title=coalesce(title, usedData)
| fields - usedData
| stats values(*) as * by title

Note that you seem to pull in a bunch of macros that do not contain any index searches

0 Karma

TheEggi98
Path Finder

Sorry i probably didnt expressed myself well.

i want that wildcards gets taken into account.

So based from the table i posted as example i would want results like this

titletotalEventCountfrozenTimePeriodInSecsNumOfSearches
_audit77140495718869760023348  (_audit + _*)
_configtracker717259200022311 (_configtracker + _*)
_internal70391694531555200024098 (_internal + _*)
0 Karma

bowesmana
SplunkTrust
SplunkTrust

You can generally get there with SPL, e.g. here's a bit of a hack, which has a stab at it based on your data example

| makeresults format=csv data="title,totalEventCount,frozenTimePeriodInSecs,count,usedData
_audit,771404957,188697600, , 
_configtracker,717,2592000, , 
_dsappevent,240,5184000, , 
_dsclient,232,5184000, , 
_dsphonehome,843820,604800, , 
_internal,7039169453,15552000, , 
_introspection,39100728,1209600, , 
_telemetry,55990,63072000, , 
_thefishbucket,0,2419200, , 
 , , ,22309,_*
 , , ,1039,_audit
 , , ,2,_configtracker
 , , ,1340,_dsappevent
 , , ,1017,_dsclient
 , , ,1,_dsclient]
 , , ,709,_dsphonehome
 , , ,2089,_internal
 , , ,117,_introspection
 , , ,2,_metrics
 , , ,2,_metrics_rollup
 , , ,2,_telemetry
 , , ,2,_thefishbucket"
| eval title=coalesce(title, usedData)
| fields - usedData
| stats values(*) as * by title
| eventstats values(eval(if(match(title, "\*"), title."##".title."##".count, null()))) as wildcard_indexes
| eval wildcard_indexes=mvmap(wildcard_indexes, replace(wildcard_indexes, "\*(.*##)?", ".*\1"))
| eval count=count+sum(mvmap(wildcard_indexes, if(match(title, mvindex(split(wildcard_indexes, "##"), 0)) AND title!=mvindex(split(wildcard_indexes, "##"), 1), mvindex(split(wildcard_indexes, "##"), 2), 0)))
| fields - wildcard_indexes

TheEggi98
Path Finder

This is the SPL Magic i was missing 🙂
Now i can have a basic understanding which indexes might be searched less frequently than others

0 Karma
Get Updates on the Splunk Community!

Video | Welcome Back to Smartness, Pedro

Remember Splunk Community member, Pedro Borges? If you tuned into Episode 2 of our Smartness interview series, ...

Detector Best Practices: Static Thresholds

Introduction In observability monitoring, static thresholds are used to monitor fixed, known values within ...

Expert Tips from Splunk Education, Observability in Action, Plus More New Articles on ...

Splunk Lantern is a Splunk customer success center that provides advice from Splunk experts on valuable data ...