Hi there
im currently at a search to get the usage of Indexes, so i have an overview which indexes gets used in searches and which indexes doesnt so i can speak with the usecase owner if the data is still needed and why it doesnt get used.
Thats the current state of the search:
| rest "/services/data/indexes"
| table title totalEventCount frozenTimePeriodInSecs
| dedup title
| append
[search index=_audit sourcetype="audittrail" search_id="*" action=search earliest=-24h latest=now
``` Regex Extraction ```
| rex field=search max_match=0 "index\=\s*\"?(?<used_index>\S+)\"?"
| rex field=search max_match=0 "\`(?<used_macro>\S+)\`"
| rex field=search max_match=0 "eventtype\=\s*(?<used_evttype>\S+)"
``` Eventtype resolving ```
| mvexpand used_evttype
| join type=left used_evttype
[| rest "/services/saved/eventtypes"
| table title search
| stats values(search) as search by title
| rename search as resolved_eventtype, title as used_evttype]
| rex field=resolved_eventtype max_match=0 "eventtype\=\s*(?<nested_eventtype>\S+)"
| mvexpand nested_eventtype
| join type=left nested_eventtype
[| rest "/services/saved/eventtypes"
| table title search
| stats values(search) as search by title
| rename search as resolved_nested_eventtype, title as nested_eventtype]
``` Macro resolving ```
| mvexpand used_macro
| join type=left used_macro
[| rest "/servicesNS/-/-/admin/macros" count=0
| table title definition
| stats values(definition) as definition by title
| rename definition as resolved_macro, title as used_macro]
| rex field=resolved_macro max_match=0 "\`(?<nested_macro>[^\`]+)\`"
| mvexpand nested_macro
| join type=left nested_macro
[| rest "/servicesNS/-/-/admin/macros" count=0
| table title definition
| stats values(definition) as definition by title
| rename definition as resolved_nested_macro, title as nested_macro]
| where like(resolved_nested_macro,"%index=%") OR isnull(resolved_nested_macro)
``` merge resolved stuff into one field ```
| foreach used* nested*
[eval datasrc=mvdedup(if(<<FIELD>>!="",mvappend(datasrc, "<<FIELD>>"),datasrc))]
| eval datasrc=mvfilter(!match(datasrc, "usedData"))
| eval usedData = mvappend(used_index, if(!isnull(resolved_nested_eventtype),resolved_nested_eventtype, resolved_eventtype), if(!isnull(resolved_nested_macro),resolved_nested_macro, resolved_macro))
| eval usedData = mvdedup(usedData)
| table app user action info search_id usedData datasrc
| mvexpand usedData
| eval usedData=replace(usedData, "\)","")
| where !like(usedData, "`%`") AND !isnull(usedData)
| rex field=usedData "index\=\s*\"?(?<usedData>[^\s\"]+)\"?"
| eval usedData=replace(usedData, "\"","")
| eval usedData=replace(usedData,"'","")
| stats count by usedData
]
The search first gets the indexes via | rest with its eventcount and retentiontime.
Then audittrail data gets appended and used Indexes, Macros and Eventtypes gets extracted from the searchstring and resolved (since some apps uses nested eventtypes/macros in my environment they get resolved twice). Still needs some sanitizing of the extracted used-indexes.
that gives me a table like this (limited the table to splunkinternal indexes as example)
title | totalEventCount | frozenTimePeriodInSecs | count | usedData |
_audit | 771404957 | 188697600 | ||
_configtracker | 717 | 2592000 | ||
_dsappevent | 240 | 5184000 | ||
_dsclient | 232 | 5184000 | ||
_dsphonehome | 843820 | 604800 | ||
_internal | 7039169453 | 15552000 | ||
_introspection | 39100728 | 1209600 | ||
_telemetry | 55990 | 63072000 | ||
_thefishbucket | 0 | 2419200 | ||
22309 | _* | |||
1039 | _audit | |||
2 | _configtracker | |||
1340 | _dsappevent | |||
1017 | _dsclient | |||
1 | _dsclient] | |||
709 | _dsphonehome | |||
2089 | _internal | |||
117 | _introspection | |||
2 | _metrics | |||
2 | _metrics_rollup | |||
2 | _telemetry | |||
2 | _thefishbucket |
But i didnt managed to merge the rows together so that i have count=1039 for _audit plus the 22309 from searches that uses all internal indexes in one row for each index.
You can generally get there with SPL, e.g. here's a bit of a hack, which has a stab at it based on your data example
| makeresults format=csv data="title,totalEventCount,frozenTimePeriodInSecs,count,usedData
_audit,771404957,188697600, ,
_configtracker,717,2592000, ,
_dsappevent,240,5184000, ,
_dsclient,232,5184000, ,
_dsphonehome,843820,604800, ,
_internal,7039169453,15552000, ,
_introspection,39100728,1209600, ,
_telemetry,55990,63072000, ,
_thefishbucket,0,2419200, ,
, , ,22309,_*
, , ,1039,_audit
, , ,2,_configtracker
, , ,1340,_dsappevent
, , ,1017,_dsclient
, , ,1,_dsclient]
, , ,709,_dsphonehome
, , ,2089,_internal
, , ,117,_introspection
, , ,2,_metrics
, , ,2,_metrics_rollup
, , ,2,_telemetry
, , ,2,_thefishbucket"
| eval title=coalesce(title, usedData)
| fields - usedData
| stats values(*) as * by title
| eventstats values(eval(if(match(title, "\*"), title."##".title."##".count, null()))) as wildcard_indexes
| eval wildcard_indexes=mvmap(wildcard_indexes, replace(wildcard_indexes, "\*(.*##)?", ".*\1"))
| eval count=count+sum(mvmap(wildcard_indexes, if(match(title, mvindex(split(wildcard_indexes, "##"), 0)) AND title!=mvindex(split(wildcard_indexes, "##"), 1), mvindex(split(wildcard_indexes, "##"), 2), 0)))
| fields - wildcard_indexes
You can't replace docs and management with tools.
[ | makeresults annotate=f
| eval t1="ind", t2="ex", t3=t1.t2
| eval {t3}="_internal"
| table *
| fields - t1 t2 t3 _time ]
| stats count by index
What do you mean with that?
i didnt meant to ask my question in a way that i would want to replace docs and management with tools.
That was just a friendly reminder that while "tools" like yours can find some typical cases there might be a lot of them which you might miss with them. As long as you are aware of it and you're using it only as means of a quick help, that's fine and dandy. But there are often questions around here "how to find all XXX defined/used by ...". For which the usual answer is - there is no 100% sure way to do so.
You can do this at the end
| eval title=coalesce(title, usedData)
| fields - usedData
| stats values(*) as * by title
Note that you seem to pull in a bunch of macros that do not contain any index searches
Sorry i probably didnt expressed myself well.
i want that wildcards gets taken into account.
So based from the table i posted as example i would want results like this
title | totalEventCount | frozenTimePeriodInSecs | NumOfSearches |
_audit | 771404957 | 188697600 | 23348 (_audit + _*) |
_configtracker | 717 | 2592000 | 22311 (_configtracker + _*) |
_internal | 7039169453 | 15552000 | 24098 (_internal + _*) |
You can generally get there with SPL, e.g. here's a bit of a hack, which has a stab at it based on your data example
| makeresults format=csv data="title,totalEventCount,frozenTimePeriodInSecs,count,usedData
_audit,771404957,188697600, ,
_configtracker,717,2592000, ,
_dsappevent,240,5184000, ,
_dsclient,232,5184000, ,
_dsphonehome,843820,604800, ,
_internal,7039169453,15552000, ,
_introspection,39100728,1209600, ,
_telemetry,55990,63072000, ,
_thefishbucket,0,2419200, ,
, , ,22309,_*
, , ,1039,_audit
, , ,2,_configtracker
, , ,1340,_dsappevent
, , ,1017,_dsclient
, , ,1,_dsclient]
, , ,709,_dsphonehome
, , ,2089,_internal
, , ,117,_introspection
, , ,2,_metrics
, , ,2,_metrics_rollup
, , ,2,_telemetry
, , ,2,_thefishbucket"
| eval title=coalesce(title, usedData)
| fields - usedData
| stats values(*) as * by title
| eventstats values(eval(if(match(title, "\*"), title."##".title."##".count, null()))) as wildcard_indexes
| eval wildcard_indexes=mvmap(wildcard_indexes, replace(wildcard_indexes, "\*(.*##)?", ".*\1"))
| eval count=count+sum(mvmap(wildcard_indexes, if(match(title, mvindex(split(wildcard_indexes, "##"), 0)) AND title!=mvindex(split(wildcard_indexes, "##"), 1), mvindex(split(wildcard_indexes, "##"), 2), 0)))
| fields - wildcard_indexes
This is the SPL Magic i was missing 🙂
Now i can have a basic understanding which indexes might be searched less frequently than others