Solved: Search usage of Indexes

TheEggi98 · ‎12-06-2024

Hi there

im currently at a search to get the usage of Indexes, so i have an overview which indexes gets used in searches and which indexes doesnt so i can speak with the usecase owner if the data is still needed and why it doesnt get used.

Thats the current state of the search:

| rest "/services/data/indexes"
| table title totalEventCount frozenTimePeriodInSecs
| dedup title
| append 
    [search index=_audit sourcetype="audittrail" search_id="*" action=search earliest=-24h latest=now

``` Regex Extraction ```
| rex field=search max_match=0 "index\=\s*\"?(?<used_index>\S+)\"?"
| rex field=search max_match=0 "\`(?<used_macro>\S+)\`"
| rex field=search max_match=0 "eventtype\=\s*(?<used_evttype>\S+)"

``` Eventtype resolving ```
| mvexpand used_evttype
| join type=left used_evttype 
    [| rest "/services/saved/eventtypes"
| table title search
    | stats values(search) as search by title
| rename search as resolved_eventtype, title as used_evttype]
| rex field=resolved_eventtype max_match=0 "eventtype\=\s*(?<nested_eventtype>\S+)"
| mvexpand nested_eventtype
| join type=left nested_eventtype 
    [| rest "/services/saved/eventtypes"
| table title search
| stats values(search) as search by title
| rename search as resolved_nested_eventtype, title as nested_eventtype]

``` Macro resolving ```
| mvexpand used_macro
| join type=left used_macro 
    [| rest "/servicesNS/-/-/admin/macros" count=0
| table title definition
| stats values(definition) as definition by title
| rename definition as resolved_macro, title as used_macro]
| rex field=resolved_macro max_match=0 "\`(?<nested_macro>[^\`]+)\`"
| mvexpand nested_macro
| join type=left nested_macro 
    [| rest "/servicesNS/-/-/admin/macros" count=0
| table title definition
| stats values(definition) as definition by title
| rename definition as resolved_nested_macro, title as nested_macro]
| where like(resolved_nested_macro,"%index=%") OR isnull(resolved_nested_macro)

``` merge resolved stuff into one field ```
| foreach used* nested* 
    [eval datasrc=mvdedup(if(<<FIELD>>!="",mvappend(datasrc, "<<FIELD>>"),datasrc))]
| eval datasrc=mvfilter(!match(datasrc, "usedData"))
| eval usedData = mvappend(used_index, if(!isnull(resolved_nested_eventtype),resolved_nested_eventtype, resolved_eventtype), if(!isnull(resolved_nested_macro),resolved_nested_macro, resolved_macro))
| eval usedData = mvdedup(usedData)
| table app user action info search_id usedData datasrc
| mvexpand usedData
| eval usedData=replace(usedData, "\)","")
| where !like(usedData, "`%`") AND !isnull(usedData)
| rex field=usedData "index\=\s*\"?(?<usedData>[^\s\"]+)\"?"
| eval usedData=replace(usedData, "\"","")
| eval usedData=replace(usedData,"'","")
| stats count by usedData
    ]

The search first gets the indexes via | rest with its eventcount and retentiontime.
Then audittrail data gets appended and used Indexes, Macros and Eventtypes gets extracted from the searchstring and resolved (since some apps uses nested eventtypes/macros in my environment they get resolved twice). Still needs some sanitizing of the extracted used-indexes.

that gives me a table like this (limited the table to splunkinternal indexes as example)

title	totalEventCount	frozenTimePeriodInSecs	count	usedData
_audit	771404957	188697600
_configtracker	717	2592000
_dsappevent	240	5184000
_dsclient	232	5184000
_dsphonehome	843820	604800
_internal	7039169453	15552000
_introspection	39100728	1209600
_telemetry	55990	63072000
_thefishbucket	0	2419200
			22309	_*
			1039	_audit
			2	_configtracker
			1340	_dsappevent
			1017	_dsclient
			1	_dsclient]
			709	_dsphonehome
			2089	_internal
			117	_introspection
			2	_metrics
			2	_metrics_rollup
			2	_telemetry
			2	_thefishbucket

But i didnt managed to merge the rows together so that i have count=1039 for _audit plus the 22309 from searches that uses all internal indexes in one row for each index.

bowesmana · ‎12-09-2024

You can generally get there with SPL, e.g. here's a bit of a hack, which has a stab at it based on your data example

| makeresults format=csv data="title,totalEventCount,frozenTimePeriodInSecs,count,usedData
_audit,771404957,188697600, , 
_configtracker,717,2592000, , 
_dsappevent,240,5184000, , 
_dsclient,232,5184000, , 
_dsphonehome,843820,604800, , 
_internal,7039169453,15552000, , 
_introspection,39100728,1209600, , 
_telemetry,55990,63072000, , 
_thefishbucket,0,2419200, , 
 , , ,22309,_*
 , , ,1039,_audit
 , , ,2,_configtracker
 , , ,1340,_dsappevent
 , , ,1017,_dsclient
 , , ,1,_dsclient]
 , , ,709,_dsphonehome
 , , ,2089,_internal
 , , ,117,_introspection
 , , ,2,_metrics
 , , ,2,_metrics_rollup
 , , ,2,_telemetry
 , , ,2,_thefishbucket"
| eval title=coalesce(title, usedData)
| fields - usedData
| stats values(*) as * by title
| eventstats values(eval(if(match(title, "\*"), title."##".title."##".count, null()))) as wildcard_indexes
| eval wildcard_indexes=mvmap(wildcard_indexes, replace(wildcard_indexes, "\*(.*##)?", ".*\1"))
| eval count=count+sum(mvmap(wildcard_indexes, if(match(title, mvindex(split(wildcard_indexes, "##"), 0)) AND title!=mvindex(split(wildcard_indexes, "##"), 1), mvindex(split(wildcard_indexes, "##"), 2), 0)))
| fields - wildcard_indexes

View solution in original post

PickleRick · ‎12-09-2024

You can't replace docs and management with tools.

[ | makeresults annotate=f
| eval t1="ind", t2="ex", t3=t1.t2 
| eval {t3}="_internal" 
| table *
| fields - t1 t2 t3 _time ] 
| stats count by index

TheEggi98 · ‎12-09-2024

What do you mean with that?
i didnt meant to ask my question in a way that i would want to replace docs and management with tools.

PickleRick · ‎12-09-2024

That was just a friendly reminder that while "tools" like yours can find some typical cases there might be a lot of them which you might miss with them. As long as you are aware of it and you're using it only as means of a quick help, that's fine and dandy. But there are often questions around here "how to find all XXX defined/used by ...". For which the usual answer is - there is no 100% sure way to do so.

bowesmana · ‎12-08-2024

You can do this at the end

| eval title=coalesce(title, usedData)
| fields - usedData
| stats values(*) as * by title

Note that you seem to pull in a bunch of macros that do not contain any index searches

TheEggi98 · ‎12-08-2024

Sorry i probably didnt expressed myself well.

i want that wildcards gets taken into account.

So based from the table i posted as example i would want results like this

title	totalEventCount	frozenTimePeriodInSecs	NumOfSearches
_audit	771404957	188697600	23348 (_audit + _)*
_configtracker	717	2592000	22311 (_configtracker + _)*
_internal	7039169453	15552000	24098 (_internal + _)*

bowesmana · ‎12-09-2024

You can generally get there with SPL, e.g. here's a bit of a hack, which has a stab at it based on your data example

| makeresults format=csv data="title,totalEventCount,frozenTimePeriodInSecs,count,usedData
_audit,771404957,188697600, , 
_configtracker,717,2592000, , 
_dsappevent,240,5184000, , 
_dsclient,232,5184000, , 
_dsphonehome,843820,604800, , 
_internal,7039169453,15552000, , 
_introspection,39100728,1209600, , 
_telemetry,55990,63072000, , 
_thefishbucket,0,2419200, , 
 , , ,22309,_*
 , , ,1039,_audit
 , , ,2,_configtracker
 , , ,1340,_dsappevent
 , , ,1017,_dsclient
 , , ,1,_dsclient]
 , , ,709,_dsphonehome
 , , ,2089,_internal
 , , ,117,_introspection
 , , ,2,_metrics
 , , ,2,_metrics_rollup
 , , ,2,_telemetry
 , , ,2,_thefishbucket"
| eval title=coalesce(title, usedData)
| fields - usedData
| stats values(*) as * by title
| eventstats values(eval(if(match(title, "\*"), title."##".title."##".count, null()))) as wildcard_indexes
| eval wildcard_indexes=mvmap(wildcard_indexes, replace(wildcard_indexes, "\*(.*##)?", ".*\1"))
| eval count=count+sum(mvmap(wildcard_indexes, if(match(title, mvindex(split(wildcard_indexes, "##"), 0)) AND title!=mvindex(split(wildcard_indexes, "##"), 1), mvindex(split(wildcard_indexes, "##"), 2), 0)))
| fields - wildcard_indexes

TheEggi98 · ‎12-10-2024

This is the SPL Magic i was missing 🙂
Now i can have a basic understanding which indexes might be searched less frequently than others

Search usage of Indexes

Splunk App for Anomaly Detection End of Life Announcement

Transforming Financial Data into Fraud Intelligence

How to send events & findings from AWS to Splunk using Amazon EventBridge

Are you a member of the Splunk Community?

Search usage of Indexes

Splunk App for Anomaly Detection End of Life Announcement

Transforming Financial Data into Fraud Intelligence

How to send events & findings from AWS to Splunk using Amazon EventBridge