Is there a way I can see how much data is being searched per index?
E.g., for an index, a user searched 10 GB of data over the last hour across 15 search queries.
Or an index holds 100 GB of data, but over the last day users actually searched only 100 MB of it.
Or an index holds 100 GB of data, but users searched so often that a total of 120 GB was searched.
I can tell you generally how to do this, but it seems crazy to me and I don't see a useful reason to do it. You start by using a REST query to find out which searches have run recently. Then you query the job details REST API to get access to the search.log for each job; in there it records the number of events scanned. From the same log you can pull the optimized search and hopefully identify a sourcetype. You can then run a search using avg(len(_raw)) against that sourcetype and multiply the result by the events-scanned number. That is probably the best you can do; it makes many assumptions and is not going to be very accurate.
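To make the last two steps concrete, here is a rough sketch; treat it as an illustration only. The /services/search/jobs REST endpoint is standard, but the exact field names (scanCount, runDuration) can vary by version, and my_index, my_sourcetype and the 150000 events-scanned figure are placeholders you would replace with values taken from search.log:

| rest /services/search/jobs
| fields sid, scanCount, runDuration

index=my_index sourcetype=my_sourcetype earliest=-1h
| eval raw_bytes=len(_raw)
| stats avg(raw_bytes) AS avg_event_bytes
| eval approx_bytes_searched = avg_event_bytes * 150000

Multiplying an average event length by the scanned-event count assumes event sizes are fairly uniform, which is why this can only ever be a rough estimate.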
Please run the following search. At this level you can only get rough, inaccurate information. I don't know your exact purpose, but I recommend also reviewing alternative approaches.
index=_audit action=search search=* sourcetype=audittrail
| rex field=search "index\s*=\s*\"*(?<IndexUsed>[^\s\"]+)"
| search IndexUsed=*
| fillnull value="NA" IndexUsed
| stats count by IndexUsed
The above query doesn't give the size of data searched per index; it only counts how many times each index was used.
The purpose is to compare data stored vs. data actually searched, to check for abuse of capacity utilization.
Any other recommendations are welcome!
Thanks
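One possible direction for that stored-vs-searched comparison, offered strictly as a sketch: the /services/data/indexes endpoint and its currentDBSizeMB field are standard, but the scan_count field on completed audit events and the assumed 1000-byte average event size would need to be validated in your environment.

| rest /services/data/indexes
| stats sum(currentDBSizeMB) AS stored_mb BY title

index=_audit sourcetype=audittrail action=search info=completed search=*
| rex field=search "index\s*=\s*\"*(?<IndexUsed>[^\s\"]+)"
| search IndexUsed=*
| stats sum(scan_count) AS events_scanned BY IndexUsed
| eval approx_searched_mb = round(events_scanned * 1000 / 1024 / 1024, 2)

Comparing stored_mb with approx_searched_mb (matching title to IndexUsed) at least flags indexes that hold far more data than anyone ever searches, even though the byte figures are only approximations.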