Hi everyone,
silly question, but I'm not very experienced with Splunk queries. How can I speed up a search that currently takes around half a minute for just a few hundred hits? This is the query:
index=* cs_stage=IT cs_component_id=*mynab.nab.wesit.rowini.net* message="*sCB\=200*" AND message="*sCF\=200*" AND reqF="*/rewards/c/d/*" | convert timeformat="%Y-%m-%d" ctime(_time) AS date | stats count by date
That nobody else mentioned this makes me wonder if I'm the one missing something, but do you really need to search all indexes that you have access to (index=*)?
Another basic one that is easy to overlook: what time range are you searching? A search against All time or the last 30 days (especially when searching against all indexes) will take much longer than one over the last 24 hours, if that is all you really need.
You're right, but in this case the index is not even relevant, there's only one. As for the time range, I usually select it from the UI time picker; isn't that the same?
Sounds like you're good. UI time picker should be the same as manually specifying a time in the search string.
Try to avoid searches like foo=*bar*, as those basically need to check every event in your index(es).
The best optimization is a tight time frame, followed by specific indexes, hosts, sources and sourcetypes. Also avoid using verbose mode.
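As a rough sketch of that advice, the search can name one index and sourcetype and carry the time range inline with earliest/latest. The names your_index and your_sourcetype are placeholders, not values from this thread:

```
index=your_index sourcetype=your_sourcetype cs_stage=IT earliest=-24h@h latest=now
| stats count
```

An inline earliest/latest takes precedence over the time picker, so it is only needed when the range should be fixed in the query itself.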
There are a lot of .conf presentations on how to optimize searches.
r. Ismo
Thanks for your answer. In my case I need to check for all successful logins, so the point is more about filtering the message than the time frame. I'll check your URLs anyway.
As @rnowitzki proposed, could you share some events with us, so we can figure out the best way to speed up your query?
Hi @paxo ,
The search takes so much time, because you use a lot of wildcards.
Try to avoid them, especially avoid them at the beginning of a string.
https://docs.splunk.com/Documentation/SCS/current/Search/Wildcards
"The more specific your search terms are, the more efficient your search is."
Thanks for the hint. The problem is that in this case I have a message field with many values inside, and the ones I'm interested in are in the middle, like:
message="[...]sCF=200[...]sCB=200[...]"
How can I achieve that result without wildcards?
You could extract fields from the sCF=200, sCB=200 parts of the message.
And/Or try to work with regular expressions - compare search speed with the wildcard option.
https://docs.splunk.com/Documentation/SplunkCloud/latest/SearchReference/Rex
Can you maybe share one or a few events as examples? Remove or change sensitive data if there is any.
Thank you all. This is a line with some sensitive data masked:
2020 07 14 10:40:46.866 isi3web NProxyOp x.xxx.x-x-x-x-x.x.x.x.x.x:443 6-INFO : reqF="GET xxx" reqDecF=<NULL> ipF=xxx sCF=200 bSF=568 dTF=68 reqB="GET xxx" adrB=xxx ipB=x.x.x.x sCB=200 dTB=60 dTcB=0 dTsB=1 dTr1B=58 dTr2B=1 dTFrs=0 invS=/* (CustomErrorPages-100) /* (SecurityBaseline-200) (RequestHeaderValidationLength-200) [...]
Hi @paxo,
Isn't Splunk recognizing all the key/value pairs as fields already? I would assume so.
Can you search in verbose mode, open one of the events and make a screenshot? (You could erase the sensitive data with Paint or something.) And/or show the fields list on the left side.
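If the key/value pairs are extracted automatically, one option is to filter on those fields directly instead of wildcard-matching the message field. A minimal sketch, assuming sCF and sCB show up as search-time fields, with your_index as a placeholder for the real index name:

```
index=your_index cs_stage=IT cs_component_id=*mynab.nab.wesit.rowini.net* sCF=200 sCB=200 reqF="*/rewards/c/d/*"
| convert timeformat="%Y-%m-%d" ctime(_time) AS date
| stats count by date
```

Whether this is actually faster on the real data is worth comparing in the Job Inspector against the wildcard version.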
Oh yes, I've just noticed it does recognize the fields. Just one thing: I can't find a way to properly get rid of the wildcards on the reqF field. I have a lot of different paths like /x/x/x/x/y, where x is constant and y is a random string of variable length. How can I avoid the wildcard in reqF=/x/x/x/*? I also have another field with two possible values, like x.x.x.x-A and x.x.x.x-B. Is there any way to use a regex like x.x.x.x-[A,B]?
Hi @paxo,
Difficult to help without the real data. Would need to know what kind of values "x" and "y" can have.
But here is an example that should help you set it up. You can put it as-is in your search box.
| noop
| makeresults
| eval reqF="/x/x/x/x/y"
| rex field=reqF "\/x\/x\/x\/x\/(?<reqF_extract>\w+)"
It extracts "y" from the reqF field as a new field called reqF_extract.
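For the second question (a field with two possible values), a sketch under the assumption that the field is called your_field and the index your_index, both placeholders. The IN operator of the search command lists the allowed values explicitly:

```
index=your_index your_field IN ("x.x.x.x-A", "x.x.x.x-B")
```

A regex variant with the regex command would look like the following. Note that the character class is [AB], since [A,B] would also match a literal comma:

```
index=your_index
| regex your_field="^x\.x\.x\.x-[AB]$"
```

The IN form filters in the base search, while regex filters events after they have been retrieved, so the first is likely the cheaper of the two.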
Thanks, I will go with that solution for that field. Just one last question: I also have events like this:
15-Jul-2020 08:43:26 [INFO ] [CC:9844566928342GkldsbtTOO1my/wpnDQP8g7J1266Q=] [RC:fasas-9a4a-39a92834-3212523112sa-00000124] [10.240.134.165] [AUDIT] [SHOW_PAGE] jspName=LP1 (Login.jsp) Agent=[Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.4103.97 Safari/538.46]
I need to group by CC like this: "stats count by CC" which obviously doesn't work.
So you want to count the events by specific Value after "CC:"?
Yes, that field is unique per user and I need to group by user. If I search "CC:something" Splunk recognizes the field, but if I try "| stats count by CC" it doesn't work.
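One possible approach, sketched here on a shortened copy of the sample event above: since the value sits inside "[CC:...]" rather than in a key=value pair, it can be pulled into a field with rex before grouping. The field name CC and the pattern are assumptions based on that single sample line:

```
| makeresults
| eval _raw="15-Jul-2020 08:43:26 [INFO ] [CC:9844566928342GkldsbtTOO1my/wpnDQP8g7J1266Q=] [10.240.134.165] [AUDIT] [SHOW_PAGE] jspName=LP1 (Login.jsp)"
| rex "\[CC:(?<CC>[^\]]+)\]"
| stats count by CC
```

On real data the makeresults and eval lines would be replaced by the base search, keeping only the rex and stats parts.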