I'm going through the limits.conf
spec to see what the default settings are, and I noticed that the default maximum number of values for a field is 10,000 before the rest are truncated. We currently have a few fields with more than 10,000 values, such as JSESSIONID and the GUID, which is a unique identifier tied to the web service request and response.
So my question is: is it bad practice to extract fields with this many values? Will these fields slow our verbose searches down compared to not having them?
Hi,
Which command are you referring to? In limits.conf.spec you'll see the stanza name above the maxvalues key you are looking at. If you're talking about stats, that limit applies to the function stats values(foo) - the actual row limit is much higher. So running
| stats count(foo) by GUID
won't truncate after 10,000 rows. If your search results are being truncated, you will see a warning in the job inspector in the search UI.
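To illustrate which limit applies where, here's a sketch (the field name is just an example from your post):

```
| stats count by GUID          <- one row per GUID; governed by the row limit (maxresultrows), which is much higher
| stats values(GUID) as guids  <- builds a multivalue list of distinct GUIDs; this is where a per-field maxvalues cap kicks in
```

In other words, grouping by a high-cardinality field and collecting all of its distinct values into one result cell hit different limits.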
What's your use case for the verbose searches? I usually avoid them - they consume both memory and disk space in the dispatch directory, and they are slow.
After looking in the limits.conf spec again,
I must have misread that number; it's actually 100,000 values per field before it truncates. This is much better, but it could still cause an issue down the road. Can you verify that I'm understanding this correctly?
[anomalousvalue]
maxresultrows = 50000
# maximum number of distinct values for a field
maxvalues = 100000
My manager is sold on the idea that our GUID and JSESSIONID fields, which have tons of values, are slowing the searching down when it's run in verbose mode. I guess a good way of testing this would be to search a similar result set with no high-cardinality fields, then run another search that does involve a field with many values, and compare the search times.
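That A/B test might look something like this (a sketch; the index name is a placeholder I've made up, and the time window should be identical in both runs):

```
index=web earliest=-1h | stats count                        <- baseline: no high-cardinality fields referenced
index=web earliest=-1h | stats count by GUID, JSESSIONID    <- same window, high-cardinality fields in play
```

Running each of these in fast mode and again in verbose mode, then comparing the elapsed times in the job inspector, would show whether the high-cardinality fields (or the search mode itself) dominate the runtime.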
Yes you are understanding that correctly.
Your manager is correct that verbose mode slows down the search. Why are you using it? There may be a better way to achieve what you are trying to do. It might be helpful if you post your full search as well.
So if I specify a timeframe in which each field has around 30,000 values, that would NOT slow down the search, right?
I know he uses verbose mode to enable field discovery, and I've pointed out many times that smart mode does the same thing and is faster, but I think flipping between fast and verbose mode is a habit for him. I do use verbose mode from time to time when I'm using a ...|stats
command and need to view the events, but I agree that it's not ideal when smart mode is available.
Can you post the search you are proposing to run? I'm still not 100% clear which commands you want to run, and hence which limit you might run up against.
Also, it's always worth checking the job inspector: it will show you where the time is being spent in the queries you are running.
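If you want to compare the runtimes of several jobs side by side rather than opening the job inspector for each one, a sketch like this against the jobs REST endpoint can help (assuming your role has permission to query it):

```
| rest /services/search/jobs
| table sid label runDuration eventCount
```

The runDuration column gives you a quick way to compare the same search run in fast, smart, and verbose modes.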