When I run the following search over a 24-hour period, it is always auto-finalized because it hits the disk usage limit of 100MB.
index="app_ABC123" source="/var/abc/appgroup123/logs/app123/stat.log" | stats count as TotalEvents by TxId | sort TotalEvents desc | where TotalEvents > 100
Is there any way for me to optimize the search so that it doesn't hit the limit?
Well, as the message says, you hit your disk quota limit because at some point in the pipeline there was simply too much data to process. Very likely the cause is a non-streaming command.
In your case there are two quite obvious things that can be done.
Firstly, since you only count by TxId, you can limit the fields carried past the initial search to just that one field. There is no point dragging the rest of the event further down the pipeline.
And secondly: filter first, sort second. This way you won't have to sort as much data.
index="app_ABC123" source="/var/abc/appgroup123/logs/app123/stat.log"
| fields TxId
| stats count as TotalEvents by TxId
| where TotalEvents>100
| sort TotalEvents desc
Thanks for the suggestions. Unfortunately, I still hit the limit with this approach.
I created an extract field, but they are also in there like that...
We are running version 7.3.7.1.
So that likely explains it.
Well, there is also a possibility that you have so much data...
Try running your search separately for a few individual hours and check the number of results and the storage usage of each run.
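As a rough sketch of such a check, one hour can be bounded with time modifiers in the search itself. The particular hour chosen here and the summary field names (DistinctTxIds, Events) are only illustrative assumptions; the disk usage of the finished job can then be read in the Job Inspector.

```
index="app_ABC123" source="/var/abc/appgroup123/logs/app123/stat.log" earliest=-2h@h latest=-1h@h
| fields TxId
| stats count as TotalEvents by TxId
| stats count as DistinctTxIds, sum(TotalEvents) as Events
```

If the number of distinct TxId values per hour is already very large, that points to the stats step itself as the part that fills the disk quota.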
The search runs to completion over a smaller time window, but I was hoping to be able to run it over a 24-hour period.
Maybe one more.
Add "TxId=*" to your first line so that it returns only events in which this field has some value. And if there are a lot of results, use "sort 0 Total..." because sort has a default result count limit (10,000).
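Put together, that variant would look roughly like this. It is only a sketch built from the search earlier in the thread, and it assumes TxId is extracted at search time for these events:

```
index="app_ABC123" source="/var/abc/appgroup123/logs/app123/stat.log" TxId=*
| fields TxId
| stats count as TotalEvents by TxId
| where TotalEvents>100
| sort 0 TotalEvents desc
```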
Then you can also try with tstats and TERM + PREFIX if those
| tstats count as TotalEvents where index=app_ABC123 source=/var/abc/appgroup123/logs/app123/stat.log TERM(TxId=*) by PREFIX(TxId=)
| where TotalEvents > 100
| sort 0 TotalEvents desc
The last one is definitely the most efficient if you can get it to work (it should).
I just tested those three with _internal and metrics.log (last 24h on my workstation) and the run times were
4.874s vs 4.715s vs 0.583s
r. Ismo
More about tstats with TERM and PREFIX can be found in the .conf presentations PLA1089C and TRU1133B.
Thanks for the suggestion! I tried this, and it ran very fast without errors, but it returned 0 statistics. I know there are definitely TxIds that have more than 100 events during the search timeframe.