We have an external database. The database is emptied & re-populate daily. The content is 80% similiar as the previous database. There is no uniq index field saying the 1st 80% are old data & the last 20% of data are new incoming data. The total records are 6 million.
We use Splunk DB Connect to bring in the data. Over time, the total records become 50 million. 8x more data. The simple search like index="XXX" is taking longer than dbxquery and saved search.
We observed using stats can improve the stat command for index="XXX". But, we still need to further improve the query turnaround time. Hence we are thinking of substituting the index="XXX" with dbxquery or saved_search to further boost the stats query. We tried with 3 syntaxes attached. Observed "no result found" when substituting index="XXX" with dbxquery & saved_searched. Can we learn the right syntax to put tstats & dbxquery or saved_searched together, please?
Maybe. It depends on how you want to use them and what the end goal is. Tell us more about your use case.
tstats and dbxquery are both generating commands so there are limits to how they can be used together. Either command can be in a subsearch or tstats can use the append=t option.
## This one work
| tstats count where (index="XXX" ) groupby date_time
| where date_time like "2022%"
| fields date_time
| dedup date_time
# This one not working
| tstats count where (loadjob savedsearch="XXX:YYY") groupby date_time
| where date_time like "2022%"
| fields date_time
| dedup date_time
# This one not working
| tstats count where (dbxquery XXX) groupby date_time
| where date_time like "2022%"
| fields date_time
| dedup date_time
The first command works because it uses correct syntax.
The others attempt to replace a set of key=value pairs with a command that returns who knows what. Per the Search Reference Manual:
WHERE clauses in tstat searches must contain field-value pairs that are indexed, as well as characters that are not major breakers or minor breakers.
If you can get the loadjob or dbxquery command to return a list of key=value pairs then it might work. Keep in mind that both are generating commands so they must be prefixed by |. You may need to use a subsearch containing a format command.
| tstats count where date_time="2022*" ( [ | loadjob savedsearch="XXX:YYY" | format ] ) groupby date_time
| fields date_time
Notice how the where clause moved into the tstats command for better performance and the redundant dedup command was removed (the groupby option removes duplicates).