I am using Splunk Enterprise 6.6.2, and today I noticed an alarming problem.
To troubleshoot it, I created a bare-bones version of my dashboard:
<form>
<label>Quotation View v1 Clone</label>
<search id="qv">
<query>index=summary_price source=summary-price-quotation-view | fields count
</query>
<earliest>$time.earliest$</earliest>
<latest>$time.latest$</latest>
</search>
<fieldset submitButton="false" autoRun="true">
<input type="time" token="time" searchWhenChanged="true">
<label>Time</label>
<default>
<earliest>-1d@d</earliest>
<latest>@d</latest>
</default>
</input>
</fieldset>
<row>
<panel>
<table>
<title>Normal Search</title>
<search>
<query>index=summary_price source=summary-price-quotation-view | stats sum(count) as count
</query>
<earliest>$time.earliest$</earliest>
<latest>$time.latest$</latest>
</search>
</table>
</panel>
<panel>
<table>
<title>Uses Base Search</title>
<search base="qv">
<query>| stats sum(count) as count
</query>
</search>
</table>
</panel>
</row>
</form>
The first table doesn't use the base search, while the second table does. Once the post-process query is applied to the base search, the two searches are effectively identical. However, they give different results (2,526,053 vs. 2,086,762), as shown in the attached image, and the difference is huge!
What can explain the difference? Is it a bug in Splunk?
Try this. Is count a field in your data, or are you using Splunk's stats count function?
<form>
<label>Quotation View v1 Clone</label>
<search id="qv">
<query>index=summary_price source=summary-price-quotation-view
</query>
<earliest>$time.earliest$</earliest>
<latest>$time.latest$</latest>
</search>
<fieldset submitButton="false" autoRun="true">
<input type="time" token="time" searchWhenChanged="true">
<label>Time</label>
<default>
<earliest>-1d@d</earliest>
<latest>@d</latest>
</default>
</input>
</fieldset>
<row>
<panel>
<table>
<title>Normal Search</title>
<search>
<query>index=summary_price source=summary-price-quotation-view | stats count</query>
<earliest>$time.earliest$</earliest>
<latest>$time.latest$</latest>
</search>
<option name="refresh.display">progressbar</option>
</table>
</panel>
<panel>
<table>
<title>Uses Base Search</title>
<search base="qv">
<query>| stats count</query>
</search>
<option name="refresh.display">progressbar</option>
</table>
</panel>
</row>
</form>
Thanks to mayurr98, the root problem has been found. It's related to limits.conf:
http://docs.splunk.com/Documentation/Splunk/latest/Viz/Savedsearches#Post-process_searches_2
Look for the post-process search section and the max_count setting in limits.conf.
Yes, your issue was due to a setting defined in limits.conf, but increasing that value only masks the true root cause, which is that your base search did not use a transforming command. Consider this advice from Splunk's documentation:
Best practices
Use these best practices to make sure that post-process searches work as expected.
Use a transforming base search
A base search should be a transforming search that returns results formatted as a statistics table.
Non-transforming base search issues
Non-transforming base searches can cause the following search result and timeout issues. If you observe these issues in a dashboard, check the base search to make sure that it is a transforming search.
Event retention
If the base search is a non-transforming search, the Splunk platform retains only the first 500,000 events that it returns. A post-process search does not process events in excess of this 500,000 event limit, silently ignoring them. This can generate incomplete data for the post-process search.
This search result retention limit matches the max_count setting in limits.conf. The setting defaults to 500,000.
Client timeout
If the post-processing operation takes too long, it can exceed the Splunk Web client timeout value of 30 seconds.
For more information about transforming searches, see transforming commands and searches in the Search Manual.
Avoid referencing fields only in post-process searches
In post-process searches, reference fields that are also referenced in the base search. If you are not referencing a particular field in the base search, do not reference it in the post-process search. Fields without a reference in the base search appear null in a post-process search. The post-process search returns no results in this case.
Limit base search results and post-process complexity
Passing a large number of search results to a post-process search can cause server timeout issues. In this scenario, consider adjusting the base search to reduce the number of results and fields that it returns. You can also consider reducing the complexity of post-process operations on the base search results.
@micahkemp Agreed. I have edited my previous comment so that it no longer mentions increasing the limit (and I will leave it that way, since the root cause really is hitting that limit).
In my real dashboard I actually ended up not using a base search at all, as I have 4 tables to populate and each one uses a different set of grouping fields for stats sum(count). So I can't use a transformation in the base search that fits all 4 tables.
Just so I'm not coming across the wrong way, I'm not telling you what to do. 🙂
I just want to make sure it's understood that the limits are in place for a reason, and simply increasing them is likely not a long term solution to the problem at hand.
As the comments above note, there is definitely a limit on the number of results returned by a base search. You should use a reporting (transforming) command in your base search unless you know the results will always be under the limit (and even then it's still best to do so). Sometimes it can be tricky to find a common reporting command that lets all your dependent panels share the same base search, but you should definitely strive to do so.
Unfortunately, in this example the base search will contain all the logic:
index=summary_price source=summary-price-quotation-view | stats count
and the post-process search in the panel will be empty.
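For a dashboard where several panels group by different fields, one possible shape is a base search that aggregates by the union of the grouping fields, with each panel re-aggregating the already-summed rows. The following is only a sketch under assumptions: field_a and field_b are hypothetical field names that do not appear in the original dashboard, and it presumes the number of distinct field_a/field_b combinations stays well under the 500,000 result limit.

```xml
<form>
  <label>Transforming Base Search Sketch</label>
  <!-- Transforming base search: returns one row per (field_a, field_b)
       combination instead of raw events.
       field_a and field_b are hypothetical names used for illustration. -->
  <search id="qv">
    <query>index=summary_price source=summary-price-quotation-view | stats sum(count) as count by field_a, field_b</query>
    <earliest>$time.earliest$</earliest>
    <latest>$time.latest$</latest>
  </search>
  <fieldset submitButton="false" autoRun="true">
    <input type="time" token="time" searchWhenChanged="true">
      <label>Time</label>
      <default>
        <earliest>-1d@d</earliest>
        <latest>@d</latest>
      </default>
    </input>
  </fieldset>
  <row>
    <panel>
      <table>
        <title>By field_a</title>
        <!-- Post-process: sums the partial sums from the base search -->
        <search base="qv">
          <query>stats sum(count) as count by field_a</query>
        </search>
      </table>
    </panel>
    <panel>
      <table>
        <title>By field_b</title>
        <search base="qv">
          <query>stats sum(count) as count by field_b</query>
        </search>
      </table>
    </panel>
  </row>
</form>
```

Summing the partial sums gives the same totals as summing the raw events, as long as every event has a value for each grouping field; stats drops events where a by-field is null, so panels could undercount if some events lack field_a or field_b.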
@micahkemp Thanks for your advice. The index=summary_price source=summary-price-quotation-view | stats count query you mentioned was actually just a query to help troubleshoot the problem in my original post.
Not an answer, but a side comment: you don't need the leading | in the search string of the panel that uses the base search.
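As a small illustration of that side comment, the post-process panel from the original dashboard could be written like this (a sketch that reuses the qv base search id from the question):

```xml
<search base="qv">
  <!-- No leading pipe: the query is applied to the base search's results -->
  <query>stats sum(count) as count</query>
</search>
```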
@mayurr98 Thanks for your suggestion. First of all, "count" is a field name. I tried your suggestion, and I again got different results: Normal Search = 691,140, while Uses Base Search = 500,000, which looks very suspicious. Is there a limit on how many results a base search can return?
Yes, I think it's related to limits.conf. Refer to this link:
http://docs.splunk.com/Documentation/Splunk/latest/Viz/Savedsearches#Post-process_searches_2
Look for the post-process search section. I think you need to change the max_count setting in limits.conf.
@mayurr98 That's a good point, and I think that must be the reason. Let me try to modify the base search. Thanks a lot!
BTW, the result given by the non-base-search is right, as verified independently using the raw data.
@patng_nw It seems to be a time range issue. Could you please add
<earliest>$time.earliest$</earliest>
<latest>$time.latest$</latest>
to the "Uses Base Search" table and then check?
@493669 I just did that, and it makes no difference. In fact, the dashboard editor complained that those two elements are not allowed when the query uses a base search.
We can also tell from my Jobs page image that the time ranges of the two searches are the same.