I have an issue on one of my two search head clusters where the column order is reversed when running timechart. For example, if I run a simple search such as index=notable | timechart count, the count and _time fields are displayed but they are reversed so that count is first and _time is second. This is also switching the values of the two axes causing the timechart visualization to fail. As the search is running I can see the _time field is first, but once it completes they switch places.
As I mentioned the other search head cluster works fine. I've tried forcing the order with | table _time count after the timechart command but that doesn't change the order. If I do | bin _time | stats count by _time it fails as well.
Splunk support asked us to remove phased_execution_mode=singlethreaded from the ES cluster, which fixed the timechart issue. I believe this is documented in the known issues as SPL-164718 and SPL-165363.
Unfortunately, without this setting our CPU usage spikes constantly, so although we can't fully fix it, we at least know the cause.
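For anyone else hitting this: the setting in question is a limits.conf attribute. A sketch of the stanza we removed is below; the exact file layering (system/local vs. an app's local directory) is an assumption from our deployment, so check where it's actually set in yours with btool.

```
# limits.conf — sketch of the stanza we removed to fix the timechart
# column order (placement assumed; verify with
# `splunk btool limits list search --debug` on your own deployment)
[search]
phased_execution_mode = singlethreaded
```

Removing the line (or setting it back to the default, auto) resolved the reversed columns for us, at the cost of the CPU spikes mentioned above.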
Strange - I've just tried index=_internal | timechart count
on my 7.2.1 instance (_internal because I'm just evaluating it and there is no data on it yet) and it came out normal: _time first, count second.
Just out of curiosity, I then tried this: index=_internal | bucket _time span=10m | stats count by _time
and it also worked - you then choose Timechart as your visualization and it displays just fine.
Yes, this is what we also see on the working cluster. On the Enterprise Security cluster it reverses the columns and axes.
Is the issue with Splunk Enterprise or Enterprise Security?
Splunk Enterprise as far as I can tell, since it occurs in any app on the SH cluster; however, the affected cluster also runs Enterprise Security.
Probably not too helpful, but a few things I'd try to see if you can figure out any patterns - not looking for you to answer these questions here, just trying to provide some thoughts...
Does it happen with any base search? With any aggregate? Does it matter if you rename the aggregate in the timechart command? Does the issue persist if you add a by clause?
Does the behavior exist on all search heads in that cluster? Does it exist across apps on a search head? Does it happen with the chart command as well? Even if it's the same Splunk version across both SHCs, were there any recent changes to this SHC that could be related?
Does it persist across different browsers? After clearing the cache?
I do have a vague recollection of something similar, but nothing I can definitely remember/confirm.
Good troubleshooting tips, I'll run through them and post any new info. I did find that | timechart count(_raw) as count sometimes works, but it's not reliable.
What is the Splunk version? Also try using the fields command.
We're running 7.2.1 across the board. I've tried fields as well as table with no success.