Splunk Search

Chart only showing 1000 events

herbie
Path Finder

Hi, I'm trying to create a chart of results over time, however the chart only charts the first 1000 results. I'm using the following search over 1 day:

index=prod sourcetype="websphere:nativestdouterrlog" | chart avg(exclusiveaccessms) over _time

This returns about 6500 results across the day, and I need to create a line chart from that. I can see all the results in the table, however the chart stops at the 1000th event.

I've also tried using the table & timechart functions, but both have the same problem.

Is there a limit somewhere which I can change to correct this?

Thanks Ashley

1 Solution

sideview
SplunkTrust

index=prod sourcetype="websphere:nativestdouterrlog" | chart avg(exclusiveaccessms) over _time

will only graph 10,000 rows. But the problem here is that chart by itself will not do any bucketing by time. So if you look at the rows, the timestamps are the same timestamps from the events -- these rows are just events really. So all that's happening is the default limit on the chart command is kicking in (and this is a good thing). (If for some reason you really wanted to use the chart command or stats command over _time yourself instead of just using timechart, you'll have to manually bucket the _time values with the bucket command.)
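As a rough sketch of the manual bucketing mentioned above: the `bucket` command (also available as `bin`) rounds each `_time` value down to a fixed span before `chart` aggregates over it. The 5-minute span is only an assumption for illustration; pick whatever resolution suits your data.

```spl
index=prod sourcetype="websphere:nativestdouterrlog"
| bucket _time span=5m
| chart avg(exclusiveaccessms) over _time
```

Over one day this yields one row per 5-minute bucket (288 rows) rather than one row per event, which is essentially what timechart does for you automatically.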

What you want to run is:

index=prod sourcetype="websphere:nativestdouterrlog" | timechart avg(exclusiveaccessms)

I'm not sure what problem you were running into when you tried it, but that will work fine and work properly well past 10,000 rows and up into millions of events and beyond. Granted timechart will only return a number of rows far less than the number of events, but that's the point - timechart buckets millions of events into aggregated buckets of time and then graphs the aggregate statistics, not the raw events.

It is possible, but not likely, that someone set an obscure limit on timechart or on the whole system. You could check your etc/system/local/limits.conf for that, but it would have been a deliberate action taken by some admin in your deployment.

UPDATE:

I still strongly recommend some form of approach where you bucket the data. Graphing 100,000 rows in the flash chart just isn't a story that ends well.

1) per my comment below, you might want to explore using max(exclusiveaccessms) in addition to avg(exclusiveaccessms)

2) You might want to explore the span and bins arguments to timechart, because you can use this to increase the granularity of the timechart's buckets to whatever you need. For instance:

<your search> | timechart span=1h min(exclusiveaccessms) avg(exclusiveaccessms) max(exclusiveaccessms) 

Will give you min, avg and max for every hour. And this example:

<your search> | timechart bins=1000 min(exclusiveaccessms) avg(exclusiveaccessms) max(exclusiveaccessms) 

will make much more granular buckets (I think the default is around 250 or 300), but it will still determine the exact number of buckets from the time range it's given.


yannK
Splunk Employee

Method to bump the value on a per-chart basis for simple XML on 6.2:

<option name="charting.data.count">9999</option>

see http://docs.splunk.com/Documentation/Splunk/6.2.1/AdvancedDev/AdvChartingConfig-LayoutData#Timeline_...

herbie
Path Finder

I've sort of found a solution. If you are using advanced XML to create the dashboard/chart, there is a param called 'maxResultCount' which tells it how many results can be plotted per series. This sits under the FlashChart module. The strange thing is that the Splunk documentation on this says the default is 250, not 1000.

Be careful though, the documentation says changing it can cause unexpected UI behaviour, and I've noticed that when you use it with a large number your browser starts using heaps of memory (I had it up to 500MB, and it wasn't responding very well).

Here's an example of the module in my XML, you can see where the maxResultCount parameter sits:

<module name="HiddenPostProcess" layoutPanel="panel_row1_col1" group="JVM Heap Usage" autoRun="True">
    <param name="search">table _time T_Total T_Used N_Total N_Used</param>
    <param name="groupLabel">JVM Heap Usage</param>
    <module name="ViewstateAdapter">
        <module name="JobProgressIndicator">
            <module name="EnablePreview">
                <param name="enable">True</param>
                <param name="display">False</param>
                <module name="HiddenChartFormatter">
                    <param name="charting.chart">line</param>
                    <param name="charting.axisTitleX.text"></param>
                    <param name="charting.axisTitleY.text"></param>
                    <param name="charting.chart.nullValueMode">connect</param>
                    <module name="FlashChart">
                        <param name="width">100%</param>
                        <param name="height">400px</param>
                        <param name="maxResultCount">10000</param>
                    </module>
                    <module name="ViewRedirectorLink">
                        <param name="viewTarget">flashtimeline</param>
                    </module>
                </module>
            </module>
        </module>
    </module>
</module>

sideview
SplunkTrust

Indeed, raising this to 10000 rows is a bad idea, and you're not really getting anything in return; there are not 10,000 pixels in your display, so the herculean effort of pulling down all the data and charting it in Flash is wasted. I still recommend allowing timechart to bucket the times; just use min/max/percentiles to better effect (see my answer here for more details).


herbie
Path Finder

No, I didn't need to restart anything, just update the dashboard through the Manager and it should pick it up straight away. Not sure why it's not working for you???


alex_exe_
Explorer

Hi again, I still don't get the results. After changing the value, the charts still show only 1000 results. 😞

No restart is needed when changing this?

Thanks in advance, Alex


alex_exe_
Explorer

Hi, I will try it and get back with the result. But the memory usage is kind of scary.

Thanks for the update.


alex_exe_
Explorer

Hi Ashley, I currently have the same problem. I can't show more than 1000 events on the chart. For instance, if I want to show 30 minutes by second, I get 30 x 60 = 1800 points.

It will only show 1000 events... 😞

Any new ideas on this?


alex_exe_
Explorer

Hi Ashley, I'm still investigating, but it seems the only way to do it is to somehow change that limit of 1000.

If I find an answer, I'll post it here.

Cheers,
Alex


herbie
Path Finder

Hi Alex, No I haven't been able to get around this, it's still a problem for me.

I've had to resort to using 'timechart bins=1000' to try to get as close as possible, but even that isn't great because it only really does rounded spans and nothing in between (i.e., it goes 1m, 5m, 30m, 1h, 1d), and won't do anything between 1h and 1d.
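For what it's worth, a possible workaround for the rounded bucket sizes is to give timechart an explicit span instead of bins. A minimal sketch, assuming a 2-hour span is wanted (the value is purely illustrative):

```spl
<your search> | timechart span=2h min(exclusiveaccessms) avg(exclusiveaccessms) max(exclusiveaccessms)
```

The trade-off is that a fixed span does not adapt to the selected time range, so with a TimeRangePicker the number of plotted points grows with the range and can still run into the chart's result limit.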

My dashboards have a TimeRangePicker on them so the users can select their own timerange, but because of this the scale and look of the chart changes dramatically.

If you find anything, please let me know.

Cheers,
Ashley

sideview
SplunkTrust

Sounds like you want to graph timechart avg(exclusiveaccessms) max(exclusiveaccessms) then?
Timechart can graph several different stats on the same axis. I find myself commonly doing timechart min(foo) avg(foo) max(foo), which makes for a nice little visualization, or timechart min(foo) perc33(foo) perc67(foo) max(foo), and so on and so forth...


herbie
Path Finder

Hi Nick, thanks for your response. I understand that this command is returning all the events, as I don't actually want to average them. I'm trying to make graphs of JVM memory usage and we need to be able to accurately see when there are spikes, so when we show the graph over a longer period (i.e., 7 days) the averaged data becomes invalid.
I don't think the limit is to do with the chart/timechart commands themselves, because I get the same result if I use the table function. When I perform the search I can see all the rows being returned, it's just that the flash chart doesn't display them all.
