Solved: Can you improve the performance of the "geostats" ...

charliedgz · ‎01-24-2017

I am trying to speed up a panel on a dashboard that is using geostats, as it is the slowest thing on the dashboard. I want to use a non streaming command and then use geostats from there.

index=someindex sourcetype=somesource | stats count by longitude,latitude,bloodType | geostats latfield=longitude longfield=longitude count by bloodType

The problem is the "count by bloodType" part. I really want to count by bloodType but count the count not the individual event.
Say I had this as a result after the stats command:

lat: -100 long: 10 type:O count:10
lat: -100 long: 10 type:B count:5
lat: -100 long: 10 type: AB count:1

Using geostats, I want the pie chart for that location to show 10/5/1, instead it shows 1/1/1. Does that make sense?

Everything I want for the stats output is good. Geostats doesn't take into account the count it just takes the row as one event. SO if I have 700k events I end up with 70k events in my piecharts because it is not counting the individual events in bloodType. Don't know if my example helped.

I just want to know if what I am trying to do is even possible. I have tried multiple things just haven't been able to accomplish what I want to achieve. I will expand later... just wanted to get the initial question up. Thanks!

DalJeanis · ‎01-24-2017

The problem is that geostats is getting confused between the count field that came out of the stats command and the count it's generating itself. The latter, being more "local", will supersede and hide the former. So, just try this -

index=someindex sourcetype=somesource 
| stats count as bloodTypeCount by longitude,latitude,bloodType 
| geostats latfield=longitude longfield=longitude sum(bloodTypeCount) as count by bloodType

verified with the following test code

| makeresults 
| eval latitude=32, longitude=-96
| eval BloodTrans=mvappend("A", "B", "AB", "O")
| eval BloodCount=mvappend("0", "5", "1", "11")
| append 
   [| makeresults 
    | eval latitude=45, longitude=-100
    | eval BloodTrans=mvappend("A", "B", "AB", "O")
    | eval BloodCount=mvappend("6", "3", "0", "8")
    ]
| eval Bloods=mvzip(BloodTrans,BloodCount,"!!!!")
| table _time latitude longitude Bloods
| mvexpand Bloods
| rex field=Bloods "^(?<bloodType>[^!]*)!!!!(?<bloodTypeCount>.*)$"
| table _time latitude longitude bloodType bloodTypeCount

everything above this was just to generate some data points for the following

  | geostats latfield=latitude longfield=longitude sum(bloodTypeCount) by bloodType

View solution in original post

DalJeanis · ‎01-24-2017

Your example is fine. Solution is below.

DalJeanis · ‎01-24-2017

The problem is that geostats is getting confused between the count field that came out of the stats command and the count it's generating itself. The latter, being more "local", will supersede and hide the former. So, just try this -

index=someindex sourcetype=somesource 
| stats count as bloodTypeCount by longitude,latitude,bloodType 
| geostats latfield=longitude longfield=longitude sum(bloodTypeCount) as count by bloodType

verified with the following test code

| makeresults 
| eval latitude=32, longitude=-96
| eval BloodTrans=mvappend("A", "B", "AB", "O")
| eval BloodCount=mvappend("0", "5", "1", "11")
| append 
   [| makeresults 
    | eval latitude=45, longitude=-100
    | eval BloodTrans=mvappend("A", "B", "AB", "O")
    | eval BloodCount=mvappend("6", "3", "0", "8")
    ]
| eval Bloods=mvzip(BloodTrans,BloodCount,"!!!!")
| table _time latitude longitude Bloods
| mvexpand Bloods
| rex field=Bloods "^(?<bloodType>[^!]*)!!!!(?<bloodTypeCount>.*)$"
| table _time latitude longitude bloodType bloodTypeCount

everything above this was just to generate some data points for the following

  | geostats latfield=latitude longfield=longitude sum(bloodTypeCount) by bloodType

charliedgz · ‎01-25-2017

Worked as expected. Much appreciated. I had tried using sum before but I didn't do the 'count by' bit. Thanks for including the test code... that taught me a lot right there.

And I will have you know that the geostats panel went from 1min30 seconds before to 38 seconds to complete the same data set of 2.6 mil events. Definitely an improvement!

DalJeanis · ‎01-25-2017

Nice!

If you are trying to speed the response, make sure that the data to be analyzed has been culled at the earliest point possible -- use table or fields commands to limit the data to only the fields you need -- and do any reformatting at the latest point possible. For instance, pass all the dates as Epoch time and do summary/stats commands in that form, then at the last moment format them for human consumption.

Can you improve the performance of the "geostats" command by using a non-streaming command first?

Strengthen Your Future: A Look Back at Splunk 10 Innovations and .conf25 Highlights!

Now Offering the AI Assistant Usage Dashboard in Cloud Monitoring Console

Stay Connected: Your Guide to October Tech Talks, Office Hours, and Webinars!

Are you a member of the Splunk Community?

Can you improve the performance of the "geostats" command by using a non-streaming command first?

Strengthen Your Future: A Look Back at Splunk 10 Innovations and .conf25 Highlights!

Now Offering the AI Assistant Usage Dashboard in Cloud Monitoring Console

Stay Connected: Your Guide to October Tech Talks, Office Hours, and Webinars!