Hi,
I have the following dataset:
2014-03-15 17:23:17 host2 transaction="7WB1Hh7VpxWsDae" action="request" uri="/bla/fasel/bar.png" requestSize="123"
2014-03-15 17:23:18 host1 transaction="We6TaPibMJhdYI5" action="request" uri="/bla/fasel/foo.jpg" requestSize="127"
2014-03-15 17:23:19 host2 transaction="7WB1Hh7VpxWsDae" action="response" responseSize="45678" code="200"
2014-03-15 17:23:20 host1 transaction="We6TaPibMJhdYI5" action="response" responseSize="4567" code="200"
I need the following aggregated table as a result:
| uri                | sum(requestSize) | sum(responseSize) |
+--------------------+------------------+-------------------+
| /bla/fasel/bar.png |              123 |             45678 |
| /bla/fasel/foo.jpg |              127 |              4567 |
+--------------------+------------------+-------------------+
The maximum time of a transaction is 600 seconds (device timeout).
Until now I've been using the following search:
sourcetype=foo | transaction maxspan=600 maxevents=2 host,transaction | stats sum(responseSize), sum(requestSize) by uri
On a larger dataset this doesn't really scale well; the runtime is several hours.
Is there a better way to achieve the desired results?
Are there some tricks to optimize this query, especially the transaction command?
Maybe there is a workaround by using stats only?
I'm looking forward to your ideas, thanks in advance!
-Lorenz
Using stats may be an option if the transaction ID field is unique over the entire time range. In many cases it's faster, but it doesn't have to be. You'd do something like this:
sourcetype=foo | stats first(uri) as uri first(requestSize) as requestSize first(responseSize) as responseSize by host transaction | stats sum(requestSize) sum(responseSize) by uri
Note that this will break horribly if you have non-unique transaction IDs, and it will also go up in flames if you have more than one event per transaction ID carrying the fields uri, requestSize, or responseSize.
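If more than one event per transaction ID could carry those fields, a slightly more defensive variant (just a sketch, untested against your data) would sum within each transaction first:
sourcetype=foo | stats first(uri) as uri sum(requestSize) as requestSize sum(responseSize) as responseSize by host transaction | stats sum(requestSize) sum(responseSize) by uri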
The best way to speed up either transaction or stats is of course to throw out as many events in your base search as possible... does your search for sourcetype=foo yield any events you don't need?
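For example, if sourcetype=foo contained event types beyond the request/response pairs, restricting the base search (hypothetical filter, adjust to your data) might look like this:
sourcetype=foo (action="request" OR action="response")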
Beyond that, you can build summary indexing searches that compute e.g. an hourly pre-summary which makes building the overall summary lightning quick. That's a bit tricky to do... if you need in-person help for that, you sound not too far away from me 😉
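As a rough sketch (assuming a summary index named summary, fed by a scheduled hourly search; the names are placeholders), the summarizing search could be:
sourcetype=foo | stats first(uri) as uri first(requestSize) as requestSize first(responseSize) as responseSize by host transaction | stats sum(requestSize) as requestSize sum(responseSize) as responseSize by uri | collect index=summary
The final report then becomes a cheap scan of the summary:
index=summary | stats sum(requestSize) sum(responseSize) by uri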
Missing entries are odd indeed. Are those occurring while hitting the apparent memory limit or not?
As for the memory limit, hitting the 200 MB limit for max_mem_usage_mb should cause swapping to disk, but should not mess with the results. If your missing entries etc. are reproducible, you could bump up the value in limits.conf to see if the problem goes away.
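For example, in limits.conf (the 500 is just a guess; measure before settling on a value):
[default]
max_mem_usage_mb = 500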
In production use I'd have short summarizing searches run over a small time range frequently - memory would not be a concern.
It seems (from search.log) as if the system is running into a memory limit of 200 MB.
Thanks,
sourcetype=foo only contains events that I need. transaction is unique per host, so this really should work.
In a first test on a three-hour sample, search time was reduced from 3300 seconds to 800 seconds.
But I discovered that the output of the two searches (with transaction and with stats) differs:
* Grouping sometimes fails
* Some entries are missing
What could be the problem here?
Hi HansWurscht,
try something like this:
... | transaction maxspan=600s maxevents=2 startswith="request" endswith="response" "host" "transaction" | ...
I often get better performance if I put the needed transaction fields in quotes, because when a quoted list of fields is specified, events are grouped together if they have the same value for each of the fields.
Also, if you define a startswith and an endswith and set the maxspan in seconds or minutes, it should be faster.
hope this helps ...
cheers, MuS
Then try what martin_mueller suggested and build a base search which returns the best result for you. Also try streamstats, for example as sketched below.
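A streamstats-based pairing could look like this (untested sketch; it keeps the later of the two events per transaction, which by then carries both sizes regardless of arrival order, and drops transactions that never got a response):
sourcetype=foo | streamstats values(uri) as uri values(requestSize) as requestSize values(responseSize) as responseSize by host transaction | where isnotnull(requestSize) AND isnotnull(responseSize) | dedup host transaction | stats sum(requestSize) sum(responseSize) by uri
The time_window option of streamstats might also help to enforce your 600-second transaction limit.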
In my case, this did not increase search performance.
Thanks,
I will try to put the transaction fields in quotes and will report back on possible performance improvements.
I think I can't use startswith and endswith, because request and response might arrive in reverse order at my Splunk system (transfer via syslog).
-Lorenz