Hi,
I have the following dataset:
2014-03-15 17:23:17 host2 transaction="7WB1Hh7VpxWsDae" action="request" uri="/bla/fasel/bar.png" requestSize="123"
2014-03-15 17:23:18 host1 transaction="We6TaPibMJhdYI5" action="request" uri="/bla/fasel/foo.jpg" requestSize="127"
2014-03-15 17:23:19 host2 transaction="7WB1Hh7VpxWsDae" action="response" responseSize="45678" code="200"
2014-03-15 17:23:20 host1 transaction="We6TaPibMJhdYI5" action="response" responseSize="4567" code="200"
I need the following aggregated table as a result:
| uri                | sum(requestSize) | sum(responseSize) |
+--------------------+------------------+-------------------+
| /bla/fasel/bar.png |              123 |             45678 |
| /bla/fasel/foo.jpg |              127 |              4567 |
+--------------------+------------------+-------------------+
The maximum time of a transaction is 600 seconds (device timeout).
Until now I've been using the following search:
sourcetype=foo | transaction maxspan=600 maxevents=2 host,transaction | stats sum(responseSize), sum(requestSize) by uri
On a larger dataset this doesn't really scale well; the runtime is several hours.
Is there a better way to achieve the desired results?
Are there some tricks to optimize this query, especially the transaction command?
Maybe there is a workaround by using stats only?
I'm looking forward to your ideas, thanks in advance!
-Lorenz
Using stats may be an option if the transaction ID field is unique over the entire time range. In many cases it's faster, but it doesn't have to be. You'd do something like this:
sourcetype=foo | stats first(uri) as uri first(requestSize) as requestSize first(responseSize) as responseSize by host transaction | stats sum(requestSize) sum(responseSize) by uri
Note that this will break horribly if you have non-unique transaction IDs, and it will also go up in flames if you have more than one event per transaction ID carrying the fields uri, requestSize, or responseSize.
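If more than one event per transaction ID could carry those fields, a slightly more defensive variant (just a sketch, untested against your data) would sum within each transaction first:
sourcetype=foo | stats first(uri) as uri sum(requestSize) as requestSize sum(responseSize) as responseSize by host transaction | stats sum(requestSize) sum(responseSize) by uri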
The best way to speed up either transaction or stats is of course to throw out as many events in your base search as possible... does your search for sourcetype=foo yield any events you don't need?
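For example, if sourcetype=foo contained event types beyond the request/response pairs, restricting the base search (hypothetical filter, adjust to your data) might look like this:
sourcetype=foo (action="request" OR action="response")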
Beyond that, you can build summary indexing searches that compute e.g. an hourly pre-summary which makes building the overall summary lightning quick. That's a bit tricky to do... if you need in-person help for that, you sound not too far away from me 😉
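As a rough sketch (assuming a summary index named summary, fed by a scheduled hourly search; the names are placeholders), the summarizing search could be:
sourcetype=foo | stats first(uri) as uri first(requestSize) as requestSize first(responseSize) as responseSize by host transaction | stats sum(requestSize) as requestSize sum(responseSize) as responseSize by uri | collect index=summary
The final report then becomes a cheap scan of the summary:
index=summary | stats sum(requestSize) sum(responseSize) by uri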
Missing entries are odd indeed. Are those occurring while hitting the apparent memory limit or not?
As for the memory limit, hitting the 200 MB limit for max_mem_usage_mb should cause swapping to disk, but should not mess with the results. If your missing entries etc. are reproducible, you could bump up the value in limits.conf to see if the problem goes away.
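For example, in limits.conf (the 500 is just a guess; measure before settling on a value):
[default]
max_mem_usage_mb = 500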
In production use I'd have short summarizing searches run over a small time range frequently - memory would not be a concern.
It seems (from search.log) as if the system is running into a memory limit of 200 MB.
Thanks,
sourcetype=foo only contains events that I need. transaction is unique per host, so this really should work.
In a first test on a three-hour sample, search time was reduced from 3300 seconds to 800 seconds.
But I discovered that the output of the two searches (with transaction and with stats) differs:
* Grouping sometimes fails
* Some entries are missing
What could be the problem here?
Hi HansWurscht,
try something like this:
... | transaction maxspan=600s maxevents=2 startswith="request" endswith="response" "host" "transaction" | ...
I often get better performance if I put the needed transaction fields in quotes, because when a quoted list of fields is specified, events are grouped together if they have the same value for each of the fields.
Also, if you define a startswith and an endswith and set the maxspan in seconds or minutes, it should be faster.
hope this helps ...
cheers, MuS
Then try what martin_mueller suggested and build a base search which returns the best result for you. Also try streamstats, for example as sketched below.
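A streamstats-based pairing could look like this (untested sketch; it keeps the later of the two events per transaction, which by then carries both sizes regardless of arrival order, and drops transactions that never got a response):
sourcetype=foo | streamstats values(uri) as uri values(requestSize) as requestSize values(responseSize) as responseSize by host transaction | where isnotnull(requestSize) AND isnotnull(responseSize) | dedup host transaction | stats sum(requestSize) sum(responseSize) by uri
The time_window option of streamstats might also help to enforce your 600-second transaction limit.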
In my case, this did not increase search performance.
Thanks,
I will try to put the transaction fields in quotes and will report back on possible performance improvements.
I think I can't use startswith and endswith, because request and response might arrive in reverse order at my Splunk system (transfer via syslog).
-Lorenz