I am trying to find the difference of the dns type values for each domain in each time bucket. Let's say there are 10 queries and 22 responses for a given domain. The answer would be 22-10=12. My current search looks like this:
sourcetype=dns | bucket _time span=10m | rex "(?i)^.+\\s{2}\\..*?(?P<domain_root>[^\\.]+\\.[^\\.]+)(?=.$)" | stats count as c by domain_root _time dns_type
This produces the number of queries and replies per domain, but I don't know how to subtract them. I came across what may be a different approach using:
| stats count AS c0 count(eval(dns_type="Q")) AS cq count(eval(dns_type="R")) AS cr by domain_root _time | eval d=cq-cr
If this second approach is viable, would it be followed by eval d=cq-cr, even though cr and cq are computed for each domain and time bucket, or is there another solution? Even if the count-eval method works, I would still like to understand how the calculations are done when the "by" clause is used, and whether a solution exists for the first method (above) using "count by domain_root _time dns_type".
The second stats seems reasonable to me. The issue with the first one is that eval works on a per-event / per-row basis, so you'd have to merge each pair first before doing the calculation - the second stats already does that for you.
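For the first search, "merging each pair" could look roughly like the sketch below. It assumes dns_type only takes the values "Q" and "R" (as in the second search), and the leading "..." stands for everything before the stats in the original query. The first stats leaves one row per domain, time bucket, and type; the two evals move the count into a per-type column, and the second stats collapses each pair of rows into one before the subtraction.

```
... | stats count AS c by domain_root _time dns_type
| eval cq=if(dns_type="Q", c, 0)
| eval cr=if(dns_type="R", c, 0)
| stats sum(cq) AS cq sum(cr) AS cr by domain_root _time
| eval d=cr-cq
```

Here d is written as cr-cq so that the example in the question (22 responses, 10 queries) gives 12; swap the operands if the opposite sign is wanted.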
That'll run. Let func be avg, for example; then this will add a field c to every event that is the value of a plus the average of a calculated for each combination of x, y, and z.
You are very sharp. Let's consider eventstats, which keeps "a" then.
eventstats func(a) AS b by x y z | eval c=a+b
That's going to be trouble as well: there is no field called a after the stats.
You are correct. The "sum" function requires the stat function. I was thinking about something more like this:
stats func(a) AS b by x y z | eval c=a+b
That eval isn't going to run.
As for that stats, it will create a table with four columns: x, y, z, and b. You'll get one row for every combination of x, y, and z, and b will be func(a) for events matching that combination.
So
stats func(a) AS b by x y z | eval s=sum(b)
in effect creates the variable b.x.y.z, so that eval "s=sum(b)" is really in effect "s=sum(b.x.y.z)", which sums for each unique combination of x, y, and z, so that "table s x y z" can show a different value of "s" for each x, y, and z combination. Is this correct?
Yup, per row / per domain_root and _time.
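One way to convince yourself of the per-row behaviour is to run the count-eval search on synthetic events. The sketch below is only an illustration: it uses makeresults to generate 32 events, labels the first 10 as "Q" and the remaining 22 as "R" (the numbers from the question), and assigns a made-up domain_root of "example.com". The field n is a helper introduced here, not something from the original data.

```
| makeresults count=32
| streamstats count AS n
| eval dns_type=if(n<=10, "Q", "R")
| eval domain_root="example.com"
| bucket _time span=10m
| stats count AS c0 count(eval(dns_type="Q")) AS cq count(eval(dns_type="R")) AS cr by domain_root _time
| eval d=cr-cq
```

Since all generated events share one timestamp and one domain, this should return a single row with c0=32, cq=10, cr=22, and d=12. With d=cq-cr, as written in the thread, the same row would show -12, so pick the operand order that matches the sign you want.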
If I did the second stats approach, how would the eval look?
| stats count AS c0 count(eval(dns_type="Q")) AS cq count(eval(dns_type="R")) AS cr by domain_root _time | eval d=cq-cr
If so, then would this "d" be per domain and time bucket?