Splunk Search

Summarizing postfix logs

PickleRick
SplunkTrust
SplunkTrust

I'm not that bad in searching 😉 but this case is a little over my head and I need some clever idea.

I have postfix logs. They have three types of events. All events have queue_id field which identifies the message. The events have either from, to or (to and orig_to) fields set.

I want to do

stats values(from) by to orig_to

The problem is that the fields are in separate events. And both methods of joining them together are faulted in one way or another.

If I do

eventstats values(from) by queue_id

 I get the desired result but if I search over a longer timespan I'm hitting a memory limit. Sure, I can raise the limit in the config but it would be a better solution to find a more reasonable search for it 🙂

If I try to do

transaction queue_id

Of course everthing works but the transaction joins the to and orig_to fields into multivalued fields and there is no reliable way to "unjoin" them (over one transaction you can have more to values than orig_to and you can't simply do mapping between the values).

So I'm a little stuck how to transform my data to get from, to, orig_to tuples so I can later pass it to stats.

Any hints?

Of course, if nothing works I'll simply raise the limits and do eventstats but it's not a pretty solution.

Labels (2)
Tags (2)
0 Karma
1 Solution

PickleRick
SplunkTrust
SplunkTrust

Not a very elegant idea but it seems to work quite OK (and it doesn't exceed the limits).

index=postfix (some_additional_limiting_conditions) (to=* OR from=* OR orig_to=*)
| fields queue_id to from orig_to _time
| sort 0 _time
| streamstats current=t window=100 values(from) as from by queue_id
| stats values(from) as from latest(_time) as _time by to orig_to

That's pretty much what I needed all along 🙂

View solution in original post

0 Karma

yuanliu
SplunkTrust
SplunkTrust

I must have missed something here.  The standard way to join these field is just values(*) as * by unique ID.  Does it apply?

| fields from to orig_to queue_id
| stats values(*) as * by queue_id
| stats values(from) by to orig_to

 (BTW, I am curious about the semantic significance of the number.  Appreciate an elaboration.)

PickleRick
SplunkTrust
SplunkTrust

The problem is that if I aggregate by values(*), I'll get all to and orig_to values but I lose the connection between them.

Let's say I have it like (different events, same queue_id):

from: a@b.c
to: c@d.e
to: e@f.g, orig_to: g@h.i

After values(to) I wouldn't know which of them relates to the orig_to that we know.

But it seems sorting by time and streamstatsing does the job.

0 Karma

PickleRick
SplunkTrust
SplunkTrust

OK. It seems I can do I thought about streamstats but for that I think I'd have to re-sort "backwards" time-wise - another not very best idea.

0 Karma

PickleRick
SplunkTrust
SplunkTrust

Not a very elegant idea but it seems to work quite OK (and it doesn't exceed the limits).

index=postfix (some_additional_limiting_conditions) (to=* OR from=* OR orig_to=*)
| fields queue_id to from orig_to _time
| sort 0 _time
| streamstats current=t window=100 values(from) as from by queue_id
| stats values(from) as from latest(_time) as _time by to orig_to

That's pretty much what I needed all along 🙂

0 Karma
Get Updates on the Splunk Community!

Developer Spotlight with Paul Stout

Welcome to our very first developer spotlight release series where we'll feature some awesome Splunk ...

State of Splunk Careers 2024: Maximizing Career Outcomes and the Continued Value of ...

For the past four years, Splunk has partnered with Enterprise Strategy Group to conduct a survey that gauges ...

Data-Driven Success: Splunk & Financial Services

Splunk streamlines the process of extracting insights from large volumes of data. In this fast-paced world, ...