Solved: Summarizing postfix logs

PickleRick · ‎02-24-2022

I'm not that bad in searching 😉 but this case is a little over my head and I need some clever idea.

I have postfix logs. They have three types of events. All events have queue_id field which identifies the message. The events have either from, to or (to and orig_to) fields set.

I want to do

stats values(from) by to orig_to

The problem is that the fields are in separate events. And both methods of joining them together are faulted in one way or another.

If I do

eventstats values(from) by queue_id

I get the desired result but if I search over a longer timespan I'm hitting a memory limit. Sure, I can raise the limit in the config but it would be a better solution to find a more reasonable search for it 🙂

If I try to do

transaction queue_id

Of course everthing works but the transaction joins the to and orig_to fields into multivalued fields and there is no reliable way to "unjoin" them (over one transaction you can have more to values than orig_to and you can't simply do mapping between the values).

So I'm a little stuck how to transform my data to get from, to, orig_to tuples so I can later pass it to stats.

Any hints?

Of course, if nothing works I'll simply raise the limits and do eventstats but it's not a pretty solution.

PickleRick · ‎02-25-2022

Not a very elegant idea but it seems to work quite OK (and it doesn't exceed the limits).

index=postfix (some_additional_limiting_conditions) (to=* OR from=* OR orig_to=*)
| fields queue_id to from orig_to _time
| sort 0 _time
| streamstats current=t window=100 values(from) as from by queue_id
| stats values(from) as from latest(_time) as _time by to orig_to

That's pretty much what I needed all along 🙂

View solution in original post

yuanliu · ‎02-25-2022

I must have missed something here. The standard way to join these field is just values(*) as * by unique ID. Does it apply?

| fields from to orig_to queue_id
| stats values(*) as * by queue_id
| stats values(from) by to orig_to

(BTW, I am curious about the semantic significance of the number. Appreciate an elaboration.)

PickleRick · ‎02-25-2022

The problem is that if I aggregate by values(*), I'll get all to and orig_to values but I lose the connection between them.

Let's say I have it like (different events, same queue_id):

from: a@b.c
to: c@d.e
to: e@f.g, orig_to: g@h.i

After values(to) I wouldn't know which of them relates to the orig_to that we know.

But it seems sorting by time and streamstatsing does the job.

PickleRick · ‎02-24-2022

OK. It seems I can do I thought about streamstats but for that I think I'd have to re-sort "backwards" time-wise - another not very best idea.

PickleRick · ‎02-25-2022

Not a very elegant idea but it seems to work quite OK (and it doesn't exceed the limits).

index=postfix (some_additional_limiting_conditions) (to=* OR from=* OR orig_to=*)
| fields queue_id to from orig_to _time
| sort 0 _time
| streamstats current=t window=100 values(from) as from by queue_id
| stats values(from) as from latest(_time) as _time by to orig_to

That's pretty much what I needed all along 🙂

Summarizing postfix logs

stats

transaction

Index This | Why did the turkey cross the road?

Enter the Agentic Era with Splunk AI Assistant for SPL 1.4

Feel the Splunk Love: Real Stories from Real Customers

Are you a member of the Splunk Community?