Splunk Search

Summarizing postfix logs

PickleRick
Ultra Champion

I'm not that bad in searching 😉 but this case is a little over my head and I need some clever idea.

I have postfix logs. They have three types of events. All events have queue_id field which identifies the message. The events have either from, to or (to and orig_to) fields set.

I want to do

stats values(from) by to orig_to

The problem is that the fields are in separate events. And both methods of joining them together are faulted in one way or another.

If I do

eventstats values(from) by queue_id

 I get the desired result but if I search over a longer timespan I'm hitting a memory limit. Sure, I can raise the limit in the config but it would be a better solution to find a more reasonable search for it 🙂

If I try to do

transaction queue_id

Of course everthing works but the transaction joins the to and orig_to fields into multivalued fields and there is no reliable way to "unjoin" them (over one transaction you can have more to values than orig_to and you can't simply do mapping between the values).

So I'm a little stuck how to transform my data to get from, to, orig_to tuples so I can later pass it to stats.

Any hints?

Of course, if nothing works I'll simply raise the limits and do eventstats but it's not a pretty solution.

Labels (2)
Tags (2)
0 Karma
1 Solution

PickleRick
Ultra Champion

Not a very elegant idea but it seems to work quite OK (and it doesn't exceed the limits).

index=postfix (some_additional_limiting_conditions) (to=* OR from=* OR orig_to=*)
| fields queue_id to from orig_to _time
| sort 0 _time
| streamstats current=t window=100 values(from) as from by queue_id
| stats values(from) as from latest(_time) as _time by to orig_to

That's pretty much what I needed all along 🙂

View solution in original post

0 Karma

yuanliu
SplunkTrust
SplunkTrust

I must have missed something here.  The standard way to join these field is just values(*) as * by unique ID.  Does it apply?

| fields from to orig_to queue_id
| stats values(*) as * by queue_id
| stats values(from) by to orig_to

 (BTW, I am curious about the semantic significance of the number.  Appreciate an elaboration.)

PickleRick
Ultra Champion

The problem is that if I aggregate by values(*), I'll get all to and orig_to values but I lose the connection between them.

Let's say I have it like (different events, same queue_id):

from: a@b.c
to: c@d.e
to: e@f.g, orig_to: g@h.i

After values(to) I wouldn't know which of them relates to the orig_to that we know.

But it seems sorting by time and streamstatsing does the job.

0 Karma

PickleRick
Ultra Champion

OK. It seems I can do I thought about streamstats but for that I think I'd have to re-sort "backwards" time-wise - another not very best idea.

0 Karma

PickleRick
Ultra Champion

Not a very elegant idea but it seems to work quite OK (and it doesn't exceed the limits).

index=postfix (some_additional_limiting_conditions) (to=* OR from=* OR orig_to=*)
| fields queue_id to from orig_to _time
| sort 0 _time
| streamstats current=t window=100 values(from) as from by queue_id
| stats values(from) as from latest(_time) as _time by to orig_to

That's pretty much what I needed all along 🙂

0 Karma
Get Updates on the Splunk Community!

Splunk Forwarders and Forced Time Based Load Balancing

Splunk customers use universal forwarders to collect and send data to Splunk. A universal forwarder can send ...

NEW! Log Views in Splunk Observability Dashboards Gives Context From a Single Page

Today, Splunk Observability releases log views, a new feature for users to add their logs data from Splunk Log ...

Last Chance to Submit Your Paper For BSides Splunk - Deadline is August 12th!

Hello everyone! Don't wait to submit - The deadline is August 12th! We have truly missed the community so ...