Splunk Search

Summarizing postfix logs

PickleRick
SplunkTrust

I'm not that bad at searching 😉 but this case is a little over my head and I need a clever idea.

I have postfix logs containing three types of events. All events have a queue_id field which identifies the message. Each event has either the from field, the to field, or both the to and orig_to fields set.
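For illustration, the three event types look roughly like this (hypothetical, simplified events; the addresses and queue id are made up):

queue_id=4F1D2A from=a@b.c
queue_id=4F1D2A to=c@d.e
queue_id=4F1D2A to=e@f.g orig_to=g@h.i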

I want to do

stats values(from) by to orig_to

The problem is that the fields are in separate events, and both methods I know of for joining them are flawed in one way or another.

If I do

eventstats values(from) by queue_id

I get the desired result, but if I search over a longer timespan I hit a memory limit. Sure, I can raise the limit in the config, but it would be better to find a more reasonable search 🙂
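For reference, the memory ceiling that eventstats runs into is governed by max_mem_usage_mb in limits.conf, so raising it would look something like this (a sketch only; the value shown is arbitrary, and you should check the limits.conf documentation for your Splunk version before changing it):

# limits.conf
[default]
# Maximum memory (in MB) a search process may use for in-memory
# aggregation before commands like eventstats hit the limit
max_mem_usage_mb = 1000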

If I try to do

transaction queue_id

Of course everything works, but transaction merges the to and orig_to fields into multivalue fields and there is no reliable way to "unjoin" them (within one transaction you can have more to values than orig_to values, so you can't simply map between the values).

So I'm a little stuck on how to transform my data into (from, to, orig_to) tuples that I can later pass to stats.

Any hints?

Of course, if nothing works I'll simply raise the limits and do eventstats but it's not a pretty solution.

1 Solution

PickleRick
SplunkTrust

Not a very elegant idea, but it seems to work quite well (and it doesn't exceed the limits).

index=postfix (some_additional_limiting_conditions) (to=* OR from=* OR orig_to=*)
| fields queue_id to from orig_to _time
| sort 0 _time
| streamstats current=t window=100 values(from) as from by queue_id
| stats values(from) as from latest(_time) as _time by to orig_to

That's pretty much what I needed all along 🙂


yuanliu
SplunkTrust

I must have missed something here.  The standard way to join these fields is just values(*) as * by the unique ID.  Does that apply here?

| fields from to orig_to queue_id
| stats values(*) as * by queue_id
| stats values(from) by to orig_to

(BTW, I am curious about the semantic significance of the number.  I'd appreciate an elaboration.)

PickleRick
SplunkTrust

The problem is that if I aggregate with values(*), I get all the to and orig_to values, but I lose the connection between them.

Let's say I have it like (different events, same queue_id):

from: a@b.c
to: c@d.e
to: e@f.g, orig_to: g@h.i

After values(to), I wouldn't know which of the to values relates to the orig_to value that we know.

But it seems that sorting by time and running streamstats does the job.
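The behaviour is easy to check on synthetic data. A minimal sketch (assuming a Splunk version recent enough to support makeresults format=csv; the queue id, timestamps and addresses are made up):

| makeresults format=csv data="_time,queue_id,from,to,orig_to
1,Q1,a@b.c,,
2,Q1,,c@d.e,
3,Q1,,e@f.g,g@h.i"
| sort 0 _time
| streamstats current=t window=100 values(from) as from by queue_id
| stats values(from) as from by to orig_to

This should produce one row per (to, orig_to) pair, each carrying the from value seen earlier for the same queue_id, instead of one flattened multivalue row per queue_id.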


PickleRick
SplunkTrust

OK. It seems I can do it. I had thought about streamstats, but I assumed that for it I'd have to re-sort the events "backwards" time-wise first, which is also not a great idea.
