Splunk Search

How to find the last regex match for a multi-valued field in a transaction

responsys_cm
Builder

We're finding that when large files are downloaded from the Internet, the application whitelisting client reports a "new file" with a different hash multiple times as the download completes.

I essentially want to dedup the events by host, file, path but only over a very limited time window (like 5 minutes). If I use transaction to group those events and it puts the values of the file hash in a multi-value field (in time order, not sort order I believe), how do I extract just that last hash?

Or is there some way to combine dedup and bins or something like that.

Thanks.

C

Tags (2)
0 Karma
1 Solution

Ayn
Legend

How about using bucket (aka bin) and stats instead?

... | bucket _time span=5m | stats last(hash) by _time

If you still want to go the transaction / mvfield route you could probably reach some success by using eval's mvindex function (an index of "-1" returns the last item in the list).

View solution in original post

Ayn
Legend

How about using bucket (aka bin) and stats instead?

... | bucket _time span=5m | stats last(hash) by _time

If you still want to go the transaction / mvfield route you could probably reach some success by using eval's mvindex function (an index of "-1" returns the last item in the list).

View solution in original post

kristian_kolb
Ultra Champion

over all events. But if you do a

... | dedup hash _time | ...

you'll dedup the combination of the fields, so in this case you'll get one hash per bucket of time.

/K

0 Karma

responsys_cm
Builder

Thanks, Ayn! Quick question, if I run dedup after the bucket command, will Splunk only dedup events in each bucket or will it dedup over all events?

0 Karma
Did you miss .conf21 Virtual?

Good news! The event's keynotes and many of its breakout sessions are now available online, and still totally FREE!