Splunk Search

Splunk Dedup by _time and Combine Values Simultaneously

Engager

I am working with Exchange 2010 data. I have the MessageID, Sender, Recipients, and _time. Depending on the event type, recipients can be split (i.e. all recipients for a given message are not included in the event, but are split across multiple events). Here is an example of the data:

_time    MessageID    Sender    Recipients
4:25      <12345>     Sender1   Recipient1
4:50      <12345>     Sender1   Recipient2

I use this query to combine multiple Recipients values into one event, by MessageID and Sender:

index=ExchangeIndex sourcetype=MSExchange:2010:Message tracking
| stats values(Recipients) as recip by MessageID, Sender
| table _time, MessageID, Sender, recip

This results in:

_time     MessageID   Sender    recip
          <12345>     Sender1   Recipient1; Recipient2

_time is not populated, because values wouldn't work if _time was included in the by statement since each event for a particular MessageID happens at a different time. Therefore, I need to somehow dedup MessageID (to get the latest _time) and stuff Recipients values simultaneously.

I attempted this:

| stats max(_time) as datetime by MessageID, Sender
| stats values(Recipients) as recip by datetime, MessageID, Sender
| table datetime, MessageID, Sender, recip

But this doesn't work either, and it results in recip not being populated:

_time    MessageID    Sender    recip
4:50      <12345>     Sender1   

My desired output looks like this:

_time     MessageID   Sender    recip
4:50      <12345>     Sender1   Recipient1; Recipient2

How should I achieve that?

0 Karma
1 Solution

Revered Legend

You can just include _time in your aggregation. Say you want to retain _time of the first event for that MessageID, try like this

 index=ExchangeIndex sourcetype=MSExchange:2010:Message tracking
 | stats min(_time) as _time values(Recipients) as recip by MessageID, Sender
 | table _time, MessageID, Sender, recip

You can also use max(_time) to get latest event's timestamp.

View solution in original post

Revered Legend

You can just include _time in your aggregation. Say you want to retain _time of the first event for that MessageID, try like this

 index=ExchangeIndex sourcetype=MSExchange:2010:Message tracking
 | stats min(_time) as _time values(Recipients) as recip by MessageID, Sender
 | table _time, MessageID, Sender, recip

You can also use max(_time) to get latest event's timestamp.

View solution in original post

Don’t Miss Global Splunk
User Groups Week!

Free LIVE events worldwide 2/8-2/12
Connect, learn, and collect rad prizes
and swag!