Splunk Search

Splunk Dedup by _time and Combine Values Simultaneously

cameronwt
Engager

I am working with Exchange 2010 data. I have the MessageID, Sender, Recipients, and _time. Depending on the event type, recipients can be split (i.e. all recipients for a given message are not included in the event, but are split across multiple events). Here is an example of the data:

_time    MessageID    Sender    Recipients
4:25      <12345>     Sender1   Recipient1
4:50      <12345>     Sender1   Recipient2

I use this query to combine multiple Recipients values into one event, by MessageID and Sender:

index=ExchangeIndex sourcetype=MSExchange:2010:Message tracking
| stats values(Recipients) as recip by MessageID, Sender
| table _time, MessageID, Sender, recip

This results in:

_time     MessageID   Sender    recip
          <12345>     Sender1   Recipient1; Recipient2

_time is not populated, because values wouldn't work if _time was included in the by statement since each event for a particular MessageID happens at a different time. Therefore, I need to somehow dedup MessageID (to get the latest _time) and stuff Recipients values simultaneously.

I attempted this:

| stats max(_time) as datetime by MessageID, Sender
| stats values(Recipients) as recip by datetime, MessageID, Sender
| table datetime, MessageID, Sender, recip

But this doesn't work either, and it results in recip not being populated:

_time    MessageID    Sender    recip
4:50      <12345>     Sender1   

My desired output looks like this:

_time     MessageID   Sender    recip
4:50      <12345>     Sender1   Recipient1; Recipient2

How should I achieve that?

0 Karma
1 Solution

somesoni2
Revered Legend

You can just include _time in your aggregation. Say you want to retain _time of the first event for that MessageID, try like this

 index=ExchangeIndex sourcetype=MSExchange:2010:Message tracking
 | stats min(_time) as _time values(Recipients) as recip by MessageID, Sender
 | table _time, MessageID, Sender, recip

You can also use max(_time) to get latest event's timestamp.

View solution in original post

somesoni2
Revered Legend

You can just include _time in your aggregation. Say you want to retain _time of the first event for that MessageID, try like this

 index=ExchangeIndex sourcetype=MSExchange:2010:Message tracking
 | stats min(_time) as _time values(Recipients) as recip by MessageID, Sender
 | table _time, MessageID, Sender, recip

You can also use max(_time) to get latest event's timestamp.

Get Updates on the Splunk Community!

Built-in Service Level Objectives Management to Bridge the Gap Between Service & ...

Wednesday, May 29, 2024  |  11AM PST / 2PM ESTRegister now and join us to learn more about how you can ...

Get Your Exclusive Splunk Certified Cybersecurity Defense Engineer Certification at ...

We’re excited to announce a new Splunk certification exam being released at .conf24! If you’re headed to Vegas ...

Share Your Ideas & Meet the Lantern team at .Conf! Plus All of This Month’s New ...

Splunk Lantern is Splunk’s customer success center that provides advice from Splunk experts on valuable data ...