Scenario:
I am searching email event logs. I can find some of the needed fields by one unique ID (UID) and others by a different unique ID (X-UID). Some events contain both UID and X-UID, but not all the fields I need.
Here is a sample of the code:
[search index=mail sourcetype=xemail subject = "Blah" |stats count by UID| fields UID]
|stats list(subject) as subj list(sender) as sender list(recipient) as recp list(vendor_action) as status by UID
[search index=mail sourcetype=xemail sender = "sender@domain.com" |stats count by XUID| fields XUID]
|stats list(dest) as dest_ip list(sender) by XUID
Ultimately I would like results to show
subj sender recp status dest_ip
Thank you
Re your long comment on the question: That's exactly what transaction does, even spanning multiple chained ID fields.
Here's a working example:
| stats count as raw | eval raw = "subject=foo uid=123
subject=foo2 uid=321
sender=bar2 uid=321 xuid=cba
sender=bar uid=123 xuid=abc
recipient=baz xuid=abc
recipient=baz2 xuid=cba" | makemv delim="
" raw | mvexpand raw | rename raw as _raw | extract
| eval _time = time()-(random()%1000) | sort - _time
| transaction uid xuid | table _time duration eventcount subject sender recipient uid xuid
Make sure you keep the line breaks as they are here, that's important for this dirty kind of dummy data generation from within the search bar.
First I set up six events, three events per email, each event containing only one "email-y" field.
The events for subject and sender are tied together with uid, the events for sender and recipient are tied together with xuid, and the event for sender ties together the uid and the xuid giving you a nice transitive transaction.
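To illustrate what "transitive" means here, the following is a minimal Python sketch of the grouping idea. It is not how Splunk implements transaction internally; it only mimics the outcome on the six dummy events above. The function name group_events and the dict-per-event format are assumptions made for the illustration. Two events end up in the same group if they share a uid or an xuid value, either directly or through a chain of other events.

```python
def group_events(events):
    """Group events linked, directly or transitively, by a shared uid or xuid."""
    # Union-find: parent[i] points towards the representative of event i's group.
    parent = list(range(len(events)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    def union(a, b):
        parent[find(a)] = find(b)

    # Link every event to the first event seen with the same (field, value).
    first_seen = {}
    for idx, event in enumerate(events):
        for field in ("uid", "xuid"):
            if field in event:
                key = (field, event[field])
                if key in first_seen:
                    union(idx, first_seen[key])
                else:
                    first_seen[key] = idx

    # Merge the fields of all events in a group into one record.
    groups = {}
    for idx, event in enumerate(events):
        merged = groups.setdefault(find(idx), {})
        for field, value in event.items():
            merged.setdefault(field, value)
    return list(groups.values())


# The same six dummy events as in the search above.
events = [
    {"subject": "foo", "uid": "123"},
    {"subject": "foo2", "uid": "321"},
    {"sender": "bar2", "uid": "321", "xuid": "cba"},
    {"sender": "bar", "uid": "123", "xuid": "abc"},
    {"recipient": "baz", "xuid": "abc"},
    {"recipient": "baz2", "xuid": "cba"},
]

transactions = group_events(events)
for t in transactions:
    print(t)
```

The recipient events carry no uid at all, yet they join the right group because the sender event bridges uid and xuid.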
If you want to search for subject, sender, etc before building the transaction you can either do that manually:
The good case: You have an event with the field to filter by (say, sender) and both ID fields.
index=mail sourcetype=xemail
[ search index=mail sourcetype=xemail sender=foo | fields UID XUID | dedup UID XUID | format "(" "(" "OR" ")" "OR" ")" ]
| transaction UID XUID
That will search for events matching your sender and use the UID and XUID field to search all potential matches beyond the "sender-event", then build the transaction from there.
The bad case, #1: You have an event with the field to filter by, but only the UID field.
index=mail sourcetype=xemail
[ search index=mail sourcetype=xemail
[ search index=mail sourcetype=xemail sender=foo | fields UID | dedup UID ]
| fields UID XUID | dedup UID XUID | format "(" "(" "OR" ")" "OR" ")" ]
| transaction UID XUID
The innermost subsearch will go in, search for your sender, and come back with a list of UIDs. Those are inserted into the outer subsearch, that will go and retrieve all events with that UID - some of those will have the missing XUID! From there it proceeds like the good case above, using UID or XUID to collect together all relevant events and run the transaction.
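The three passes described above can be sketched in plain Python to make the data flow explicit. This is only a model of the logic, not of how Splunk executes subsearches; the function name expand_and_collect and the sample events are made up for the illustration.

```python
def expand_and_collect(events, sender):
    """Model of the nested subsearches for bad case #1 (sender event has only a UID)."""
    # Pass 1 (innermost subsearch): UIDs of the events that match the sender.
    uids = {e["UID"] for e in events if e.get("sender") == sender and "UID" in e}
    # Pass 2 (outer subsearch): events with those UIDs may also carry the missing XUID.
    xuids = {e["XUID"] for e in events if e.get("UID") in uids and "XUID" in e}
    # Pass 3 (main search): every event matching any collected UID or XUID.
    return [e for e in events if e.get("UID") in uids or e.get("XUID") in xuids]


# Hypothetical events: the sender event has a UID only, a second event
# bridges UID and XUID, and the dest event has an XUID only.
events = [
    {"sender": "foo", "UID": "1"},
    {"subject": "hello", "UID": "1", "XUID": "a"},
    {"dest": "10.0.0.1", "XUID": "a"},
    {"sender": "other", "UID": "2"},
    {"subject": "x", "UID": "2", "XUID": "b"},
    {"dest": "10.0.0.2", "XUID": "b"},
]

matched = expand_and_collect(events, "foo")
for e in matched:
    print(e)
```

Only the three events belonging to the email sent by "foo" come back; those are what the final transaction command would then stitch together.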
The bad case, #2: You have an event with the field to filter by, but only the XUID field.
This works like bad case #1, but you need to add two Xs to the innermost subsearch, turning both occurrences of UID there into XUID. So when filtering, you need to know which of the two bad cases to run.
Even though the two bad cases mean you have to go through your index thrice, each run should be a fairly rare search. This will be slower if you search for a sender that sent 90% of all emails, mind.
Instead of building those subsearch-monsters manually, there is a much-forgotten search command searchtxn to do just that for you (I think, don't have data handy to actually test).
To use that, you first have to set up a transaction type in transactiontypes.conf like this:
[xemail]
fields = UID, XUID
search = index=mail sourcetype=xemail
To confirm that this type works, run a regular non-filtering search with | transaction name=xemail and see that it returns the same things as manually specifying the fields. Once that's done, run this with nothing else in the search bar:
| searchtxn xemail sender=foo
That should collect together all the required IDs and neatly return only matching transactions without scanning everything.
http://docs.splunk.com/Documentation/Splunk/6.3.3/SearchReference/searchtxn
One caveat about searchtxn: it's not going to honour your time range picker.
Hi Martin,
Thank you for assisting with this.
The email logs I analyze contain multiple events per session, and the fields I want need to be correlated across all the events related to that email session. For example, the sender is in one event, the subject is in a different event, and the recipient is in yet another, but they all share the same UID field value. That is the reason for this part of the code.
[search index=mail sourcetype=xemail subject = "Blah" |stats count by UID| fields UID]
|stats list(subject) as subj list(sender) as sender list(recipient) as recp list(vendor_action) as status by UID
Now I have other field values that I need from other, separate events that share an XUID field value. The trick is trying to correlate all events that share a UID value and an XUID value. I have not found the right key. There are events that contain both the UID and XUID fields, but I have not figured out how to grab all the fields from that event.
I hope that makes sense, if not I will try to mock up some event logs.
Thank you
Try this - I don't think it is exactly what you want, but it should be closer
index=mail sourcetype=xemail
| stats values(subject) as subject values(sender) as sender values(recipient) as recipient
values(vendor_action) as status values(dest) as dest_ip by UID
| append [ search index=mail sourcetype=xemail
| stats values(subject) as subject values(sender) as sender values(recipient) as recipient
values(vendor_action) as status values(dest) as dest_ip by XUID ]
Your original search seems overcomplicated.
Thank you for the suggestion, I will give it a try.
Hi Lisa,
To explain the apparent "over-complication": the fields I want are in separate events, so I need a key (e.g. UID) to correlate all the events from a specific email session, giving me results with subj, sender, recipient, etc. for the same UID. If you know a better way to accomplish that, I will definitely try it.
So with this question, specifically some fields I want are in events with UID and others are in events with XUID. Therefore I need to find the key (e.g. sender) that correlates to the other events without XUID.
It's complicated because the events don't contain all the fields I need.
I hope that makes sense.
Some sample data would help.
I made a mistake: the key is "sender".
The logs I need to join/correlate will have the sender in common.
For example
Events with UID contain the fields subject, sender, recipient, vendor_action
Events with XUID contain the fields sender, dest
Unfortunately I can't release the actual data.
If you could update your question with just 5 or 10 anonymized/sample events to illustrate what you have and what you want to end up with, that would really help immensely.