I have a transaction query that returns groups of logs that are typically 5-10 events clumped together. The query uses startswith/endswith. I really only care about the first and last event. Is there any way I can pull these out or at least pull out some of the data from a field in the first and a field in the last?
I get the issue here. What transaction does is creates a multivalued field for each available field in merged events and by default the multivalued field is create in the same way as | stats values(field)
, so basically they are sorted alphabetically and duplicate values are removed.
What you need to do is to use mvlist=t
option in your transaction command and it'll retain the order of the field values (similar to | stats list(field)
), then the mvindex portion would work just fine. Like this
transaction query with mvlist=t | eval first_domain=mvindex(destination, 0) | eval last_domain=mvindex(destination, -1) | table user,first_domain,last_domain
Update
Try this for removing duplicates from user (or any field)
transaction query with mvlist=t | eval first_domain=mvindex(destination, 0) | eval last_domain=mvindex(destination, -1) | eval user=mvdedup(user) | table user,first_domain,last_domain
I get the issue here. What transaction does is creates a multivalued field for each available field in merged events and by default the multivalued field is create in the same way as | stats values(field)
, so basically they are sorted alphabetically and duplicate values are removed.
What you need to do is to use mvlist=t
option in your transaction command and it'll retain the order of the field values (similar to | stats list(field)
), then the mvindex portion would work just fine. Like this
transaction query with mvlist=t | eval first_domain=mvindex(destination, 0) | eval last_domain=mvindex(destination, -1) | table user,first_domain,last_domain
Update
Try this for removing duplicates from user (or any field)
transaction query with mvlist=t | eval first_domain=mvindex(destination, 0) | eval last_domain=mvindex(destination, -1) | eval user=mvdedup(user) | table user,first_domain,last_domain
I had been looking over the data and realized the alphabetical order as well right before you posted! Using mvlist=t
it extracts the first and last exactly as intended! One small hiccup is that it lists the user 9 times (once for each log?) in the table. Any ideas on that? This is still something I can accept as the first/last are the most important to me but cleaning it up would be 11/10!
user first_domain last_domain
jondoe alibaba.com malware.com
jondoe
jondoe
jondoe
jondoe
jondoe
jondoe
jondoe
jondoe
That works! I got it to work by using eval and grabbing the first user in the logs, but I think your dedup solution is cleaner. I really appreciate all the help!! I learned about the mv commands now and learned I still have a lot to learn regarding Splunk. Thank you very much.
Below is some sample data of my transaction query (9 events grouped together as one)
9/25 12:00:01 alibaba.com jondoe
9/25 12:00:01 reddit.com jondoe
9/25 12:00:01 www.gmail.com jondoe
...
...
...
...
...
9/25 12:00:01 malware.com jondoe
And here is my what I currently am getting
query:
transaction query | eval first_domain=mvindex(destination, 0) | eval last_domain=mvindex(destination, -1) | table user,first_domain,last_domain
query results:
user first_domain last_domain
jondoe alibaba.com www.google.com
Try this
base search with transaction | eval field_first=mvindex(field, 0) | eval field_last=mvindex(field, mvcount(field)-1)
*UPDATED*
base search | transaction query mvlist=t | eval field_first=mvindex(field, 0) | eval field_last=mvindex(field, -1)
I got it to give me the first website someone goes to, but I can't get it to give me the last. There are 9 total events grouped together in my transaction test. It's giving me:
last_site=from_log 3 (not the last one, log 9)
eval first_site=mvindex(destination, 0) | eval last_site=mvindex(destination, mvcount(dest_hostname)-1) | table first_site,last_site
Any ideas?
You're using wrong field name in the mvcount function. If fact you could just remove that function completely, using just -1 will select the last item.
eval first_site=mvindex(destination, 0) | eval last_site=mvindex(destination,-1) | table first_site,last_site
I tried making the changes, but still getting the same thing. Any idea? I'm at a loss because it would seem to make sense but isn't. I posted some sample data in the original post to better clarify the issue and show what is happening.