How can I write the following to get past the join limitation?
index=aws eventName=TerminateInstances
| Rename "requestParameters.instancesSet.items{}.instanceId" AS vm_id
| join vm_id type=left max=0
[ search index=aws source="us-west-1:ec2_instances" sourcetype="aws:description" ]
| dedup vm_id
| table _time, action, vm_id, tags.Name, userName
Hi @sb01splunk,
join is a very slow command with the limit of 50,000 results in the subsearch, so I hint to see this approach using stats command:
index=aws (eventName=TerminateInstances OR (source="us-west-1:ec2_instances" sourcetype="aws:description"))
| eval status=if(eventName=TerminateInstances,"1","2"),
vm_id=coalesce(vm_id,"requestParameters.instancesSet.items{}.instanceId")
| stats
dc(status) AS dc_status
values(status) AS status
earliest(_time) AS _time
values(action) AS action
values(tags.Name) AS tags.Name
values(userNa) AS userNa
BY vm_id
| where dc_status=1 AND status="1"
| table _time action vm_id tags.Name userName
Ciao.
Giuseppe
Depending on how many events you have in the first search, you could try swapping them.
index=aws source="us-west-1:ec2_instances" sourcetype="aws:description"
| join vm_id type=left max=0
[ search index=aws eventName=TerminateInstances
| rename "requestParameters.instancesSet.items{}.instanceId" AS vm_id ]
| dedup vm_id
| table _time, action, vm_id, tags.Name, userName
Or, you could try using stats to do the join, something like this
index=aws (eventName=TerminateInstances OR source="us-west-1:ec2_instances" sourcetype="aws:description")
| rename "requestParameters.instancesSet.items{}.instanceId" AS vm_id
| eval _event_time=if(eventNaame="TerminateInstances", _time, null())
| stats values(_event_time) as _event_time, values(_event_time) as _event_time, values(action) as action, values('tags.Name') as tags.Name, values(userName) as userName by vm_id
| eval _time=coalesce(_event_time,_time)
Thanks for getting me in the right direction. The following seems to work
(index=aws sourcetype="aws:cloudtrail" eventName=TerminateInstances) OR (index=aws source="us-west-1:ec2_instances" sourcetype="aws:description")
| eval joiner=if(sourcetype="aws:cloudtrail", 'requestParameters.instancesSet.items{}.instanceId', vm_id)
| eval Time=if(eventName="TerminateInstances", _time, null())
| eval Time=strftime(Time, "%Y-%d-%m %H:%M:%S")
| stats values(*) as * by joiner
| where 'requestParameters.instancesSet.items{}.instanceId'=vm_id
| table Time, userName, action, vm_id, "tags.Name"
if I try something like this though it returns nothing
(index=aws sourcetype="aws:cloudtrail" eventName=TerminateInstances) OR (index=aws source="us-west-1:ec2_instances" sourcetype="aws:description")
| eval joiner=if(sourcetype="aws:cloudtrail", 'requestParameters.instancesSet.items{}.instanceId', vm_id)
| eval Time=if(eventName="TerminateInstances", _time, null())
| eval Time=strftime(Time, "%Y-%d-%m %H:%M:%S")
| stats values(Time) as Time, values(userName) as userName, values(action) as action, values(vm_id) as vm_id, values('tags.Name') as tags.Name by joiner
| where 'requestParameters.instancesSet.items{}.instanceId'=vm_id
| table Time, userName, action, vm_id, "tags.Name"
I'm curious why the 2nd one returns nothing.
requestParameters.instancesSet.items{}.instanceId isn't included in the stats command in the second version so is not available for the where command