Using streamstats and mvmap.
| makeresults
| eval data=split("2022-08-15 00:00:00 user@example.com SpamPolicyName A.com;B.com;C.com,2022-08-10 00:00:00 user@example.com SpamPolicyName A.com;B.com", ",")
| mvexpand data
| rex field=data "(?<t>.{19}) (?<email>[^ ]*) (?<domains>.*)"
| eval _time=strptime(t, "%F %T")
| table _time email domains
``` Now here is what you need ```
``` Sort thenm in ascending time order ```
| sort _time
| eval domains=split(domains, ";")
``` Now copy previous date to later date ```
| streamstats window=1 current=f values(domains) as prev_domains
``` and remove 'first' event with no previous ```
| where isnotnull(prev_domains)
``` Now find additions ```
| eval additions=mvmap(domains, if(isnull(mvfind(prev_domains, domains)), domains, null()))
``` this will also find removals ```
| eval removals=mvmap(prev_domains, if(isnull(mvfind(domains, prev_domains)), prev_domains, null()))
First part sets up your example. Note that this also finds any 'removals'. i.e. where latest data does NOT have domains from earlier date.
Using streamstats and mvmap.
| makeresults
| eval data=split("2022-08-15 00:00:00 user@example.com SpamPolicyName A.com;B.com;C.com,2022-08-10 00:00:00 user@example.com SpamPolicyName A.com;B.com", ",")
| mvexpand data
| rex field=data "(?<t>.{19}) (?<email>[^ ]*) (?<domains>.*)"
| eval _time=strptime(t, "%F %T")
| table _time email domains
``` Now here is what you need ```
``` Sort thenm in ascending time order ```
| sort _time
| eval domains=split(domains, ";")
``` Now copy previous date to later date ```
| streamstats window=1 current=f values(domains) as prev_domains
``` and remove 'first' event with no previous ```
| where isnotnull(prev_domains)
``` Now find additions ```
| eval additions=mvmap(domains, if(isnull(mvfind(prev_domains, domains)), domains, null()))
``` this will also find removals ```
| eval removals=mvmap(prev_domains, if(isnull(mvfind(domains, prev_domains)), prev_domains, null()))
First part sets up your example. Note that this also finds any 'removals'. i.e. where latest data does NOT have domains from earlier date.