[Help] - How to remove duplicated events in this specific scenario

Explorer

Hi Team,

I have the audited user behavior data below from a web UI.

ACT=OPEN_PAGE means the user opened a web UI page. Because of the log audit mechanism:

1. Sometimes two events are returned. The differences between these two events are:

  • One contains "DT=PAGEPERFORMANCE"; the other doesn't.
  • The suffix in the CAID field is different. The first one generated is always CAID=xxxxxxxx-X; the second one generated is always CAID=xxxxxxxx-0.

2. Sometimes only one event is returned, and it can't be foreseen which one: either the one with CAID=xxxxxxxx-X or the one with CAID=xxxxxxxx-0.

1. 2020-08-11 02:46:49,435 DT=PAGEPERFORMANCE  httpsessionID=sid1 UID=userid1 UN=username1 LOC=en_US CAID=8513937907-X  ACT=OPEN_PAGE
2. 2020-08-11 02:46:49,438  httpsessionID=sid1 UID=userid1 UN=username1 LOC=en_US CAID=8513937907-0 ACT=OPEN_PAGE
3. 2020-08-11 02:46:49,467  httpsessionID=sid1 UID=userid1 UN=username1 LOC=en_US CAID=8513937907-1  ACT=SEARCH
4. 2020-08-11 02:46:50,222 DT=PAGEPERFORMANCE  httpsessionID=sid2 UID=userid2 UN=username2 LOC=en_US CAID=1512937904-X ACT=OPEN_PAGE
5. 2020-08-11 02:46:50,333  httpsessionID=sid2 UID=userid2 UN=username2 LOC=en_US CAID=1512937904-0 ACT=START_OVER
6. 2020-08-11 02:46:52,321  httpsessionID=sid3 UID=userid3 UN=username3 LOC=en_US CAID=3903937111-X ACT=OPEN_PAGE
7. 2020-08-11 02:46:52,469  httpsessionID=sid3 UID=userid3 UN=username3 LOC=en_US CAID=3903937111-0 ACT=SEARCH

Question:

1. If there are two events with ACT=OPEN_PAGE (CAID=xxxxxxxx-X and CAID=xxxxxxxx-0), how can I keep the one with 'DT=PAGEPERFORMANCE' and remove the other one?

2. If there is only one event with ACT=OPEN_PAGE, there is no need to remove it.

Expected result (the 2nd event is removed; the remaining events keep their original numbers):

1. 2020-08-11 02:46:49,435 DT=PAGEPERFORMANCE  httpsessionID=sid1 UID=userid1 UN=username1 LOC=en_US CAID=8513937907-X  ACT=OPEN_PAGE
3. 2020-08-11 02:46:49,467  httpsessionID=sid1 UID=userid1 UN=username1 LOC=en_US CAID=8513937907-1  ACT=SEARCH
4. 2020-08-11 02:46:50,222 DT=PAGEPERFORMANCE  httpsessionID=sid2 UID=userid2 UN=username2 LOC=en_US CAID=1512937904-X ACT=OPEN_PAGE
5. 2020-08-11 02:46:50,333  httpsessionID=sid2 UID=userid2 UN=username2 LOC=en_US CAID=1512937904-0 ACT=START_OVER
6. 2020-08-11 02:46:52,321  httpsessionID=sid3 UID=userid3 UN=username3 LOC=en_US CAID=3903937111-X ACT=OPEN_PAGE
7. 2020-08-11 02:46:52,469  httpsessionID=sid3 UID=userid3 UN=username3 LOC=en_US CAID=3903937111-0 ACT=SEARCH

 

Thanks,

Cherie


Ultra Champion

What do you mean by keep and remove?


Explorer

Keep and remove means showing or not showing the event in the query result.


Ultra Champion

your search
| rex field=CAID "(?<subCAID>.*)-"
| dedup subCAID ACT
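
One thing to keep in mind: dedup keeps the first event it encounters for each subCAID/ACT combination, so which event of a pair survives depends on the order of the results. As a rough offline check of the result asked for in the question, here is a minimal Python sketch of the same grouping idea, run against the seven sample events. It is not SPL and not a replacement for the search above; the helper names parse_fields and drop_duplicate_open_page are made up for this illustration, and it assumes every field is a whitespace-free key=value pair as in the samples.

```python
import re
from collections import defaultdict

# The seven sample events from the question, copied verbatim (list numbers dropped).
SAMPLE = [
    "2020-08-11 02:46:49,435 DT=PAGEPERFORMANCE  httpsessionID=sid1 UID=userid1 UN=username1 LOC=en_US CAID=8513937907-X  ACT=OPEN_PAGE",
    "2020-08-11 02:46:49,438  httpsessionID=sid1 UID=userid1 UN=username1 LOC=en_US CAID=8513937907-0 ACT=OPEN_PAGE",
    "2020-08-11 02:46:49,467  httpsessionID=sid1 UID=userid1 UN=username1 LOC=en_US CAID=8513937907-1  ACT=SEARCH",
    "2020-08-11 02:46:50,222 DT=PAGEPERFORMANCE  httpsessionID=sid2 UID=userid2 UN=username2 LOC=en_US CAID=1512937904-X ACT=OPEN_PAGE",
    "2020-08-11 02:46:50,333  httpsessionID=sid2 UID=userid2 UN=username2 LOC=en_US CAID=1512937904-0 ACT=START_OVER",
    "2020-08-11 02:46:52,321  httpsessionID=sid3 UID=userid3 UN=username3 LOC=en_US CAID=3903937111-X ACT=OPEN_PAGE",
    "2020-08-11 02:46:52,469  httpsessionID=sid3 UID=userid3 UN=username3 LOC=en_US CAID=3903937111-0 ACT=SEARCH",
]


def parse_fields(line):
    """Hypothetical helper: return the key=value pairs of one event as a dict."""
    return dict(re.findall(r"(\w+)=(\S+)", line))


def drop_duplicate_open_page(lines):
    """Hypothetical helper: apply the rule from the question.

    OPEN_PAGE events are grouped by the CAID part before the last "-"
    (the same idea as the subCAID field extracted by rex). If a group has
    more than one event and at least one carries DT=PAGEPERFORMANCE,
    the events without it are dropped. Everything else is kept as-is.
    """
    parsed = [parse_fields(line) for line in lines]

    # Map CAID prefix -> indexes of the OPEN_PAGE events sharing it.
    groups = defaultdict(list)
    for idx, fields in enumerate(parsed):
        if fields.get("ACT") == "OPEN_PAGE" and "CAID" in fields:
            prefix = fields["CAID"].rsplit("-", 1)[0]
            groups[prefix].append(idx)

    drop = set()
    for indexes in groups.values():
        with_dt = [i for i in indexes if parsed[i].get("DT") == "PAGEPERFORMANCE"]
        if len(indexes) > 1 and with_dt:
            drop.update(i for i in indexes if i not in with_dt)

    return [line for i, line in enumerate(lines) if i not in drop]


if __name__ == "__main__":
    for event in drop_duplicate_open_page(SAMPLE):
        print(event)
```

On the sample data this drops only the second event (CAID=8513937907-0, ACT=OPEN_PAGE). The sid2 and sid3 sessions each have a single OPEN_PAGE event, so nothing is removed there.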
