[Help] - How to remove duplicated events in this specific scenario

cheriemilk
Path Finder

Hi Team,

I have the user-behavior audit data below, collected from a web UI.

ACT=OPEN_PAGE means that a user opened a web UI page. Because of the way the audit logging works:

1. Sometimes two events are returned. The differences between the two events are:

  • One contains "DT=PAGEPERFORMANCE"; the other doesn't.
  • The sub-number in the CAID field is different. The first one generated is always CAID=xxxxxxxx-X, and the second is always CAID=xxxxxxxx-0.

2. Sometimes only one event is returned, and it can't be predicted which one: either the one with CAID=xxxxxxxx-X or the one with CAID=xxxxxxxx-0.

1. 2020-08-11 02:46:49,435 DT=PAGEPERFORMANCE  httpsessionID=sid1 UID=userid1 UN=username1 LOC=en_US CAID=8513937907-X  ACT=OPEN_PAGE
2. 2020-08-11 02:46:49,438  httpsessionID=sid1 UID=userid1 UN=username1 LOC=en_US CAID=8513937907-0 ACT=OPEN_PAGE
3. 2020-08-11 02:46:49,467  httpsessionID=sid1 UID=userid1 UN=username1 LOC=en_US CAID=8513937907-1  ACT=SEARCH
4. 2020-08-11 02:46:50,222 DT=PAGEPERFORMANCE  httpsessionID=sid2 UID=userid2 UN=username2 LOC=en_US CAID=1512937904-X ACT=OPEN_PAGE
5. 2020-08-11 02:46:50,333  httpsessionID=sid2 UID=userid2 UN=username2 LOC=en_US CAID=1512937904-0 ACT=START_OVER
6. 2020-08-11 02:46:52,321  httpsessionID=sid3 UID=userid3 UN=username3 LOC=en_US CAID=3903937111-X ACT=OPEN_PAGE
7. 2020-08-11 02:46:52,469  httpsessionID=sid3 UID=userid3 UN=username3 LOC=en_US CAID=3903937111-0 ACT=SEARCH

Question:

1. If there are two events with ACT=OPEN_PAGE (CAID=xxxxxxxx-X and CAID=xxxxxxxx-0), how can I keep the one with 'DT=PAGEPERFORMANCE' and remove the other one?

2. If there is only one event with ACT=OPEN_PAGE, it doesn't need to be removed.

The expected result looks like this (the 2nd event from the list above is removed):

1. 2020-08-11 02:46:49,435 DT=PAGEPERFORMANCE  httpsessionID=sid1 UID=userid1 UN=username1 LOC=en_US CAID=8513937907-X  ACT=OPEN_PAGE
2. 2020-08-11 02:46:49,467  httpsessionID=sid1 UID=userid1 UN=username1 LOC=en_US CAID=8513937907-1  ACT=SEARCH
3. 2020-08-11 02:46:50,222 DT=PAGEPERFORMANCE  httpsessionID=sid2 UID=userid2 UN=username2 LOC=en_US CAID=1512937904-X ACT=OPEN_PAGE
4. 2020-08-11 02:46:50,333  httpsessionID=sid2 UID=userid2 UN=username2 LOC=en_US CAID=1512937904-0 ACT=START_OVER
5. 2020-08-11 02:46:52,321  httpsessionID=sid3 UID=userid3 UN=username3 LOC=en_US CAID=3903937111-X ACT=OPEN_PAGE
6. 2020-08-11 02:46:52,469  httpsessionID=sid3 UID=userid3 UN=username3 LOC=en_US CAID=3903937111-0 ACT=SEARCH

 

Thanks,

Cherie


to4kawa
Ultra Champion

What do you mean by "keep" and "remove"?


cheriemilk
Path Finder

"Keep" and "remove" mean showing or not showing the event in the query results.


to4kawa
Ultra Champion

your search
| rex field=CAID "(?<subCAID>.*)-"
| dedup subCAID ACT
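
One caveat with the search above: dedup keeps the first event it encounters for each subCAID/ACT combination, and with the default newest-first result order that would typically be the later CAID=xxxxxxxx-0 event rather than the one carrying DT=PAGEPERFORMANCE. Below is a minimal sketch that prefers the DT=PAGEPERFORMANCE event explicitly. It assumes that DT, CAID and ACT are already extracted as fields and that every event has an ACT value; the field names has_dt and group_has_dt are made up for illustration.

```spl
your search
| rex field=CAID "^(?<subCAID>[^-]+)-"
| eval has_dt=if(coalesce(DT, "")=="PAGEPERFORMANCE", 1, 0)
| eventstats max(has_dt) AS group_has_dt BY subCAID ACT
| where ACT!="OPEN_PAGE" OR has_dt==group_has_dt
| fields - has_dt group_has_dt subCAID
```

The eval flags events that carry DT=PAGEPERFORMANCE, and eventstats records whether any event in the same subCAID/ACT group has that flag. The where clause then drops an OPEN_PAGE event only when it lacks the flag while another event in its group has it, so a lone OPEN_PAGE event is kept either way. If a group contained several OPEN_PAGE events and none had DT=PAGEPERFORMANCE, this sketch would keep all of them.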
