Splunk Enterprise

Duplication of row

sweety1309
Explorer

Hello everyone,I have this query-

index="dpsnapitt" AND (class= "GRADE 12 B" OR class= "GRADE 12 B *") AND (day="DAY 4" OR NOT day=*)
| rename "start time" as start, "end time" as end
| rename class as Class
| rename email as organizer_email
| dedup subject, Class, organizer_email, start
| join type=outer organizer_email
[search index="dpsn_teachers" earliest=0 latest= now
| rename name.fullName as teacher
| rename primaryEmail as organizer_email]
|join type=outer organizer_email max=0
[search index="dpsn_meet" | where email== organizer_email]
| rex field=date "(?<yy>[^\.]*)\-(?<mm>[^\.]*)\-(?<dd>[\S]*)T(?<hh>[^\.]*)\:(?<min>[^\.]*)\:(?<sec>[^\.]*)\."
| eval ndatetime = yy.mm.dd.hh.min.sec
| eval _time=strptime(ndatetime,"%Y%m%d%H%M%S") + 19800
| eval Time=strftime(_time, "%H:%M")
| eval p = strptime(start,"%H:%M")-1020|eval q = strftime(p,"%H:%M")
| eval r = strptime(end,"%H:%M") |eval s = strftime(r,"%H:%M")
| eval meet_start_time = strftime((strptime(Time,"%H:%M")-duration_seconds), "%H:%M")
| eval z = if((meet_start_time >= q AND meet_start_time <= s), 1, 0)
| eval meeting_code = case( z == 1, meeting_code, z==0, "N/A")
| dedup organizer_email, meeting_code, start, end
| table Class, teacher, organizer_email, subject, "start", "end", meeting_code
| sort - period - day | reverse

 

 

If I run this query on 4 Dec 2020,it is producing following result-see image.

Those rows where meeting code is present is getting duplicated with meeting_code "N/A".I dont want that duplicated row.

 

Please help..I would be very thankful.

 

 

 

Labels (1)
0 Karma

nickhills
Ultra Champion

This is probably because the last join returns more than 1 result for each meeting (presumably multiple events during the meeting) Hence you always have a meeting that falls both Inside and outside of your meeting_code case condition, resulting in two statuses. 

Could you use a |stats latest(end) to only bring in the last result of "end"?

...[search index="dpsn_meet" | stats latest(end) as end latest(start) as start latest(date) as date by email| where email== organizer_email]|...
If my comment helps, please give it a thumbs up!
0 Karma

sweety1309
Explorer

But now I am not getting the meeting_code which I was getting earlier in this case.U can have a look at the image.

0 Karma

nickhills
Ultra Champion

The problem is here:

 

| eval z = if((meet_start_time >= q AND meet_start_time <= s), 1, 0)
| eval meeting_code = case( z == 1, meeting_code, z==0, "N/A")

 

For any meeting which has multiple events that match both values of Z, you will get duplicates.

You could change the last dedup command to:

 

| dedup organizer_email, start, end

 

Which may solve your problem in most cases*, but you might be better off using a stats latest command to only bring in a single value to the join as per my suggestion above.

* I would expect that meetings in progress would show N/A,  meetings that have completed to show the meeting code

If my comment helps, please give it a thumbs up!
0 Karma

sweety1309
Explorer

Can u help me with any alternate options ..I need the meeting_code..Please help me with this.I am stuck here

0 Karma

sweety1309
Explorer

But meeting_code is all I need here.

0 Karma

sweety1309
Explorer

Please help me

0 Karma
Get Updates on the Splunk Community!

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...

New in Observability Cloud - Explicit Bucket Histograms

Splunk introduces native support for histograms as a metric data type within Observability Cloud with Explicit ...