Splunk Search

Why are similar rows not grouping together?

scaparelli
Explorer

For some reason there are entries that are not grouped together even though they obviously should be. In the table below, the two rows with serviceTicketId = 00dcfe68-25d8-4c58-9228-5fc8f7ddb9d1 appear as separate rows, while other serviceTicketIds, such as 00c093f4fc527e5ff7006566b1a0fd90, have one row with multiple event times.

[Screenshot of the results table: Screen Shot 2022-08-10 at 3.59.11 PM.png]

Here is my query:


(index=k8s_main "*Published successfully event=[com.nordstrom.customer.event.OrderLineReturnReceived*") OR (index="k8s_main" cluster="nsk-oak-prod" "namespace"=app04096 "*doPost - RequestId*") OR (index=k8s_main container_name=fraud-single-proxy-listener message="Successfully sent payload to kafka topic=order-events-avro*" contextMap.eventType="OrderLineReturnReceived")
| rename contextMap.orderId AS nefiOrderId contextMap.serviceTicketId AS nefiServiceTicketId
| rex field=eventKey "\[(?<omsOrderId>.*)\]"
| rex field=serviceTicketId "\[(?<omsServiceTicketId>.*)\]"
| rex "RequestId:(?<omniServiceTicketId>.*? )"
| rex "\"orderNumber\":\"(?<omniOrderId>.*?)\""
| eval appId = mvappend(container_name, app)
| eval orderId = mvappend(nefiOrderId, omsOrderId, omniOrderId)
| eval serviceTicketId = mvappend(nefiServiceTicketId, omsServiceTicketId, omniServiceTicketId)
| stats dc(_time) AS eventCount values(_time) AS eventTime values(appId) AS app BY serviceTicketId orderId
| eval timeElapsed = now() - eventTime

 

1 Solution

bowesmana
SplunkTrust

This is probably a result of trailing spaces on your split-by fields.

Your rex statement for serviceTicketId is greedy, in that it grabs everything:

| rex field=serviceTicketId "\[(?<omsServiceTicketId>.*)\]"

If you do a final

| eval serviceTicketId=":".serviceTicketId.":"

and the same with orderId, you will see whether there is a leading or trailing space.
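
If it does turn out to be stray whitespace, trimming the two split-by fields just before the stats should let those rows group. A minimal sketch (assuming each event carries a single serviceTicketId and orderId value, so a plain trim() is enough):

...
| eval serviceTicketId = trim(serviceTicketId)
| eval orderId = trim(orderId)
| stats count AS eventCount values(_time) AS eventTime values(appId) AS app BY serviceTicketId orderId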

A couple of other points: dc(_time) will count distinct times, but if you have two events at the same time you will only get 1. You should use 'count' instead.

Also, the timeElapsed calculation will not work with the multivalue eventTime field.

You could do this:

...
| stats count AS eventCount min(_time) as firstEvent values(_time) AS eventTime values(appId) AS app BY serviceTicketId orderId
| eval timeElapsed = now() - firstEvent

or, if you want the elapsed time between the first and last event of a particular ticket, do

...
| stats count AS eventCount range(_time) as timeElapsed values(_time) AS eventTime values(appId) AS app BY serviceTicketId orderId

Hope this helps

 


inventsekar
SplunkTrust

Hi @bowesmana ...

>> Your rex statement for serviceTicketId is greedy, in that it grabs everything

| rex field=serviceTicketId "\[(?<omsServiceTicketId>.*)\]"

 

Yes, that is true... one should almost never use a greedy match (".*").

Now then, what's your suggestion about how we should modify this rex? Would something like the sketch below work? (I think we may need the actual logs to refine the rex, right?)
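
For example, just as a sketch (assuming the id sits between square brackets and never itself contains a closing bracket or spaces), the greedy .* could become a character class that stops at the first ], with any padding whitespace left outside the capture:

| rex field=serviceTicketId "\[\s*(?<omsServiceTicketId>[^\]\s]+)\s*\]"

That way the capture cannot pick up stray spaces inside the brackets.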

thanks and best regards,
Sekar

PS - If this or any post helped you in any way, please consider upvoting. Thanks for reading!