Splunk Search

Why are similar rows not grouping together?

scaparelli
Explorer

For some reason there are entries that are not grouped together even though they obviously should be. In the table below, the two rows with serviceTicketId = 00dcfe68-25d8-4c58-9228-5fc8f7ddb9d1 appear as separate rows, while other serviceTicketIds, such as 00c093f4fc527e5ff7006566b1a0fd90, have one row with multiple event times.

[Screenshot of the results table: Screen Shot 2022-08-10 at 3.59.11 PM.png]

Here is my query:


(index=k8s_main "*Published successfully event=[com.nordstrom.customer.event.OrderLineReturnReceived*") OR (index="k8s_main" cluster="nsk-oak-prod" "namespace"=app04096 "*doPost - RequestId*") OR (index=k8s_main container_name=fraud-single-proxy-listener message="Successfully sent payload to kafka topic=order-events-avro*" contextMap.eventType="OrderLineReturnReceived")
| rename contextMap.orderId AS nefiOrderId contextMap.serviceTicketId AS nefiServiceTicketId
| rex field=eventKey "\[(?<omsOrderId>.*)\]"
| rex field=serviceTicketId "\[(?<omsServiceTicketId>.*)\]"
| rex "RequestId:(?<omniServiceTicketId>.*? )"
| rex "\"orderNumber\":\"(?<omniOrderId>.*?)\""
| eval appId = mvappend(container_name, app)
| eval orderId = mvappend(nefiOrderId, omsOrderId, omniOrderId)
| eval serviceTicketId = mvappend(nefiServiceTicketId, omsServiceTicketId, omniServiceTicketId)
| stats dc(_time) AS eventCount values(_time) AS eventTime values(appId) AS app BY serviceTicketId orderId
| eval timeElapsed = now() - eventTime

 

1 Solution

bowesmana
SplunkTrust

This is probably a result of trailing spaces on your split-by fields.

Your rex statement for serviceTicketId is greedy, in that it grabs everything:

| rex field=serviceTicketId "\[(?<omsServiceTicketId>.*)\]"

If you do a final

| eval serviceTicketId=":".serviceTicketId.":"

and the same with orderId, you will see whether there is a leading or trailing space.
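
If it does turn out to be stray whitespace, trimming the two split-by fields just before the stats should let those rows group. A minimal sketch (assuming each event carries a single serviceTicketId and orderId value, so a plain trim() is enough):

...
| eval serviceTicketId = trim(serviceTicketId)
| eval orderId = trim(orderId)
| stats count AS eventCount values(_time) AS eventTime values(appId) AS app BY serviceTicketId orderId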

A couple of other points: dc(_time) will count distinct times, but if you have two events at the same time you will only get 1. You should use 'count' instead.

Also, the timeElapsed calculation will not work with the multivalue eventTime field.

You could do this:

...
| stats count AS eventCount min(_time) as firstEvent values(_time) AS eventTime values(appId) AS app BY serviceTicketId orderId
| eval timeElapsed = now() - firstEvent

or, if you want the elapsed time between the first and last event of a particular ticket, do

...
| stats count AS eventCount range(_time) as timeElapsed values(_time) AS eventTime values(appId) AS app BY serviceTicketId orderId

Hope this helps

 


inventsekar
SplunkTrust

Hi @bowesmana ...

>> Your rex statement for serviceTicketId is greedy, in that it grabs everything

| rex field=serviceTicketId "\[(?<omsServiceTicketId>.*)\]"

 

Yes, that is true... one should almost never use a greedy match (".*").

Now then, what's your suggestion about how we should modify this rex? Would something like the sketch below work? (I think we may need the actual logs to refine the rex, right?)
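
For example, just as a sketch (assuming the id sits between square brackets and never itself contains a closing bracket or spaces), the greedy .* could become a character class that stops at the first ], with any padding whitespace left outside the capture:

| rex field=serviceTicketId "\[\s*(?<omsServiceTicketId>[^\]\s]+)\s*\]"

That way the capture cannot pick up stray spaces inside the brackets.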

thanks and best regards,
Sekar

PS - If this or any post helped you in any way, please consider upvoting. Thanks for reading!