Splunk Search

Using stats instead of transaction

sonicZ
Contributor

Hello,

Looking for some assistance in reconstructing my query, which is currently using | transaction with a traceId value to tie together a couple different sourcetypes/sources.

My query runs really slowly. Some of the sourcetypes return results in the 200 million range, so I am looking to speed it up by using | stats by <traceId> instead.

The first source example shows the traceId and the 404 response code I am looking for.

time=2021-12-11T23:59:51-07:00 time_ms=2021-12-11T23:59:51-07:00.620+ requestId=-1796576042 traceId=-1796576042 servicePath="/nationalnavigation/" remoteAddr=x.x.x.x clientIp=x.x.x.xclientAppVersion=NOT_AVAILABLE clientDeviceType=NOT_AVAILABLE app_version=- apiKey=somekey oauth_leg=2-legged authMethod=oauth apiAuth=true apiAuthPath=/ oauth_version=1.0 target_bg=default requestHost=services.timewarnercable.com requestPort=8080 requestMethod=GET requestURL="/nationalnavigation/V1/symphoni/event/tmsid/blah.com::TVNF0321206000538347?division=FTWR&lineup=15&profile=sg_v1&cacheID=959&longAdvisory=false&vodId=fort_worth&tuneToChannel=false&watchLive=true&watchOnDemand=true&rtReviewsLimit=0&includeAdult=f" requestSize=835 responseStatus=404 responseSize=420 responseTime=0.405 userAgent="Java/1.xxx" mapTEnabled="F" cClientIp="V-1|IP-x.x.x.x|SourcePort-12345|TrafficOriginID-x.x.x.x" sourcePort="12345" appleEgressEnabled="F" oauth_consumer_key="somekey" x_pi_auth_failure="-" pi_log="pi_ngxgw_access"

The second source example shows the REST server logs with an exception.

2021-12-11 23:59:51,261 ERROR [qtp1647496677-7239] [-1796576042] [c.t.a.n.r.s.r.s.SymphoniRestServiceBroker.handleNnsServiceErrorHeaders:1363] An internal service error occurred: com.twc.atgw.nationalnavigation.SymphoniWebException: Event Not Found

Here's the current query I am looking to improve.

 

 

index=vap sourcetype=nns_all OR sourcetype=pi_ngxgw_access "nationalnavigation.SymphoniWebException: Event Not Found" OR "responseStatus=404"
| rex "\] \[(?<traceId>.+)\] \[c.t.a.n.r.s.r.s"
| transaction keepevicted=true by traceId
| search "nationalnavigation.SymphoniWebException: Event Not Found" AND "responseStatus=404"
| mvexpand requestURL
| search requestURL="/nationalnavigation/V1/symphoni/series/tmsproviderprogramid*" OR "/nationalnavigation/V1/symphoni/event/tmsid*"
| eval requestURLLength=len(requestURL)
| rex field=requestURL "/nationalnavigation/V1/symphoni/event/tmsid/.*\%3A\%3A(?<queryString>.+)"
| eval endpoint=case(match(requestURL,"/nationalnavigation/V1/symphoni/series/tmsproviderprogramid*"), "/nationalnavigation/V1/symphoni/series/tmsproviderprogramid",
match(requestURL,"/nationalnavigation/V1/symphoni/event/tmsid*"), "/nationalnavigation/V1/symphoni/event/tmsid",1=1,requestURL)
| rex field=queryString "(?<tmsIds>[^?]*)"
| rex field=queryString "(?<tmsProviderProgramIds>[^?]*)"
| eval assetIds=coalesce(tmsIds,tmsProviderProgramIds)
| eval assetCount=mvcount(split(assetIds,","))
| stats count AS TxnCount by endpoint

 

 

 


ITWhisperer
SplunkTrust

The search commands don't make sense: the first will eliminate both of your example events, since neither contains both of these strings, and even without the first search, the second will eliminate all the REST log events, since they don't appear to contain matching strings.

The rex that extracts the query string doesn't make sense either, since it doesn't match your example.

The rex commands that extract tmsIds and tmsProviderProgramIds don't make sense, since both do the same thing: they effectively copy the query string (which presumably has already been extracted?).

0 Karma

sonicZ
Contributor

EDIT: Oops, for some reason I had removed the | transaction from my initial post. Sorry if this was misleading.

Hey ITWhisperer,

Thanks for responding. The query is working as expected, it just takes forever.

The first part of the search includes an OR, so Splunk finds the 404 in the access log event (the first event below) and the "Event Not Found" exception in the REST log event (the second event below). The transaction command combines them into a single event like the one below. They come from two different source files, with only the traceId as the unifying parameter to query on (284461955 in this case).

The second search, with the AND, then keeps only the events that the transaction has combined.
Here's another example of a combined event from the transaction. I think the post was stripping parts of the results.

results

 

 

 

 

time=2021-12-29T21:59:49+00:00 time_ms=2021-12-29T21:59:49.211+00:00 requestId=284461955 traceId=284461955 servicePath="/nationalnavigation/" remoteAddr=x.x.x.x clientIp=x.x.x.x clientAppVersion=NOT_AVAILABLE clientDeviceType=NOT_AVAILABLE app_version=- apiKey=x oauth_leg=2-legged authMethod=oauth apiAuth=true apiAuthPath=/ oauth_version=1.0 target_bg=default requestHost=services.timewarnercable.com requestPort=8080 requestMethod=GET requestURL="/nationalnavigation/V1/symphoni/event/tmsid/x.com::CCDN4200000005529014?division=BUF&lineup=354&profile=sg_v1&cacheID=439&longAdvisory=false&vodId=BUF&tuneToChannel=false&watchLive=true&watchOnDemand=true&rtReviewsLimit=0&includeAdult=true" requestSize=825 responseStatus=404 responseSize=418 responseTime=0.173 userAgent="Java/1.8.0_232" mapTEnabled="F" charterClientIp="V-1|IP-x.x.x.x|SourcePort-41098|TrafficOriginID-x.x.x.x" sourcePort="x" appleEgressEnabled="F" oauth_consumer_key="x" x_pi_auth_failure="-" pi_log="pi_ngxgw_access" 

2021-12-29 14:59:49,202 ERROR [qtp115457323-2259] [284461955] [c.t.a.n.r.s.r.s.SymphoniRestServiceBroker.handleNnsServiceErrorHeaders:1365] An internal service error occurred: 
com.twc.atgw.nationalnavigation.SymphoniWebException: Event Not Found

 

 

As for <queryString>, it basically captures everything past the "::" in the request URL, so this part:

CCDN4200000005529014?division=BUF&lineup=354&profile=sg_v1&cacheID=439&longAdvisory=false&vodId=BUF&tuneToChannel=false&watchLive=true&watchOnDemand=true&rtReviewsLimit=0&includeAdult=true"

Then tmsIds is created by grabbing everything before the "?":

| rex field=queryString "(?<tmsIds>[^?]*)"

So it will grab "CCDN4200000005529014" in this example.
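
These two extractions can be checked without touching the index. Below is a minimal sketch built on makeresults. It assumes the URL contains a literal "::" as in the example event (the rex in the original query expects the URL-encoded form %3A%3A instead), and the shortened requestURL value is made up for illustration.

````spl
| makeresults
``` Hypothetical, shortened requestURL modelled on the example event ```
| eval requestURL="/nationalnavigation/V1/symphoni/event/tmsid/x.com::CCDN4200000005529014?division=BUF&lineup=354"
``` Capture everything after the "::" ```
| rex field=requestURL "/nationalnavigation/V1/symphoni/event/tmsid/.*::(?<queryString>.+)"
``` Capture everything before the first "?" ```
| rex field=queryString "(?<tmsIds>[^?]*)"
| table requestURL queryString tmsIds
````

With this input, queryString should hold the part of the URL after "::" and tmsIds should hold only the ID in front of the "?".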

Really, the first part is the most important: I am trying to use | stats instead of | transaction, as it's super slow.

index=vap sourcetype=nns_all OR sourcetype=pi_ngxgw_access "nationalnavigation.SymphoniWebException: Event Not Found" OR "responseStatus=404"
| rex "\] \[(?<traceId>.+)\] \[c.t.a.n.r.s.r.s"
| transaction keepevicted=true by traceId
| search "nationalnavigation.SymphoniWebException: Event Not Found" AND "responseStatus=404"

 

0 Karma

ITWhisperer
SplunkTrust
| makeresults
| eval _raw="time=2021-12-29T21:59:49+00:00 time_ms=2021-12-29T21:59:49.211+00:00 requestId=284461955 traceId=284461955 servicePath=\"/nationalnavigation/\" remoteAddr=x.x.x.x clientIp=x.x.x.x clientAppVersion=NOT_AVAILABLE clientDeviceType=NOT_AVAILABLE app_version=- apiKey=x oauth_leg=2-legged authMethod=oauth apiAuth=true apiAuthPath=/ oauth_version=1.0 target_bg=default requestHost=services.timewarnercable.com requestPort=8080 requestMethod=GET requestURL=\"/nationalnavigation/V1/symphoni/event/tmsid/x.com::CCDN4200000005529014?division=BUF&lineup=354&profile=sg_v1&cacheID=439&longAdvisory=false&vodId=BUF&tuneToChannel=false&watchLive=true&watchOnDemand=true&rtReviewsLimit=0&includeAdult=true\" requestSize=825 responseStatus=404 responseSize=418 responseTime=0.173 userAgent=\"Java/1.8.0_232\" mapTEnabled=\"F\" charterClientIp=\"V-1|IP-x.x.x.x|SourcePort-41098|TrafficOriginID-x.x.x.x\" sourcePort=\"x\" appleEgressEnabled=\"F\" oauth_consumer_key=\"x\" x_pi_auth_failure=\"-\" pi_log=\"pi_ngxgw_access\"!2021-12-29 14:59:49,202 ERROR [qtp115457323-2259] [284461955] [c.t.a.n.r.s.r.s.SymphoniRestServiceBroker.handleNnsServiceErrorHeaders:1365] An internal service error occurred: com.twc.atgw.nationalnavigation.SymphoniWebException: Event Not Found" 
| eval event=split(_raw,"!") 
| mvexpand event
| rename event as _raw 
| extract
``` The lines above set up data as per example ```

``` Extract traceId only if match on Exception capturing enf field to signify event not found match ```
| rex "\] \[(?<traceId>.+)\] \[c.t.a.n.r.s.r.s.*nationalnavigation\.SymphoniWebException: (?<enf>Event Not Found)"
``` Gather events by traceId ```
| stats values(*) as * by traceId
``` Eliminate traceIds which don't have Event Not Found ```
| where isnotnull(enf)
| eval requestURLLength=len(requestURL)
``` Modified the following rex to use :: - you may need to change this back if your data really does contain %3A ```
| rex field=requestURL "/nationalnavigation/V1/symphoni/event/tmsid/.*::(?<queryString>.+)"
| eval endpoint=case(match(requestURL,"/nationalnavigation/V1/symphoni/series/tmsproviderprogramid*"), "/nationalnavigation/V1/symphoni/series/tmsproviderprogramid",
match(requestURL,"/nationalnavigation/V1/symphoni/event/tmsid*"), "/nationalnavigation/V1/symphoni/event/tmsid",1=1,requestURL)
``` These two rex extract exactly the same thing so either one is redundant or wrong ```
| rex field=queryString "(?<tmsIds>[^?]*)"
| rex field=queryString "(?<tmsProviderProgramIds>[^?]*)"
| eval assetIds=coalesce(tmsIds,tmsProviderProgramIds)
| eval assetCount=mvcount(split(assetIds,","))
| stats count AS TxnCount by endpoint
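
Applied to the real index rather than the makeresults sample, the same idea might look like the sketch below. It is untested and rests on a few assumptions: traceId is already extracted automatically at search time for the pi_ngxgw_access events (the rex only fills it in for the exception events), the exception message sits on the same line as the bracketed traceId, and each traceId shows up in both sourcetypes. The sourcetypes field name is made up for illustration.

````spl
index=vap (sourcetype=nns_all OR sourcetype=pi_ngxgw_access) ("nationalnavigation.SymphoniWebException: Event Not Found" OR "responseStatus=404")
``` Tag exception events with enf and take traceId from the bracketed value ```
| rex "\] \[(?<traceId>[^\]]+)\] \[c\.t\.a\.n\.r\.s\.r\.s.*nationalnavigation\.SymphoniWebException: (?<enf>Event Not Found)"
``` Keep only the fields needed downstream ```
| fields traceId requestURL enf sourcetype
``` One row per traceId, replacing the transaction ```
| stats values(requestURL) as requestURL values(enf) as enf dc(sourcetype) as sourcetypes by traceId
``` Keep traceIds seen in both sources with the exception present ```
| where isnotnull(enf) AND sourcetypes=2
````

Naming the fields in stats, rather than using values(*), should keep less data per traceId. The remaining eval and rex steps can follow unchanged.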
0 Karma

johnhuang
Motivator

Assuming that both data sources are indexed at around the same time, we can try using streamstats to filter out any traceId not found in both sourcetypes within, say, a 1m window. You also want to optimize the regex so that it throws out non-matching events as soon as possible.

I did not test this, but you can try something like:

index=vap (sourcetype=nns_all OR sourcetype=pi_ngxgw_access) ("nationalnavigation.SymphoniWebException: Event Not Found" OR "responseStatus=404")
| rex "^\d+[^\[]*\[[^\[]*\[(?<traceId>-?\d+)"
| streamstats time_window=1m dc(sourcetype) AS dc_sourcetype by traceId
| where dc_sourcetype=2

 
If you want to keep all the nns_all events:

| where dc_sourcetype=2 OR sourcetype="nns_all"
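
The traceId is negative in one of the example events in this thread and positive in the other, so a pattern with an optional minus sign covers both. A quick sketch to check the extraction against the two REST log lines, using makeresults instead of the index:

````spl
| makeresults
``` Two REST log lines from this thread, joined with "!" and split into separate events ```
| eval _raw="2021-12-11 23:59:51,261 ERROR [qtp1647496677-7239] [-1796576042] [c.t.a.n.r.s.r.s.SymphoniRestServiceBroker.handleNnsServiceErrorHeaders:1363] An internal service error occurred: com.twc.atgw.nationalnavigation.SymphoniWebException: Event Not Found!2021-12-29 14:59:49,202 ERROR [qtp115457323-2259] [284461955] [c.t.a.n.r.s.r.s.SymphoniRestServiceBroker.handleNnsServiceErrorHeaders:1365] An internal service error occurred: com.twc.atgw.nationalnavigation.SymphoniWebException: Event Not Found"
| eval event=split(_raw,"!")
| mvexpand event
| rename event as _raw
``` Optional minus sign so both negative and positive traceIds match ```
| rex "^\d+[^\[]*\[[^\[]*\[(?<traceId>-?\d+)"
| table traceId _raw
````

Because the pattern is anchored on a leading digit, it only applies to the REST log lines. This assumes the access log events, which start with time=, get their traceId from automatic field extraction.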

 

johnhuang
Motivator

Just want to see the size of the resultset we're talking about. Could you run this:

index=vap sourcetype=nns_all OR sourcetype=pi_ngxgw_access "nationalnavigation.SymphoniWebException: Event Not Found" OR "responseStatus=404" earliest=-1h@h
| stats count as event_count by sourcetype

0 Karma

sonicZ
Contributor

Sure here you go, quite a bit in just one hour 🙂

[Screenshot sonicZ_0-1640817600046.png: event counts by sourcetype for the last hour; image not available]

 

0 Karma