Solved: Why is my search with multiple rex commands resulting in repetitive values instead of distinct results?

pinalshah341 · ‎10-29-2015

I have a search:

index="production"  [search source="port-120" "Decision Received: REJECT"| fields x_reqid] | rex field=_raw "Req Id:(?<req_id>.*)" | rex field=_raw "cust ID :(?<cust_id>.*)" | table x_reqid,req_id,cust_id | sort -_time

My log statement:

x-reqid=247-64d-4c4-5d2043 Decision Received: REJECT
x-reqid=247-64d-4c4-5d2043 Req Id:4461015602805000002477
x-reqid=247-64d-4c4-5d2043 cust ID : abc@g.com

I want a table output with three columns: x_reqid, req_id, and cust_id. However, the above query gives me repeated x_reqid values instead of distinct results. Please help.

Richfez · ‎10-30-2015

The editor may have eaten some characters there, and what is it you are actually wanting to do? To create a list of rejects, all bundled together, I'd do this:

index="production" source="port-120" |
rex field=_raw "Decision Received:\s(?<decision>.*)" |
rex field=_raw "Req Id:(?<req_id>\d*)" |
rex field=_raw "cust ID : (?<cust_id>.*)" |
transaction x_reqid |
search decision="REJECT" |
sort -_time |
table x_reqid,req_id,cust_id

Take each line, get your fields out. Then create a transaction out of each one, combining rows where x-reqid is equal and turning those sets of rows into a single event. Then filter that to only include the REJECT ones, then fiddle with your table.

Try that and see if it gives you a fundamentally useful output that you can build on.

BTW, if you can time-bound the beginning and the ending of that little transaction (say you know the first line and the last line in each group will never be more than a couple of minutes apart), I'd do so with transaction x_reqid maxspan=5m or something like that. That makes it VASTLY more efficient. Even maxspan=1h or 12h would help a lot, depending on what time frames you run over.
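
As a rough sketch of that time-bounded version: the 5m value is only an assumption about how long one request takes to log all of its lines, and x_reqid is assumed to be the search-time field name Splunk derives from x-reqid in the raw event.

```
index="production" source="port-120"
| rex field=_raw "Decision Received:\s(?<decision>.*)"
| rex field=_raw "Req Id:(?<req_id>\d*)"
| rex field=_raw "cust ID : (?<cust_id>.*)"
| transaction x_reqid maxspan=5m
| search decision="REJECT"
| table x_reqid, req_id, cust_id
```

maxspan caps the time between the first and last event of a transaction, so lines for the same x_reqid that are further apart than that end up in separate transactions. Pick a value comfortably larger than your slowest request.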

Runals · ‎10-30-2015

So you could use the transaction command to turn all of those into one event based on the common x-reqid field, but I think that will add unnecessary overhead depending on the data volume and timeframe of your searches (no offense to rich7177). The table command simply lists the fields as it sees them on each event, which is why you are getting the duplicates (3 by your example). I'd probably go with stats. The rex commands you have seem off, but that might be a byproduct of the copy/paste job and not tagging those bits of your question as 'code'.
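
To see the duplication in isolation, here is a small self-contained sketch that fakes the three log lines with makeresults. The x_reqid rex is only there because the fake events get no automatic field extraction; in the real index that field is assumed to exist already.

```
| makeresults count=3
| streamstats count as n
| eval _raw=case(n=1, "x-reqid=247-64d-4c4-5d2043 Decision Received: REJECT", n=2, "x-reqid=247-64d-4c4-5d2043 Req Id:4461015602805000002477", n=3, "x-reqid=247-64d-4c4-5d2043 cust ID : abc@g.com")
| rex field=_raw "x-reqid=(?<x_reqid>\S+)"
| rex field=_raw "Req Id:(?<req_id>\d+)"
| rex field=_raw "cust ID : (?<cust_id>\S+)"
| table x_reqid, req_id, cust_id
```

This should come back as three rows that all carry the same x_reqid, with req_id filled in on only one row and cust_id on only one other, because each rex can only match on the event that contains its text.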

I might do something like

index="production" [search index=production source="port-120" "Decision Received: REJECT"| fields x_reqid] | rex field=_raw "Req Id:(?<req_id>\d+)" | rex field=_raw "cust ID : (?<cust_id>.+)" | stats min(_time) as _time by x_reqid,req_id,cust_id | sort -_time
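
One caveat, hedged: if each log line is indexed as its own event (which the three duplicate rows suggest), then req_id and cust_id never appear on the same event, and stats leaves out events where any of the by fields is missing. Under that assumption, a variant that groups by x_reqid alone and collects the other two fields with values() may be closer to what you want. This is a sketch, not something tested against your data:

```
index="production" [search index="production" source="port-120" "Decision Received: REJECT" | fields x_reqid]
| rex field=_raw "Req Id:(?<req_id>\d+)"
| rex field=_raw "cust ID : (?<cust_id>\S+)"
| stats min(_time) as _time, values(req_id) as req_id, values(cust_id) as cust_id by x_reqid
| sort -_time
```

That gives one row per x_reqid, with _time set to the earliest event in the group.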

Actually, if the decision field is at all important, and depending on the overall use case, I'd bake your fields into the backend so they wouldn't have to be extracted at search time. At any rate, you could create an interactive dashboard pretty easily with the data if there are other states for Decision Received. For what I'll put below you lose some efficiency but /shrug

index="production" source="port-120"  | rex field=_raw "Req Id:(?<req_id>\d+)" | rex field=_raw "cust ID : (?<cust_id>.+)" | rex "Decision Received: (?<decision>\w+)" | stats min(_time) as _time by x_reqid,req_id,cust_id decision | where decision="REJECT" | sort -_time

The idea would be to replace the | where decision="REJECT" bit with a token that is supplied by a dropdown in a dashboard.
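
A rough sketch of what the dashboard search might look like with a token in place of the hard-coded value. The token name decision_tok is made up for illustration and would have to match whatever the dropdown input sets. Grouping by x_reqid alone with values() is also an assumption, made so that groups are not dropped when the fields sit on separate events:

```
index="production" source="port-120"
| rex field=_raw "Req Id:(?<req_id>\d+)"
| rex field=_raw "cust ID : (?<cust_id>\S+)"
| rex field=_raw "Decision Received: (?<decision>\w+)"
| stats min(_time) as _time, values(req_id) as req_id, values(cust_id) as cust_id, values(decision) as decision by x_reqid
| search decision="$decision_tok$"
| sort -_time
```

The dashboard substitutes the selected value for $decision_tok$ before the search runs, so the same panel can show REJECT or any other decision state.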

Richfez · ‎11-16-2015

My brain heads to transaction-land too soon. Which is a shame, because I think the stats you did is a better answer in this case - certainly faster and more efficient, possibly better output as well.

pinalshah341 · ‎11-02-2015

Thank you for your answer.
I just used the transaction command in my existing query and it worked:

index="production"  [search source="port-120" "Decision Received: REJECT"| fields x_reqid] "cust ID" OR "Req Id" | rex field=_raw "Req Id:(?<req_id>.*)" | rex field=_raw "cust ID :(?<cust_id>.*)" |transaction x_reqid | table x_reqid,req_id,cust_id | sort -_time
