Solved: Re: How to have 'count as' and also 'count by' for...

jcunningham_con · ‎06-22-2017

The following query should be intuitive enough to see what am trying to do. This query will list Success_file field values as desired, however the eval will fail. On the other hand, if I replace 'count by' with 'count as' and include values(Success_File) then the eval will work correctly, but the grouping does not and thus duplicate field values are not listed etc. Is there a way to do both so that the below eval command will succeed?

index=prod sourcetype=esb ("bulk process for file" AND "xxx.yyy")
| rex field=_raw "bulk process for file: (?<Success_File>.*)" 
| replace "*is successful completed" with * in Success_File
| stats count by Success_File
| eventstats sum(count) as Success_Count
| eval Success_File = if(Success_Count=0, "No Success File Today", Success_File) 
| fields Success_File

DalJeanis · ‎06-22-2017

First, ALWAYS rename count using count as. It saves all kinds of mistakes.

Second, If there are no records, then count by produces nothing. You can add a record a number of ways, but in this case using appendpipe is an easy choice.

 index=prod sourcetype=esb ("bulk process for file" AND "xxx.yyy")
 | rex field=_raw "bulk process for file: (?<Success_File>.*)" 
 | replace "*is successful completed" with * in Success_File
 | stats count as mycount by Success_File

 | rename COMMENT as "Now add a record for no files, but only if there are no records to this point."
 | appendpipe [| stats count as mycount | where cmyount==0 | eval Success_File = "No Success Files Today" ] 

 | rename COMMENT as "... and eliminate everything but the file names, since there can be only one."
 | fields Success_File

It also seems like the first chunk could be replaced by ...

 index=prod sourcetype=esb ("bulk process for file" AND "xxx.yyy")
 | rex field=_raw "bulk process for file: (?<Success_File>.*?) is successful(ly)? completed" 
 | stats count as mycount by Success_File

View solution in original post

DalJeanis · ‎06-22-2017

First, ALWAYS rename count using count as. It saves all kinds of mistakes.

Second, If there are no records, then count by produces nothing. You can add a record a number of ways, but in this case using appendpipe is an easy choice.

 index=prod sourcetype=esb ("bulk process for file" AND "xxx.yyy")
 | rex field=_raw "bulk process for file: (?<Success_File>.*)" 
 | replace "*is successful completed" with * in Success_File
 | stats count as mycount by Success_File

 | rename COMMENT as "Now add a record for no files, but only if there are no records to this point."
 | appendpipe [| stats count as mycount | where cmyount==0 | eval Success_File = "No Success Files Today" ] 

 | rename COMMENT as "... and eliminate everything but the file names, since there can be only one."
 | fields Success_File

It also seems like the first chunk could be replaced by ...

 index=prod sourcetype=esb ("bulk process for file" AND "xxx.yyy")
 | rex field=_raw "bulk process for file: (?<Success_File>.*?) is successful(ly)? completed" 
 | stats count as mycount by Success_File

jcunningham_con · ‎06-23-2017

Thanks for the help, makes sense now. Some additional info about the problem is that there is a file processed potentially daily and currently we are doing manual checks. Instead, we are setting up a Splunk alert to provide a report for each hop of the file processing. If the file was successfully transferred; display file name. If there was a corresponding error file created, display the name of it and how many error records there were, and etc. Following is scrubbed query for two of the hops:

index=prod sourcetype=esb ("bulk process for file" AND "xxx.yyyy")
| rex field=_raw "bulk process for file: (?<esb_Success_File>.*?) is successful(ly)? completed" 
| stats count as mycount by esb_Success_File
| rename COMMENT as "Now add a record for no files, but only if there are no records to this point."
  | appendpipe [| stats count as mycount | where mycount==0 | eval esb_Success_File = "No Success Files at Hop2 Today" ] 
  | rename COMMENT as "... and eliminate everything but the file names, since there can be only one."

| appendcols [search index="prod" sourcetype="esb" ("Error file" AND "xxx.yyyyz*") 
| rex field=_raw "Error file (?<esb_Error_File>.*)" 
| rex mode=sed field=esb_Error_File "s/records: (\d+)/{records}/" 
| rex mode=sed field=esb_Error_File "s/out of (\d+)/{out of total}/"
| replace "/xxx/yyyy/dddd/hhhhhh/ffffff/* is created with total error {records} {out of total} records" with * in esb_Error_File
| rex field=_raw "is created with total error records: (?<esb_Error_Records>.*)"
| stats count as mycount by esb_Error_File esb_Error_Records 
| appendpipe [| stats count as mycount | where mycount==0 | eval esb_Error_File = "No Error Files at Hop2 Today", esb_Error_Records = "N/A" ]]

| appendcols [search index="prod" sourcetype="file_transfer" ("the new filename is" AND "xxx.yyyy")
| rex field=_raw "the new filename is (?<FT_Success_File>xxx.*)"
| stats count as mycount by FT_Success_File 
| appendpipe [| stats count as mycount | where mycount==0 | eval FT_Success_File = "File Not Sent from Hop1 during this time"]]

| appendcols [search index="prod" sourcetype="file_transfer" ("xxx.yyyy" AND "Successful transfer")
| rex field=_raw "File=/xxxx/yyyyyyy/ddd.fff.vvv.(?<FT_Transfer_Status>xxx.yyyy.*\n.*)"
| stats count as mycount by FT_Transfer_Status
| appendpipe [| stats count as mycount | where mycount==0 | eval FT_Transfer_Status = "File Not Sent from Hop1 during this time"]]

| fields FT_Success_File FT_Transfer_Status esb_Success_File esb_Error_File esb_Error_Records

woodcock · ‎06-23-2017

So are you done or looking for more help (a different answer)?

jcunningham_con · ‎06-23-2017

It's done/working. I am a noob here, meant to post as a comment to previous answer 🙂

woodcock · ‎06-23-2017

Converted.

woodcock · ‎06-22-2017

I totally do not get the problem but maybe the problem is that you have values for Success_File that sometimes don't exist and you would like it to always exists but to show "count=0" instead of disappearing. Is that it? If not, then you need to back way up and explain it in a different way, because I don't get it.

DalJeanis · ‎06-22-2017

The way I read the question and the code, @woodcock and @somesoni2, OP has different success file names, there will only ever be one count per file, and OP wants a single record to say "no files" if there are none.

somesoni2 · ‎06-22-2017

After stats command, you'll have different Success_File and corresponding count.

Success_File count
..............................
file1   count1
file2   count2

The eventstats will add a column with sum of all count values and will never be 0. What exactly is your requirement here? Do you want to show a rows with Success_File="No Success File Today" with count=0 when your search doesn't return any data (no files being processed)? If yes, try this

index=prod sourcetype=esb ("bulk process for file" AND "xxx.yyy")
 | rex field=_raw "bulk process for file: (?<Success_File>.*)is successful completed" 
 | stats count by Success_File
 | appendpipe[ | stats count | where count=0 | eval Success_File="No Success File Today"]

How to have 'count as' and also 'count by' for one search

Adoption of RUM and APM at Splunk

Routing logs with Splunk OTel Collector for Kubernetes

Welcome to the Splunk Community!