Solved: Re: How to have 'count as' and also 'count by' for...

jcunningham_con · ‎06-22-2017

The following query should be intuitive enough to see what am trying to do. This query will list Success_file field values as desired, however the eval will fail. On the other hand, if I replace 'count by' with 'count as' and include values(Success_File) then the eval will work correctly, but the grouping does not and thus duplicate field values are not listed etc. Is there a way to do both so that the below eval command will succeed?

index=prod sourcetype=esb ("bulk process for file" AND "xxx.yyy")
| rex field=_raw "bulk process for file: (?<Success_File>.*)" 
| replace "*is successful completed" with * in Success_File
| stats count by Success_File
| eventstats sum(count) as Success_Count
| eval Success_File = if(Success_Count=0, "No Success File Today", Success_File) 
| fields Success_File

DalJeanis · ‎06-22-2017

First, ALWAYS rename count using count as. It saves all kinds of mistakes.

Second, If there are no records, then count by produces nothing. You can add a record a number of ways, but in this case using appendpipe is an easy choice.

 index=prod sourcetype=esb ("bulk process for file" AND "xxx.yyy")
 | rex field=_raw "bulk process for file: (?<Success_File>.*)" 
 | replace "*is successful completed" with * in Success_File
 | stats count as mycount by Success_File

 | rename COMMENT as "Now add a record for no files, but only if there are no records to this point."
 | appendpipe [| stats count as mycount | where cmyount==0 | eval Success_File = "No Success Files Today" ] 

 | rename COMMENT as "... and eliminate everything but the file names, since there can be only one."
 | fields Success_File

It also seems like the first chunk could be replaced by ...

 index=prod sourcetype=esb ("bulk process for file" AND "xxx.yyy")
 | rex field=_raw "bulk process for file: (?<Success_File>.*?) is successful(ly)? completed" 
 | stats count as mycount by Success_File

View solution in original post

DalJeanis · ‎06-22-2017

First, ALWAYS rename count using count as. It saves all kinds of mistakes.

Second, If there are no records, then count by produces nothing. You can add a record a number of ways, but in this case using appendpipe is an easy choice.

 index=prod sourcetype=esb ("bulk process for file" AND "xxx.yyy")
 | rex field=_raw "bulk process for file: (?<Success_File>.*)" 
 | replace "*is successful completed" with * in Success_File
 | stats count as mycount by Success_File

 | rename COMMENT as "Now add a record for no files, but only if there are no records to this point."
 | appendpipe [| stats count as mycount | where cmyount==0 | eval Success_File = "No Success Files Today" ] 

 | rename COMMENT as "... and eliminate everything but the file names, since there can be only one."
 | fields Success_File

It also seems like the first chunk could be replaced by ...

 index=prod sourcetype=esb ("bulk process for file" AND "xxx.yyy")
 | rex field=_raw "bulk process for file: (?<Success_File>.*?) is successful(ly)? completed" 
 | stats count as mycount by Success_File

jcunningham_con · ‎06-23-2017

Thanks for the help, makes sense now. Some additional info about the problem is that there is a file processed potentially daily and currently we are doing manual checks. Instead, we are setting up a Splunk alert to provide a report for each hop of the file processing. If the file was successfully transferred; display file name. If there was a corresponding error file created, display the name of it and how many error records there were, and etc. Following is scrubbed query for two of the hops:

index=prod sourcetype=esb ("bulk process for file" AND "xxx.yyyy")
| rex field=_raw "bulk process for file: (?<esb_Success_File>.*?) is successful(ly)? completed" 
| stats count as mycount by esb_Success_File
| rename COMMENT as "Now add a record for no files, but only if there are no records to this point."
  | appendpipe [| stats count as mycount | where mycount==0 | eval esb_Success_File = "No Success Files at Hop2 Today" ] 
  | rename COMMENT as "... and eliminate everything but the file names, since there can be only one."

| appendcols [search index="prod" sourcetype="esb" ("Error file" AND "xxx.yyyyz*") 
| rex field=_raw "Error file (?<esb_Error_File>.*)" 
| rex mode=sed field=esb_Error_File "s/records: (\d+)/{records}/" 
| rex mode=sed field=esb_Error_File "s/out of (\d+)/{out of total}/"
| replace "/xxx/yyyy/dddd/hhhhhh/ffffff/* is created with total error {records} {out of total} records" with * in esb_Error_File
| rex field=_raw "is created with total error records: (?<esb_Error_Records>.*)"
| stats count as mycount by esb_Error_File esb_Error_Records 
| appendpipe [| stats count as mycount | where mycount==0 | eval esb_Error_File = "No Error Files at Hop2 Today", esb_Error_Records = "N/A" ]]

| appendcols [search index="prod" sourcetype="file_transfer" ("the new filename is" AND "xxx.yyyy")
| rex field=_raw "the new filename is (?<FT_Success_File>xxx.*)"
| stats count as mycount by FT_Success_File 
| appendpipe [| stats count as mycount | where mycount==0 | eval FT_Success_File = "File Not Sent from Hop1 during this time"]]

| appendcols [search index="prod" sourcetype="file_transfer" ("xxx.yyyy" AND "Successful transfer")
| rex field=_raw "File=/xxxx/yyyyyyy/ddd.fff.vvv.(?<FT_Transfer_Status>xxx.yyyy.*\n.*)"
| stats count as mycount by FT_Transfer_Status
| appendpipe [| stats count as mycount | where mycount==0 | eval FT_Transfer_Status = "File Not Sent from Hop1 during this time"]]

| fields FT_Success_File FT_Transfer_Status esb_Success_File esb_Error_File esb_Error_Records

woodcock · ‎06-23-2017

So are you done or looking for more help (a different answer)?

jcunningham_con · ‎06-23-2017

It's done/working. I am a noob here, meant to post as a comment to previous answer 🙂

woodcock · ‎06-23-2017

Converted.

woodcock · ‎06-22-2017

I totally do not get the problem but maybe the problem is that you have values for Success_File that sometimes don't exist and you would like it to always exists but to show "count=0" instead of disappearing. Is that it? If not, then you need to back way up and explain it in a different way, because I don't get it.

DalJeanis · ‎06-22-2017

The way I read the question and the code, @woodcock and @somesoni2, OP has different success file names, there will only ever be one count per file, and OP wants a single record to say "no files" if there are none.

somesoni2 · ‎06-22-2017

After stats command, you'll have different Success_File and corresponding count.

Success_File count
..............................
file1   count1
file2   count2

The eventstats will add a column with sum of all count values and will never be 0. What exactly is your requirement here? Do you want to show a rows with Success_File="No Success File Today" with count=0 when your search doesn't return any data (no files being processed)? If yes, try this

index=prod sourcetype=esb ("bulk process for file" AND "xxx.yyy")
 | rex field=_raw "bulk process for file: (?<Success_File>.*)is successful completed" 
 | stats count by Success_File
 | appendpipe[ | stats count | where count=0 | eval Success_File="No Success File Today"]

How to have 'count as' and also 'count by' for one search

Webinar Recap | Revolutionizing IT Operations: The Transformative Power of AI and ML ...

.conf24 | Registration Open!

ICYMI - Check out the latest releases of Splunk Edge Processor