Splunk Search

How to have 'count as' and also 'count by' for one search

jcunningham_con
Explorer

The following query should be intuitive enough to see what am trying to do. This query will list Success_file field values as desired, however the eval will fail. On the other hand, if I replace 'count by' with 'count as' and include values(Success_File) then the eval will work correctly, but the grouping does not and thus duplicate field values are not listed etc. Is there a way to do both so that the below eval command will succeed?

index=prod sourcetype=esb ("bulk process for file" AND "xxx.yyy")
| rex field=_raw "bulk process for file: (?<Success_File>.*)" 
| replace "*is successful completed" with * in Success_File
| stats count by Success_File
| eventstats sum(count) as Success_Count
| eval Success_File = if(Success_Count=0, "No Success File Today", Success_File) 
| fields Success_File 
Tags (1)
0 Karma
1 Solution

DalJeanis
Legend

First, ALWAYS rename count using count as. It saves all kinds of mistakes.

Second, If there are no records, then count by produces nothing. You can add a record a number of ways, but in this case using appendpipe is an easy choice.

 index=prod sourcetype=esb ("bulk process for file" AND "xxx.yyy")
 | rex field=_raw "bulk process for file: (?<Success_File>.*)" 
 | replace "*is successful completed" with * in Success_File
 | stats count as mycount by Success_File

 | rename COMMENT as "Now add a record for no files, but only if there are no records to this point."
 | appendpipe [| stats count as mycount | where cmyount==0 | eval Success_File = "No Success Files Today" ] 

 | rename COMMENT as "... and eliminate everything but the file names, since there can be only one."
 | fields Success_File 

It also seems like the first chunk could be replaced by ...

 index=prod sourcetype=esb ("bulk process for file" AND "xxx.yyy")
 | rex field=_raw "bulk process for file: (?<Success_File>.*?) is successful(ly)? completed" 
 | stats count as mycount by Success_File

View solution in original post

DalJeanis
Legend

First, ALWAYS rename count using count as. It saves all kinds of mistakes.

Second, If there are no records, then count by produces nothing. You can add a record a number of ways, but in this case using appendpipe is an easy choice.

 index=prod sourcetype=esb ("bulk process for file" AND "xxx.yyy")
 | rex field=_raw "bulk process for file: (?<Success_File>.*)" 
 | replace "*is successful completed" with * in Success_File
 | stats count as mycount by Success_File

 | rename COMMENT as "Now add a record for no files, but only if there are no records to this point."
 | appendpipe [| stats count as mycount | where cmyount==0 | eval Success_File = "No Success Files Today" ] 

 | rename COMMENT as "... and eliminate everything but the file names, since there can be only one."
 | fields Success_File 

It also seems like the first chunk could be replaced by ...

 index=prod sourcetype=esb ("bulk process for file" AND "xxx.yyy")
 | rex field=_raw "bulk process for file: (?<Success_File>.*?) is successful(ly)? completed" 
 | stats count as mycount by Success_File

jcunningham_con
Explorer

Thanks for the help, makes sense now. Some additional info about the problem is that there is a file processed potentially daily and currently we are doing manual checks. Instead, we are setting up a Splunk alert to provide a report for each hop of the file processing. If the file was successfully transferred; display file name. If there was a corresponding error file created, display the name of it and how many error records there were, and etc. Following is scrubbed query for two of the hops:

index=prod sourcetype=esb ("bulk process for file" AND "xxx.yyyy")
| rex field=_raw "bulk process for file: (?<esb_Success_File>.*?) is successful(ly)? completed" 
| stats count as mycount by esb_Success_File
| rename COMMENT as "Now add a record for no files, but only if there are no records to this point."
  | appendpipe [| stats count as mycount | where mycount==0 | eval esb_Success_File = "No Success Files at Hop2 Today" ] 
  | rename COMMENT as "... and eliminate everything but the file names, since there can be only one."

| appendcols [search index="prod" sourcetype="esb" ("Error file" AND "xxx.yyyyz*") 
| rex field=_raw "Error file (?<esb_Error_File>.*)" 
| rex mode=sed field=esb_Error_File "s/records: (\d+)/{records}/" 
| rex mode=sed field=esb_Error_File "s/out of (\d+)/{out of total}/"
| replace "/xxx/yyyy/dddd/hhhhhh/ffffff/* is created with total error {records} {out of total} records" with * in esb_Error_File
| rex field=_raw "is created with total error records: (?<esb_Error_Records>.*)"
| stats count as mycount by esb_Error_File esb_Error_Records 
| appendpipe [| stats count as mycount | where mycount==0 | eval esb_Error_File = "No Error Files at Hop2 Today", esb_Error_Records = "N/A" ]]

| appendcols [search index="prod" sourcetype="file_transfer" ("the new filename is" AND "xxx.yyyy")
| rex field=_raw "the new filename is (?<FT_Success_File>xxx.*)"
| stats count as mycount by FT_Success_File 
| appendpipe [| stats count as mycount | where mycount==0 | eval FT_Success_File = "File Not Sent from Hop1 during this time"]]

| appendcols [search index="prod" sourcetype="file_transfer" ("xxx.yyyy" AND "Successful transfer")
| rex field=_raw "File=/xxxx/yyyyyyy/ddd.fff.vvv.(?<FT_Transfer_Status>xxx.yyyy.*\n.*)"
| stats count as mycount by FT_Transfer_Status
| appendpipe [| stats count as mycount | where mycount==0 | eval FT_Transfer_Status = "File Not Sent from Hop1 during this time"]]

| fields FT_Success_File FT_Transfer_Status esb_Success_File esb_Error_File esb_Error_Records 
0 Karma

woodcock
Esteemed Legend

So are you done or looking for more help (a different answer)?

0 Karma

jcunningham_con
Explorer

It's done/working. I am a noob here, meant to post as a comment to previous answer 🙂

woodcock
Esteemed Legend

Converted.

woodcock
Esteemed Legend

I totally do not get the problem but maybe the problem is that you have values for Success_File that sometimes don't exist and you would like it to always exists but to show "count=0" instead of disappearing. Is that it? If not, then you need to back way up and explain it in a different way, because I don't get it.

0 Karma

DalJeanis
Legend

The way I read the question and the code, @woodcock and @somesoni2, OP has different success file names, there will only ever be one count per file, and OP wants a single record to say "no files" if there are none.

0 Karma

somesoni2
Revered Legend

After stats command, you'll have different Success_File and corresponding count.

Success_File count
..............................
file1   count1
file2   count2

The eventstats will add a column with sum of all count values and will never be 0. What exactly is your requirement here? Do you want to show a rows with Success_File="No Success File Today" with count=0 when your search doesn't return any data (no files being processed)? If yes, try this

index=prod sourcetype=esb ("bulk process for file" AND "xxx.yyy")
 | rex field=_raw "bulk process for file: (?<Success_File>.*)is successful completed" 
 | stats count by Success_File
 | appendpipe[ | stats count | where count=0 | eval Success_File="No Success File Today"]
Get Updates on the Splunk Community!

Webinar Recap | Revolutionizing IT Operations: The Transformative Power of AI and ML ...

The Transformative Power of AI and ML in Enhancing Observability   In the realm of IT operations, the ...

.conf24 | Registration Open!

Hello, hello! I come bearing good news: Registration for .conf24 is now open!   conf is Splunk’s rad annual ...

ICYMI - Check out the latest releases of Splunk Edge Processor

Splunk is pleased to announce the latest enhancements to Splunk Edge Processor.  HEC Receiver authorization ...