Strip datetime and group by filename

raghul725
Explorer

Hello,


I have the following logs from Cron


File successfully sent - AllOpenItemsPT_YYYYMMDD_HR-MM.csv.zip @08:00

File successfully sent - AllOpenItemsPT_YYYYMMDD_HR-MM.csv.zip @10:15

File successfully sent - AllOpenItemsPT_YYYYMMDD_HR-MM.csv.zip @11:00

 

File successfully sent - AllOpenItemsMaint_YYYYMMDD_HR-MM.csv.zip @07:00

File successfully sent - AllOpenItemsMaint_YYYYMMDD_HR-MM.csv.zip @09:00

File successfully sent - AllOpenItemsMaint_YYYYMMDD_HR-MM.csv.zip @13:00


File successfully sent - AllOpenItemsCOUNTRYNAME_YYYYMMDD_HR-MM.csv.zip @12:00

File successfully sent - AllOpenItemsCOUNTRYNAME_YYYYMMDD_HR-MM.csv.zip @14:30

File successfully sent - AllOpenItemsCOUNTRYNAME_YYYYMMDD_HR-MM.csv.zip @17:20

 

 

I am trying to group the files by name, based on the "AllOpenItems" string, for the last 24 hours, and tried the search below.

 

index=* namespace=* "File successfully sent -"
| rex "File successfully sent - AllOpenItems(?<reptype>\w+)"
| stats values(reptype) as ReportType by reptype


The problem with the above is that I am unable to strip the date and time from the file name, so it won't group the way I need.

Could someone assist please?


richgalloway
SplunkTrust

See if this helps.  It breaks out the filename and time separately.

 

index=* namespace=* "File successfully sent -"
| rex "File successfully sent - AllOpenItems(?<reptype>\S+)"
| rex field=reptype "_(?<time>[^\.]+)"
| eval epoch=strptime(time,"%Y%m%d_%H-%M")
|stats values(reptype) as ReportType by epoch
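
A note on the format string: in strptime, %M is the minute, while the month and day of month are %m and %d. Below is a quick run-anywhere sketch for checking a format against a single file name. The sample value is made up for illustration and is not taken from the real logs, and the comment macro is used the same way as elsewhere in this thread.

```spl
| makeresults
| eval reptype="PT_20200610_08-00.csv.zip"
`comment("Above is a hypothetical sample value, not real log data")`
| rex field=reptype "_(?<time>[^\.]+)"
| eval epoch=strptime(time, "%Y%m%d_%H-%M")
| eval check=strftime(epoch, "%Y-%m-%d %H:%M")
| table reptype time epoch check
```

If the format matches, check should echo the date and time from the file name. If epoch comes back empty, the format does not match the extracted text, and a stats with "by epoch" will have nothing to group on.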

 

---
If this reply helps you, Karma would be appreciated.

raghul725
Explorer

Thanks, but as soon as I add "by epoch", the search says no events were returned.

I guess grouping by epoch does not apply here, because epoch is built from the full date and time in each file name.

So none of the files can be grouped on that?


richgalloway
SplunkTrust
Before we go further down the wrong rabbit hole, what exactly do you want the output to look like?

raghul725
Explorer

Sure: group by file name without the date and time (for example AllOpenItemsPT, AllOpenItemsMaint, etc.) and display the count.
Thinking about it, it may be useful to display the date and time as well, but then my previous group-by won't work.

 


richgalloway
SplunkTrust

Depending on what you mean by "file name without date&time", this may help.  It ignores the part of the file name after (and including) the first underscore character.

index=* namespace=* "File successfully sent -"
| rex "File successfully sent - AllOpenItems(?<reptype>[^_]+)"
| stats count as ReportCount by reptype

   


raghul725
Explorer

Hello Again,

 

As soon as I group "by reptype", no events are returned.

Any other suggestions please? 


richgalloway
SplunkTrust

It works fine in this run-anywhere example.

| makeresults annotate=t
| eval data="File successfully sent - AllOpenItemsPT_YYYYMMDD_HR-MM.csv.zip @08:00|File successfully sent - AllOpenItemsPT_YYYYMMDD_HR-MM.csv.zip @10:15|File successfully sent - AllOpenItemsPT_YYYYMMDD_HR-MM.csv.zip @11:00|File successfully sent - AllOpenItemsMaint_YYYYMMDD_HR-MM.csv.zip @07:00|File successfully sent - AllOpenItemsMaint_YYYYMMDD_HR-MM.csv.zip @09:00|File successfully sent - AllOpenItemsMaint_YYYYMMDD_HR-MM.csv.zip @13:00|File successfully sent - AllOpenItemsCOUNTRYNAME_YYYYMMDD_HR-MM.csv.zip @12:00|File successfully sent - AllOpenItemsCOUNTRYNAME_YYYYMMDD_HR-MM.csv.zip @14:30|File successfully sent - AllOpenItemsCOUNTRYNAME_YYYYMMDD_HR-MM.csv.zip @17:20"
| eval data=split(data,"|")
| mvexpand data
`comment("Above sets up test data")`
| rex field=data "File successfully sent - AllOpenItems(?<reptype>[^_]+)"
| stats count as ReportCount by reptype

raghul725
Explorer

I see what the confusion could be. Sorry, I should have been clearer.

When I said 

File successfully sent - AllOpenItemsPT_YYYYMMDD_HR-MM.csv.zip @08:00

File successfully sent - AllOpenItemsPT_YYYYMMDD_HR-MM.csv.zip @10:15

File successfully sent - AllOpenItemsPT_YYYYMMDD_HR-MM.csv.zip @11:00

 

 

The time after the @ is not in the logs. It was just an example to show that the job runs around that time.

So, to put it simply, the logs would look like

File successfully sent - AllOpenItemsPT_YYYYMMDD_HR-MM.csv.zip

Would your suggestion still work please?


richgalloway
SplunkTrust
Removing the times changes nothing. Did you try my test query?

raghul725
Explorer

Yes, I tried your test query and it works.

I also tried it without the time after the @. I expected that to work, and it did.

OK, now:

If I run the query against my actual logs (that is, without passing the log lines in via the query), I see 31 events for yesterday, but the Statistics tab shows 0.

As before, if I remove "by reptype", it returns a count of 31 under Statistics.

 

 

 

 


richgalloway
SplunkTrust
The stats command will return no results if the field used in the 'by' clause is empty. Check that the field in the 'by' clause is spelled correctly and that it has values.
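
One way to check this is to count how many events actually got a value from the rex. This is only a sketch: it reuses the base search and pattern from earlier in the thread, and extracted is a helper field name made up for this check. Note that rex without field= runs against _raw, so if the message text lives in a different field, or the spacing or dash in the real events differs from the pattern, nothing will be extracted.

```spl
index=* namespace=* "File successfully sent -"
| rex "File successfully sent - AllOpenItems(?<reptype>[^_]+)"
`comment("extracted is a hypothetical helper field, not part of the original query")`
| eval extracted=if(isnotnull(reptype), "yes", "no")
| stats count by extracted
```

If every event lands under "no", the rex is not matching the events, and that would explain why grouping by reptype returns nothing.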

ChrisH
Explorer

Assuming that the strings above are source/filename (or another field) within the events, would the below work for you?  This should at least give you fields from the string that you can use with your stats command.

| makeresults
| eval source="File successfully sent - AllOpenItemsPT_20200610_08-00.csv.zip @08:00"
| append 
    [| makeresults 
    |  eval source="File successfully sent - AllOpenItemsPT_20200610_10-15.csv.zip @10:15"]
| append 
    [| makeresults 
    |  eval source="File successfully sent - AllOpenItemsPT_20200610_11-00.csv.zip @11:00"]
| append 
    [| makeresults 
    |  eval source="File successfully sent - AllOpenItemsMaint_20200610_07-00.csv.zip @07:00"]
| append 
    [| makeresults 
    |  eval source="File successfully sent - AllOpenItemsMaint_20200610_09-00.csv.zip @09:00"]
| append 
    [| makeresults 
    |  eval source="File successfully sent - AllOpenItemsMaint_20200610_13-00.csv.zip @13:00"]
| append 
    [| makeresults 
    |  eval source="File successfully sent - AllOpenItemsUS_20200610_12-00.csv.zip @12:00"]
| append 
    [| makeresults 
    |  eval source="File successfully sent - AllOpenItemsUS_20200610_14-30.csv.zip @14:30"]
| append 
    [| makeresults 
    |  eval source="File successfully sent - AllOpenItemsUS_20200610_17-20.csv.zip @17:20"]
| rex field=source "(?<prefix>File successfully sent - AllOpenItems)((?<description>[^_]*)_(?<dt>[^_]*)_(?<tm>\d{2}-\d{2}))"
| table _time, prefix, description, dt, tm
| stats values(description) as description
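
To get a count per report type and still see the dates and times, the final stats could be swapped for something like the sketch below. It is a shortened run-anywhere version with three of the sample strings; the output field names ReportCount, Dates and Times are just illustrative choices, and the rex is a simplified variant of the one above.

```spl
| makeresults
| eval source=split("File successfully sent - AllOpenItemsPT_20200610_08-00.csv.zip|File successfully sent - AllOpenItemsPT_20200610_10-15.csv.zip|File successfully sent - AllOpenItemsMaint_20200610_07-00.csv.zip", "|")
| mvexpand source
`comment("Above sets up test data")`
| rex field=source "AllOpenItems(?<description>[^_]+)_(?<dt>\d{8})_(?<tm>\d{2}-\d{2})"
| stats count as ReportCount values(dt) as Dates values(tm) as Times by description
```

This gives one row per description (PT, Maint, and so on) with its count, plus the distinct dates and times seen for that report type.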

 


mayurr98
Super Champion

Could you please tell us what your output should look like for those sample logs?


raghul725
Explorer

Sure: group by file name without the date and time (for example AllOpenItemsPT, AllOpenItemsMaint, etc.) and display the count.
Thinking about it, it may be useful to display the date and time as well, but then my previous group-by won't work.

 
