Hi,
I have written a script that runs every hour. The 24-hour window is from 07:00 am to 06:00 am the next day.
My requirement is to provide a monthly count for the report, but it should only consider the last log file (06:00 am), which contains all the updated information for the day.
index=XX host=XX source="abc_YYYYMMDD.csv"
| dedup _raw
| fields source A B C | replace abc*.csv WITH * in source
| eval source=strftime(strptime(source,"%Y%m%d"),"%d-%B-%Y")
| eval jobs=A+B+C
| dedup jobs
| stats count(jobs) by source
This displays all the events, but I only want the count from the last log file.
It is not clear what data you are dealing with. Is it that you just want the last event in the log for each A B C combination?
| stats last(jobs) as jobs by source A B C
I am looking for the total number of records in the last log file created.
Example:
07:00 am - log_20210801.csv -- count = 14000
08:00 am - log_20210801.csv -- count = 14500 (file is overwritten)
and so on, until the next day
06:00 am - log_20210801.csv -- count = 17000 (file is overwritten and contains the final updated events for the day)
07:00 am - log_20210802.csv -- count = 16000, and so on, until
06:00 am - log_20210802.csv -- count = 20000 (file is overwritten)
Here the record counts in the last log files are 17000 and 20000. I need only these values. Using this logic, I will run the search over the last 30 days and create a bar chart to show the trend.
If you did this
index=XX host=XX source="abc_20210801.csv"
| stats count
Would you get 14000 at 7am (or just after at least) and 14500 at 8am, or 14000 at 7am and 28500 at 8am?
Assuming you want more than one source to be considered, what do you get if you do this
index=XX host=XX source="abc_*.csv"
| stats count by source
The query displays the total count for all the runs.
Example:
1 Aug - 07 am (count = 14000), 08 am (count = 15000), ... next day 06:00 am (count = 18000)
2 Aug - 07 am (count = 11000), 08 am (count = 12500), ... next day 06:00 am (count = 19500)
... and so on until
30 Aug - 07 am (count = 2000), 08 am (count = 10000), ... next day 06:00 am (count = 12500)
My requirement is to get only the last count of the day, i.e.
1 Aug (18000), 2 Aug (19500) ... 30 Aug (12500)
What is the search you are using to get these results?
index=XXX host=XXX sourcetype=XXX source=<sourcefilepath>/InformationtoSplunk*
| dedup _raw
| fields source A B C D
| eval Time=source
| replace <sourcefilepath>/InformationtoSplunk* .csv WITH * in Time
| search Time!="*_*"
| eval Time_tmp=strftime(strptime(Time,"%Y%m%d"),"%Y%m")
| eval cur=strftime(now(),"%Y%m")
| where Time_tmp=cur
| eval Time=strptime(Time,"%Y%m%d")+1
| eval _time=Time
| eval jobs=B+C+D
| dedup jobs
| timechart span=1d count(jobs)
| eval Threshold=20000
I don't understand how this search gives you the results you have. Can you share some raw events from your csv files?
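For reference, here is a minimal sketch of one way to count only the final load of each file. It rests on two assumptions that nothing in this thread confirms: that each hourly run re-indexes the whole file (so events from every run are present under the same source), and that all events from one run are indexed within the same clock hour. The field names load_hour and last_load_hour are made up for illustration; index=XX, host=XX and abc_*.csv are the placeholders used earlier in the thread.

```
index=XX host=XX source="abc_*.csv"
| eval load_hour=floor(_indextime/3600)
| eventstats max(load_hour) as last_load_hour by source
| where load_hour=last_load_hour
| stats count by source
```

The eval buckets every event by the hour in which it was indexed, eventstats finds the latest bucket for each source, and the where clause drops events from earlier runs before counting. If a run can straddle an hour boundary, or if Splunk only indexes the lines appended since the previous run, this would not give the intended numbers, which is why the raw events would still help.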