Hi,
I've been working with Splunk since August, after 20 years in SQL, so a lot of this is new to me and I need some help.
I have a daily cron job that runs a search and writes the results to a CSV. The search looks something like this:
index=ans sourcetype="my_source" earliest=1
| stats max(DateLastUpdate) as DateLastUpdate by Ticket
| eval SavedDay=strftime(relative_time(now(),"-1d@d"),"%Y%m%d")
| table SavedDay Ticket DateLastUpdate
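For readers coming from SQL, the SavedDay eval above is worth unpacking: relative_time(now(), "-1d@d") snaps "now" back to midnight of the previous day, so SavedDay is simply yesterday's date formatted as YYYYMMDD. A rough Python equivalent (the run timestamp is illustrative):

```python
from datetime import datetime, timedelta

def saved_day(now):
    # SPL: strftime(relative_time(now(), "-1d@d"), "%Y%m%d")
    # i.e. yesterday's date as YYYYMMDD; the time of day is
    # dropped by the %Y%m%d format, which mimics the @d snap.
    return (now - timedelta(days=1)).strftime("%Y%m%d")

print(saved_day(datetime(2020, 10, 20, 7, 30)))  # -> 20201019
```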
When it ran on 20201019 (YYYYMMDD format), it wrote:
SavedDay | Ticket | DateLastUpdate |
20201019 | A1002 | 20201019 |
20201019 | AD00X | 20200920 |
... | ... | ... |
20201019 | ADVCX | 20200520 |
20201019 | LDS0X | 20200415 |
20201019 | SD10X | 20200118 |
On 20201022, it wrote:
SavedDay | Ticket | DateLastUpdate |
20201022 | PPU0X | 20201022 |
20201022 | SD10X | 20201021 |
20201022 | XX438 | 20201021 |
20201022 | 44$5% | 20201020 |
20201022 | A1002 | 20201019 |
20201022 | AD00X | 20200920 |
... | ... | ... |
20201022 | ADVCX | 20200520 |
20201022 | LDS0X | 20200415 |
The job failed on 20201021 and 20201022, and now I would like to re-execute it to generate the missing rows. But this search:
index=ans sourcetype="my_source" earliest=1
| eval SavedDay=strftime(relative_time(now(),"-1d@d"),"%Y%m%d")
| dedup SavedDay
| table SavedDay
| join type=left SavedDay [| inputlookup "my_lookup.csv"
| dedup SavedDay
| eval ExistsDay = 1]
| where isnull(ExistsDay)| table SavedDay
| join type=inner SavedDay [search index=ans sourcetype="my_source" earliest=1
| eval SavedDay=strftime(relative_time(now(),"-1d@d"),"%Y%m%d")
| ...]
It only returns SavedDay values 20201020 and 20201021, not the rest of the historical data changes.
Another solution using a "cartesian" join works, but I don't like it because it is inefficient in time and cost (around 8 minutes and 1 GB to regenerate 2 days):
index=ans sourcetype="my_source" earliest=1
| eval SavedDay=strftime(relative_time(now(),"-1d@d"),"%Y%m%d")
| dedup SavedDay
| table SavedDay
| join type=left max=0 [| inputlookup "my_lookup.csv"
| dedup SavedDay
| eval ExistsDay = 1]
| where isnull(ExistsDay)| table SavedDay
| join type=inner SavedDay [search index=ans sourcetype="my_source" earliest=1
| eval SavedDay=strftime(relative_time(now(),"-1d@d"),"%Y%m%d")
| ...]
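For what it's worth, the gap-finding step that the left join plus isnull(ExistsDay) performs is just a set difference over a date range. A minimal Python sketch of the same logic (the dates are taken from the example above):

```python
from datetime import date, timedelta

def missing_days(existing, start, end):
    """Return the YYYYMMDD strings in [start, end] absent from `existing`."""
    days = []
    d = start
    while d <= end:
        stamp = d.strftime("%Y%m%d")
        if stamp not in existing:
            days.append(stamp)
        d += timedelta(days=1)
    return days

# SavedDays already present in the lookup CSV
existing = {"20201019", "20201022"}
print(missing_days(existing, date(2020, 10, 19), date(2020, 10, 22)))
# -> ['20201020', '20201021']
```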
Is there a way to do something like a loop (or sequential execution) over "Date_missed >= Date_of_sourcetype", either with a subsearch or by executing Python?
Could I execute searches or reports, passing them parameter values?
Thanks!!
Javier
I've found the solution using the "map" command, which works as a loop iteration.
| makeresults count=5000
| streamstats count as rownum
| eval SavedDay=strftime(now(),"%Y%m%d")
| eval InitialDay = "20200101"
| eval DiffDay = "-".rownum."d@d"
| eval SavedDay=strftime(relative_time(now(),DiffDay),"%Y%m%d")
| where SavedDay >= InitialDay
| join type=left SavedDay[| inputlookup "my_lookup.csv"
| dedup SavedDay
| eval ExistsDay = 1]
| where isnull(ExistsDay )
| where SavedDay >= "20190101"
and SavedDay < strftime(now(),"%Y%m%d")
| table SavedDay
| map search="search index=ans sourcetype=\"my_source\" earliest=1
| eval year = strftime(strptime(CreationDate,\"%Y-%m-%d %H:%M:%S\"),\"%Y\") | where year>=2020
| eval Day=$SavedDay$
| where Day>= strftime(strptime(UpdateDate,\"%Y-%m-%d %H:%M:%S\"),\"%Y%m%d\")
| stats max(UpdateDate) as UpdateDatef by Ticket Day
| join type=left Ticket Day [| inputlookup \"my_lookup.csv\"
| eval IsIn = 1]
| where isnull(IsIn)
...
| table Ticket Day CreationDate UpdateDate"
| table Ticket Day CreationDate UpdateDate
It's a little annoying because of the double quotes and escaping, but it works!
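If the escaping inside map ever becomes too painful, the same loop can also be driven from outside Splunk: generate one fully expanded SPL string per missing day and submit each one via the REST API or an SDK. A sketch of the string-building part (the template is a simplified, hypothetical version of the map body; the actual submission step is left out):

```python
# Hypothetical, simplified version of the per-day search from the map body.
TEMPLATE = (
    'search index=ans sourcetype="my_source" earliest=1 '
    '| eval Day="{day}" '
    '| where Day >= strftime(strptime(UpdateDate, "%Y-%m-%d %H:%M:%S"), "%Y%m%d") '
    '| stats max(UpdateDate) as UpdateDatef by Ticket Day'
)

def backfill_queries(missing_days):
    # One expanded SPL string per missing SavedDay -- no \" escaping needed,
    # because the substitution happens before the query reaches Splunk.
    return [TEMPLATE.format(day=d) for d in missing_days]

for q in backfill_queries(["20201020", "20201021"]):
    print(q)
```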
Javier