We are trying to identify how much of our data is impacted by the latest timestamp bug. Is there a query that uses regex to search _raw for events that have two-digit years? This would greatly help us analyze the risk of doing a last-minute production upgrade...
Hi jordanking1992,
I posted this in the Slack channel #splunky2k:
index=* TERM(19)
| regex _raw="[\\\/\|-](19)"
| rex "(?<myField>[^\s]+19)"
| search myField!="*2019*"
| stats count by index sourcetype
It later got this little enhancement:
index=* TERM(19)
| eval sample=substr(_raw,0,128), search="index=".index." sourcetype=".sourcetype." TERM(19)"
| regex sample="(((?:^|\D)\d{1,2}[-\/]\d{1,2}[-\/]19[^\d])|((?:^|\D)19[-\/]\d{1,2}[-\/]\d{1,2}[^\d])|((?:^|\D)\d{1,2}\s[-\/]\s\d{1,2}\s[-\/]\s19[^\d])|((?:^|\D)19\s[-\/]\s\d{1,2}\s[-\/]\s\d{1,2}[^\d])|((?:^|\D)([a-zA-Z]{3}[- \/]+\d{1,2}[- \/]+19[^:\d]))|((?:^|\D)19[- \/][a-zA-Z]{3}[- \/]\d{1,2}[^:\d])|((?:^|\D)\d{1,2}[- \/]+[a-zA-Z]{3}[- \/]+19[^:\d]))"
| stats count last(sample) as sample by search
Please be aware that this is a very hungry, resource-intensive search!
Hope this helps ...
cheers, MuS
UPDATE: modifications to the regex, and the substr() uses the first 128 characters of the event.
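If you want to see what the enhanced regex does and does not catch before running it across all indexes, you can try it offline. Below is a minimal Python sketch, not part of the original search: the seven alternatives are copied from the regex above, while the function name and the sample strings are made up for illustration. It assumes Python's re module treats these patterns the same way Splunk's PCRE engine does, which holds for the constructs used here.

```python
import re

# The seven alternatives from the enhanced search, one per line.
# The trailing comments show the kind of date each one targets.
ALTERNATIVES = [
    r"(?:^|\D)\d{1,2}[-\/]\d{1,2}[-\/]19[^\d]",             # 01/02/19
    r"(?:^|\D)19[-\/]\d{1,2}[-\/]\d{1,2}[^\d]",             # 19-01-02
    r"(?:^|\D)\d{1,2}\s[-\/]\s\d{1,2}\s[-\/]\s19[^\d]",     # 01 / 02 / 19
    r"(?:^|\D)19\s[-\/]\s\d{1,2}\s[-\/]\s\d{1,2}[^\d]",     # 19 - 01 - 02
    r"(?:^|\D)([a-zA-Z]{3}[- \/]+\d{1,2}[- \/]+19[^:\d])",  # Jan 02 19
    r"(?:^|\D)19[- \/][a-zA-Z]{3}[- \/]\d{1,2}[^:\d]",      # 19-Jan-02
    r"(?:^|\D)\d{1,2}[- \/]+[a-zA-Z]{3}[- \/]+19[^:\d]",    # 02-Jan-19
]

# Wrap each alternative in its own group and join with "|", as in the search.
TWO_DIGIT_YEAR = re.compile("|".join(f"({alt})" for alt in ALTERNATIVES))


def has_two_digit_year(sample: str) -> bool:
    """Return True if the sample contains a date written with the year as '19'."""
    return TWO_DIGIT_YEAR.search(sample) is not None


# Hypothetical event prefixes, not taken from real data.
SAMPLES = [
    "01/02/19 12:00:00 action=login",
    "02-Jan-19 08:15:00 action=login",
    "19-01-02 08:15:00 action=login",
    "01/02/2019 12:00:00 action=login",
    "2019-01-02T08:15:00 action=login",
    "Jan 02 19:15:00 action=login",
    "01/02/19",
]

for sample in SAMPLES:
    print(has_two_digit_year(sample), repr(sample))
```

With these samples the first three match and the rest do not: four-digit years are skipped, and so is an hour of 19 in "Jan 02 19:15:00" because of the [^:\d] at the end. Note the last sample: every alternative ends with a character class that has to consume one character, so a string that ends right after the "19" is not matched.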
You can use this search to find potentially problematic events. DISCLAIMER: This is NOT a guarantee, because we have no way to tell with SPL whether the indexers are using datetime.xml or proper Magic 6 settings. It will show you events that, IF the indexers are using datetime.xml, will be broken without the fix.
index="*" AND sourcetype="*" AND timestartpos="*" earliest=-7d latest=now
| dedup punct sourcetype index
| eval timestr=substr(_raw, timestartpos+1, timeendpos-timestartpos)
| regex timestr="(((?:^|\D)\d{1,2}[-\/]\d{1,2}[-\/]19[^\d])|((?:^|\D)19[-\/]\d{1,2}[-\/]\d{1,2}[^\d])|((?:^|\D)\d{1,2}\s[-\/]\s\d{1,2}\s[-\/]\s19[^\d])|((?:^|\D)19\s[-\/]\s\d{1,2}\s[-\/]\s\d{1,2}[^\d])|((?:^|\D)([a-zA-Z]{3}[- \/]+\d{1,2}[- \/]+19[^:\d]))|((?:^|\D)19[- \/][a-zA-Z]{3}[- \/]\d{1,2}[^:\d])|((?:^|\D)\d{1,2}[- \/]+[a-zA-Z]{3}[- \/]+19[^:\d]))"
| table punct sourcetype index timestr time*pos _time _raw time*
| stats list(*) AS * BY index sourcetype
If this search returns nothing, then you have nothing to fix. Do note that this search will return the same results BEFORE and AFTER you deploy the fix. It only shows your potential risk, not your actual risk.
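The eval in that search cuts the recognized timestamp out of _raw before the regex runs. Here is a small Python sketch of the arithmetic. It assumes timestartpos is a zero-based offset and timeendpos is the offset just past the timestamp, which is what the +1 and the length calculation in the search imply. The helper name, the event text, and the offsets are made up for illustration.

```python
def spl_substr(text: str, start: int, length: int) -> str:
    """Mimic SPL substr(): a 1-based start index followed by a length."""
    return text[start - 1 : start - 1 + length]


# Hypothetical event and offsets: the timestamp sits right after "host1 ".
raw = "host1 01/02/19 12:00:00 action=login"
timestartpos, timeendpos = 6, 23

# Same expression as the eval in the search above.
timestr = spl_substr(raw, timestartpos + 1, timeendpos - timestartpos)
print(repr(timestr))
```

Under those assumptions the result is the same as the plain Python slice raw[timestartpos:timeendpos], so the regex only sees the text Splunk actually parsed as the timestamp and not the rest of the event.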
Thank you so much. No more headaches!