I indexed a huge log with data going back to 2006. However, when I search for this data, it doesn't show up.
I looked in the Splunk logs and found this warning:
01-18-2012 18:40:12.234 -0500 WARN DateParserVerbose - A possible timestamp match (2006-02-13 23:35:03+00) is outside of the acceptable time window. If this timestamp is correct, consider adjusting MAX_DAYS_AGO and MAX_DAYS_HENCE. Context="source::/tmp/log/sql_data|host::lalala |OXRSTEST|"
I researched these settings here:
http://docs.splunk.com/Documentation/Splunk/latest/Data/Configuretimestamprecognition
I also searched for just a date going back to 2006, and I did get results. It appears Splunk is taking events with dates in the past, grouping them together under 1 event, and giving it a recent date of a few days ago. Here is an example:
1/18/12
6:27:30.000 PM
2006-02-13 23:00:02+00 | www.somesite.net | 0 | 23 | 21 | 25 | 22 | 22 | 21 | 21 | 25 | 0
2006-02-13 23:05:03+00 | www.somesite.net | 0 | 22 | 23 | 22 | 24 | 22 | 21 | 21 | 21 | 0
2006-02-13 23:10:02+00 | www.somesite.net | 0 | 22 | 23 | 22 | 22 | 22 | 21 | 21 | 22 | 0
In this example it counted the "event" as 1 event with a date of 1/18/12 and in the details of the event there are timestamps from 2006. Each timestamp should be a separate event.
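To make the expected result concrete, here is a rough Python sketch (plain Python, not Splunk; the function name `split_events` is made up for illustration). It treats each line of the sample above as its own event and reads the timestamp from the start of the line, assuming the trailing `+00` is a UTC offset.

```python
from datetime import datetime

SAMPLE = """\
2006-02-13 23:00:02+00 | www.somesite.net | 0 | 23 | 21 | 25 | 22 | 22 | 21 | 21 | 25 | 0
2006-02-13 23:05:03+00 | www.somesite.net | 0 | 22 | 23 | 22 | 24 | 22 | 21 | 21 | 21 | 0
2006-02-13 23:10:02+00 | www.somesite.net | 0 | 22 | 23 | 22 | 22 | 22 | 21 | 21 | 22 | 0
"""

def split_events(text):
    """Return one (timestamp, raw_line) pair per non-empty line."""
    events = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # The timestamp is everything before the first pipe.
        raw_ts = line.split("|", 1)[0].strip()
        # Assumption: "+00" is an hour-only UTC offset; pad it to "+0000" for %z.
        ts = datetime.strptime(raw_ts + "00", "%Y-%m-%d %H:%M:%S%z")
        events.append((ts, line))
    return events

events = split_events(SAMPLE)
for ts, line in events:
    print(ts.isoformat(), "->", line[:40])
```

The desired outcome in Splunk is the same shape: three events, each carrying its own 2006 timestamp rather than the indexing date.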
If I search for recent data of this nature, each timestamp and numbers following it are listed as a separate event.
So I'm wondering: is it a line-breaking issue, a timestamp issue, or both combined?
If I change the MAX_DAYS_AGO and MAX_DAYS_HENCE values in props.conf, is this going to affect other data that is showing up fine at the moment?
Has anyone changed these values in their props.conf and if so, what did you change them to and did you have any problems after changing them? I also am looking at other questions regarding this issue and none really give an idea of what they changed it to. Any ideas?
[OXRSTEST3]
MAX_DAYS_AGO = 2500
SHOULD_LINEMERGE = FALSE
MUST_BREAK_AFTER = "|[^|]$"
BREAK_ONLY_BEFORE = \s^\d+4-\d+-\d+\s+\d+:
MAX_DAYS_PREVIOUS was the reason it didn't work: it isn't a setting listed on the props.conf documentation page. Changing it to MAX_DAYS_AGO resolved the issue.
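For anyone wondering why 2500 is enough, here is a small Python sketch of the arithmetic (not Splunk code). The dates come from the warning in the question; the default of 2000 days is my recollection of the props.conf spec, so treat it as an assumption and check it for your version. `within_window` is a hypothetical helper, not a Splunk function.

```python
from datetime import date

# Assumed default for MAX_DAYS_AGO; confirm against props.conf.spec for your version.
DEFAULT_MAX_DAYS_AGO = 2000

def within_window(event_day, index_day, max_days_ago):
    """True if the event is no older than max_days_ago days at indexing time."""
    return (index_day - event_day).days <= max_days_ago

event_day = date(2006, 2, 13)   # timestamp from the warning
index_day = date(2012, 1, 18)   # day the warning was logged

age_in_days = (index_day - event_day).days
print(age_in_days)                                                   # 2165
print(within_window(event_day, index_day, DEFAULT_MAX_DAYS_AGO))     # False
print(within_window(event_day, index_day, 2500))                     # True
```

The 2006 events are 2165 days old at indexing time, so they fall outside a 2000-day window but inside a 2500-day one. Older data would need a correspondingly larger value.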
Changing MAX_DAYS_PREVIOUS and MAX_DAYS_HENCE is OK for one particular sourcetype, source, or host at a time. As long as all the timestamps for the source in question are simple to detect such as this one, you should have no issues.
Do keep in mind that this setting is effective for new data only. It won't fix your old data unless you reindex it.
As for the linebreaking, I can't tell if there is a space at the beginning of those lines or not. I will assume there is not.
[your_sourcetype]
MAX_DAYS_PREVIOUS = 2500
SHOULD_LINEMERGE = True
BREAK_ONLY_BEFORE = ^\d+4-\d+\-\d+\s+\d+\:
You can and should always set up a test instance and try out these changes before you go crazy on your production servers.
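One cheap test you can run before touching any Splunk instance is to try the line-breaking pattern against a sample line. The sketch below uses Python's `re` module, which is not identical to the regex engine Splunk uses, but these particular constructs should behave the same. Note that `\d+4-` requires a literal 4 right before the first hyphen, so the pattern as posted does not match a year like 2006. The two alternative patterns are only suggestions and have not been tested in Splunk.

```python
import re

line = "2006-02-13 23:00:02+00 | www.somesite.net | 0 | 23 | 21 | 25 | 22 | 22 | 21 | 21 | 25 | 0"

# Pattern as posted above: "\d+4" means "digits followed by a literal 4".
posted = re.compile(r"^\d+4-\d+\-\d+\s+\d+\:")

# Suggested alternative (assumption): exactly four digits for the year.
four_digit_year = re.compile(r"^\d{4}-\d+-\d+\s+\d+:")

# Suggested alternative (assumption): also tolerate leading whitespace.
leading_ws_ok = re.compile(r"^\s*\d{4}-\d+-\d+\s+\d+:")

print(bool(posted.search(line)))                   # False: 2006 does not end in 4
print(bool(posted.search("2004-02-13 23:00:02")))  # True: only years ending in 4 match
print(bool(four_digit_year.search(line)))          # True
print(bool(four_digit_year.search(" " + line)))    # False: leading space breaks it
print(bool(leading_ws_ok.search(" " + line)))      # True
```

If the lines in your file do start with a space, the last variant is the one to try.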
I don't see a MAX_DAYS_PREVIOUS on the props.conf config page... this could be the problem.
Going to try MAX_DAYS_AGO instead.
All of the suggestions you gave are in props.conf:
[OXRSTEST3]
MAX_DAYS_PREVIOUS = 2500
SHOULD_LINEMERGE = FALSE
MUST_BREAK_AFTER = "|[^|]$"
BREAK_ONLY_BEFORE = \s^\d+4-\d+-\d+\s+\d+:
Well, I made a separate index for my data. I took out the data from 2006 and put it in a separate log. When this data was reindexed, it was no longer clumped together: each line was a separate event. However, each event still had a timestamp of today, when the data was indexed, and not the date listed for the event in the file itself.
Oh, that's right! Well, my solution is to make a new index for this data, alter the sourcetype slightly, and then only search on this data. Also, if I have to delete it, I can specify the index with only this data in it. I'll keep you posted.
'| delete' doesn't really delete - rather it masks the events from search. If you want to reindex, delete won't help, as Splunk still keeps a history that it has seen your file(s) before. Let us know how your next attempt goes.
I'm going to try this again...
However, after giving it some time, it seems worse now. The first result is still 25 lines of data all listed as 1 event and all from 2006. But now I also have a bunch of other lines all listed with the same date of 2/11/11. Perhaps I need to do something else or make it a different sourcetype entirely at this point.
Well first I tried deleting the info by doing sourcetype=OXRSTEST | delete. After this, I added the info to the props.conf and changed the path to the log in question. I modified my inputs.conf as well to reflect this new path to the log that I moved.
After this I ran a oneshot command to add the log, which was recommended in another thread:
splunk add oneshot -source /opt/splunk/log/sql_data* -sourcetype OXRSTEST
Ah, this was the answer I was looking for. Yes, I have a test box that I'm going to give this a try on. Also, I might check if you are right about there being a space at the beginning of the lines. Perhaps that's the difference. I'll test this and see what happens.