I've seen several XML break examples, but none yet which quite matched this issue.
I have an app that writes one file per log in XML format. Each file begins with the XML header (<?xml) and ends with </Job>, and these XML logs can be very large.
My props.conf looks like this:
[DumbApp:ActLogs]
KV_MODE = xml
BREAK_ONLY_BEFORE = (<?xml)
category = Application
description = Job Logs from DumbApp
disabled = false
MAX_EVENTS = 100000
pulldown_type = true
SHOULD_LINEMERGE = false
But for every log it breaks after <PercentComplete>, and each XML stanza in that log after that is treated as a different event. I basically want it not to break at all, and to treat each log file as one solid event. I've tried every combination of line-merge settings, and setting BREAK_ONLY_BEFORE to something that would encompass the whole file, beginning to end, but nothing has worked.
I'm running Splunk 6.2.3.
Okay, so you've got a config mismatch.
BREAK_ONLY_BEFORE only applies when SHOULD_LINEMERGE = true, which yours isn't. When you have that set to false, you want to use LINE_BREAKER to tell Splunk how to break events. Also, both BREAK_ONLY_BEFORE and LINE_BREAKER take regular expressions, so what you have would not match what you intend. I would start with something like this:
[DumbApp:ActLogs]
KV_MODE = xml
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)(?:<\?xml)
MAX_EVENTS = 100000
If these XML segments are one per file, then another option is to set your LINE_BREAKER to something that will never appear.
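To see why the capture group matters, here's a small Python sketch. This is not Splunk code, just an approximation of how LINE_BREAKER splits a raw stream into events; it uses a lookahead instead of the non-capturing group so the <?xml header stays with the next event, and the sample data is made up:

```python
import re

# Approximation of Splunk's LINE_BREAKER: the first capture group marks
# the event boundary and is discarded; text between boundaries becomes
# individual events. Lookahead keeps "<?xml" attached to the next event.
LINE_BREAKER = r"([\r\n]+)(?=<\?xml)"

raw = (
    '<?xml version="1.0"?>\n<Job>\n  <Tag>J1</Tag>\n</Job>\n'
    '<?xml version="1.0"?>\n<Job>\n  <Tag>J2</Tag>\n</Job>'
)

def split_events(stream, breaker):
    parts = re.split(breaker, stream)
    # re.split keeps the capture-group matches; drop the boundary newlines
    return [p for p in parts if p and not re.fullmatch(r"[\r\n]+", p)]

events = split_events(raw, LINE_BREAKER)
for e in events:
    print(repr(e.splitlines()[0]))
```

With the two <?xml headers in the sample, this yields two events, each a complete <Job> document.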
You will also want to tell Splunk where to find the timestamp in your event. I'm not sure which of the different time values you want to use, so I'll just say you want to set the following:
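As a hedged sketch, assuming you wanted to key off the <StartDate> tag (swap in whichever tag and time format your data actually uses, and tune the lookahead to your events):

```
TIME_PREFIX = <StartDate>
TIME_FORMAT = %Y-%m-%d %H:%M
MAX_TIMESTAMP_LOOKAHEAD = 20
```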
Finally, depending on how these logs are written, you may run into other issues. If the files are written once and never touched again, one option is a [batch://...] input. This will ingest the file and delete it when done. Keep in mind it is destructive, so it may not fit your use case.
😞 No such luck. This is a very odd issue, especially since using the 6.2 add-data tool, it breaks the event perfectly, even when the event I used to build the sourcetype interactively is the very same event that got indexed!
Couple more questions then:
If you're using | delete in a search, then you're not actually deleting anything, just making it invisible. Actually cleaning an index happens on the command line while Splunk is stopped ($SPLUNK_HOME/bin/splunk clean eventdata -index $INDEX).
| extract reload=t is for search-time extractions and does not apply to index-time configurations. Did you restart Splunk after updating your props.conf?
In short, to test this you need to do the following:
On your indexer:
- Update your props.conf
- Stop Splunk
- Run splunk clean eventdata from the command line. (This drops EVERYTHING in the index, mind)
- Start Splunk
On your forwarder:
- Stop Splunk
- Clean the fishbucket
- Start Splunk
I really appreciate all of your help!
Our environment is rather large. We have eight different indexers across two physical locations. I am setting the configuration entries on the search head but verifying that the bundle replicated to the indexer and that the stanza reflects what is on the search head.
I was using the delete command, so based on that, perhaps I was just unhiding the events when they were reindexed. I will try again soon, issuing the clean command on the index itself, and see if I get different results. Thanks again for everything, and I will let you know.
Not sure what you're referring to by saying the "bundle replicated" as that can refer to a couple of things, not all of which affect indexing.
On your indexers, run the following:
$SPLUNK_HOME/bin/splunk btool props list $STANZA_NAME
Check the output and make sure it reflects the current state of your props.conf file. I suspect it doesn't.
The delete command simply makes items invisible to search. (Doesn't free up disk space or anything) However, it doesn't prevent the data from being re-indexed. So if you clean your fishbucket on your forwarder, that'll re-ingest the data. My hunch is that your props changes are happening on the search head, which isn't involved in the indexing process at all. When you run searches, a 'search bundle' is pushed to the indexers, but that is just for searching, it isn't involved in the indexing pipeline.
What I would really recommend is to stop testing this live. You're eating up license and disk space to do so. Spin up a local install of splunk, put your props there, restart, and one-shot the file. If it doesn't work, clean the index and try again. Once you have a working stanza, then you put it on your real indexers and start ingesting data.
So I tried both approaches you suggested:
the first with LINE_BREAKER set to a string that would never, ever show up in this data, and the second with the regex you described above. I rinsed and repeated the same effort:
1) purged index
2) cleaned forwarder event data/fishbucket
3) ran | extract reload=t and hit /debug/refresh
4) waited for data to be resent
The events still come back broken at line 18, for every event, going back to logs with mod times over a year old that definitely aren't being written to any longer.
This almost feels like a bug, because I am definitely breaking other XML with more complexity and haven't experienced this issue.
Thanks for this. Trying it now.
with regard to timestamping, I am going on modtime because the directory I'm pulling from is an archival directory, so the logs will have definitely been completed and moved by the application in order for me to read them. I opted to do this because I was worried about the exact issue you were talking about. 🙂
Try this for your sourcetype definition in props.conf
[DumbApp:ActLogs]
KV_MODE = xml
category = Application
description = Job Logs from DumbApp
disabled = false
MAX_EVENTS = 100000
pulldown_type = true
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE = ^\<\?xml
MUST_BREAK_AFTER = \<\/Job\>
NO_BINARY_CHECK = true
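As a quick sanity check on the two break patterns in that stanza, here's a Python sketch (the sample lines are abbreviated stand-ins for your log, not the real file) verifying that BREAK_ONLY_BEFORE only matches the XML declaration and MUST_BREAK_AFTER only matches the closing </Job> tag:

```python
import re

# Patterns copied from the props.conf stanza above
BREAK_ONLY_BEFORE = r"^\<\?xml"
MUST_BREAK_AFTER = r"\<\/Job\>"

lines = [
    '<?xml version="1.0" encoding="UTF-8"?>',
    "<Job>",
    "  <PercentDone>0.0</PercentDone>",
    "</Job>",
]

# Which lines can start a new event, and which force a break after them
starts = [bool(re.search(BREAK_ONLY_BEFORE, line)) for line in lines]
ends = [bool(re.search(MUST_BREAK_AFTER, line)) for line in lines]
print(starts)
print(ends)
```

Only the declaration line should be a valid event start, and only </Job> should force a break, so each whole file becomes one event.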
This is an index-time configuration (event breaking). It will be reflected in newly ingested data, not in the data that is already indexed. In fact, you can't update already-indexed data at all.
Sorry, I forgot a step. 🙂 In my situation, I was able to add the can_delete role temporarily to my ID, so I purged the index, cleaned the event data/fishbucket, and reindexed it all.
<Job>
  <Tag>J8675309</Tag>
  <AppJobID>269</AppJobID>
  <Name>MEH</Name>
  <ApplicationName>UNKNOWN</ApplicationName>
  <ApplicationVersion>0.0</ApplicationVersion>
  <Site>OUR_SITE</Site>
  <Project>
    <P-Survey>FEUD0101</P-Survey>
    <P-Stage>PROJECT_T</P-Stage>
    <P-Name>PROJECT001</P-Name>
  </Project>
  <UserID>USERID</UserID>
  <Master>MASTERNODE</Master>
  <PrintoutFilePath>/PATH/</PrintoutFilePath>
  <Status>Abend</Status>
  <PercentDone>0.0</PercentDone>
  <CollectionDate>2015-08-21 14:59</CollectionDate>
  <StartDate>2015-08-21 14:57</StartDate>
  <EndDate>2015-08-21 14:59</EndDate>
  <ArchiveDate>2015-08-24 23:55</ArchiveDate>
  <AbortDate>2015-08-21 14:59</AbortDate>
  <SystemMessage>Job Failed - JOB_FAILED file found</SystemMessage>
  <UserComments/>
  <ProjectPriority>8</ProjectPriority>
  <SequenceGroupName>NOT_USED</SequenceGroupName>
  <SequenceGroupNumber>0</SequenceGroupNumber>
  <JobPriority/>
  <FailedNodeCount>0</FailedNodeCount>
  <BaseLine>2014.1ext</BaseLine>
  <Installation>2014.1ext</Installation>
  <Parameters>
    <JobServer>COMPUTED</JobServer>
    <TargetDirectory>/PATH/</TargetDirectory>
    <PoolOverQuota>NO</PoolOverQuota>
    <ResourceLoad>light</ResourceLoad>
  </Parameters>
</Job>