How to onboard a large XML file without breaking it up into multiple events?
I have been asked to onboard large XML files. Each file contains about 105k lines. There is one date in the file. The file MUST NOT be broken into multiple events. I am having trouble getting the props settings right so that the file is indexed as a single event. I tried setting MAX_EVENTS to 150000, but I do not think it is working properly. I also tried TRUNCATE=150000, but that did not work either.
BTW, these files only come in once a day, so it is not as if they arrive every minute or second.

If you want all the data in the file to be a single event, you can probably do that better with LINE_BREAKER.
Try this in props.conf
[yoursourcetype]
SHOULD_LINEMERGE=false
LINE_BREAKER=((thismustneverappearinyourfile))
TRUNCATE=0
I don't think that you need to set MAX_EVENTS at all when using this method. But feel free to add in MAX_EVENTS as well...
This technique works by using LINE_BREAKER to define the split between events - and then assigning an "impossible" character string as the line-breaking condition.
@mcbradford - I like to upload the file manually and play with the config parameters interactively. It saves lots of time ;-)
Due to the sensitivity of the data, I cannot share it, and due to its size, sanitizing it would be a nightmare. I decided to open a case with Splunk, since MAX_EVENTS does not appear to be working as documented.

I think the suggestion was not to upload the file here, but to upload it manually in your own Splunk environment (probably a test box): Settings -> Add Data -> Upload. From there you can adjust the props config interactively and see how Splunk reacts. You may already be doing that, but if not, it beats waiting a day to see how your latest change behaves.
Also, don't forget to post the answer out here if Splunk Support solves the problem.
Additional information
If I set:
MAX_EVENTS=25000
I get 5 events: the first 4 have 25k lines each, and the last has 5k lines.
If I set:
MAX_EVENTS=100000
I get no events at all?

What is the entire config stanza?
[baz_voice]
BREAK_ONLY_BEFORE =
DATETIME_CONFIG =
MAX_EVENTS = 100000
MAX_TIMESTAMP_LOOKAHEAD = 300
NO_BINARY_CHECK = true
TRUNCATE = 999999
category = Custom
disabled = false
pulldown_type = true
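
For comparison, here is a rough sketch of how the stanza above might look with the LINE_BREAKER suggestion from earlier in the thread applied. It is untested against the real data. The breaker string is just the placeholder from that reply and is assumed never to occur in the file. The empty BREAK_ONLY_BEFORE and DATETIME_CONFIG lines and the MAX_EVENTS line are left out, since the line-merging settings should not come into play once SHOULD_LINEMERGE is false.

```ini
# props.conf: hypothetical revision of [baz_voice], not verified on real files
[baz_voice]
# Turn off line merging; MAX_EVENTS and BREAK_ONLY_BEFORE belong to the merging step
SHOULD_LINEMERGE = false
# Break events only on a string that is assumed never to appear in the XML
LINE_BREAKER = ((thismustneverappearinyourfile))
# 0 disables truncation, so the whole file can stay in one event
TRUNCATE = 0
MAX_TIMESTAMP_LOOKAHEAD = 300
NO_BINARY_CHECK = true
category = Custom
disabled = false
pulldown_type = true
```

Loading one of the files through Settings -> Add Data -> Upload on a test box with this sourcetype should show right away whether it comes through as a single event.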
