Getting Data In

Can anyone give me a hint about why my xml file might not be indexing properly?

sdorich
Communicator

I have an XML file that I've tried to index, but I've had a very difficult time with it. I just want a new event made after every </event_change> tag. In props.conf, I have the following:

[bsm_event_changes]
#KV_MODE=xml
TIME_PREFIX = <time_created>
MAX_TIMESTAMP_LOOKAHEAD = 1000
TRUNCATE = 0
MAX_EVENTS = 40
#BREAK_ONLY_BEFORE = (<event_change type)=([a-zA-Z0-9"-://#=_. ]*)> 
MUST_BREAK_AFTER=</event_change>
BREAK_ONLY_BEFORE_DATE = False
SHOULD_LINEMERGE = True

Note that I have tried setting KV_MODE=xml (but that made everything worse for some reason). I've also tried both my BREAK_ONLY_BEFORE and MUST_BREAK_AFTER lines, but neither has done the trick for me. Anyway, with this in props.conf (and yes, I've restarted Splunk after editing props.conf), I'm still getting the following in Splunk (basically one big long event that contains many actual events):

xml version="1.0" encoding="UTF-8" standalone="yes"?>fe4f346a-5ae5-4592-9de7-42c46d92e7f36b6f56c0-6410-71e3-0d32-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T04:00:36.190-07:002014-02-14T04:00:36.190-07:00time_received2014-02-14T04:00:17.547-07:002014-02-14T04:00:36.147-07:00duplicate_count295196295197system74012441-1c6e-4ead-8a77-44b33ba191896b899580-6410-71e3-0d32-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T04:00:36.003-07:002014-02-14T04:00:36.003-07:00time_received2014-02-14T04:00:17.153-07:002014-02-14T04:00:35.967-07:00duplicate_count295196295197system8c1e94e0-af74-4978-ab11-ff4231fef9a56b6f56c0-6410-71e3-0d32-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T04:00:17.583-07:002014-02-14T04:00:17.583-07:00time_received2014-02-14T03:59:58.163-07:002014-02-14T04:00:17.547-07:00duplicate_count295195295196system867acb4f-db2f-4e32-8887-ed8d2c0c7adc6b899580-6410-71e3-0d32-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T04:00:17.190-07:002014-02-14T04:00:17.190-07:00time_received2014-02-14T03:59:58.317-07:002014-02-14T04:00:17.153-07:00duplicate_count295195295196systeme13d2f33-3927-4c4f-be09-3227c21d88ed23330961-5eac-71e3-0dc5-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T04:00:17.067-07:002014-02-14T04:00:17.067-07:00duplicate_count28672868time_received2014-02-14T03:25:18.650-07:002014-02-14T04:00:17.013-07:00systemb20f2dd6-5987-4200-9b73-205f003cc4f414dd5590-950d-71e3-073d-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T04:00:17.020-07:002014-02-14T04:00:17.020-07:00duplicate_count128129time_received2014-02-14T03:55:20.883-07:002014-02-14T04:00:16.980-07:00system8
5e52a15-6740-4192-b683-4a7c68d685d51e9d7bb1-5eac-71e3-0dc5-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T04:00:09.597-07:002014-02-14T04:00:09.597-07:00time_received2014-02-14T03:25:11.467-07:002014-02-14T04:00:09.500-07:00duplicate_count28662867systemdce46f11-dbf7-45f0-ad69-3bacf8258a871072a870-950d-71e3-073d-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T04:00:09.503-07:002014-02-14T04:00:09.503-07:00duplicate_count128129time_received2014-02-14T03:55:11.673-07:002014-02-14T04:00:09.473-07:00systeme8c9a5f9-1d92-4ba3-b8ea-0de1aace955e23fefb20-9542-71e3-1d1f-0a6401300000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T04:00:05.363-07:002014-02-14T04:00:05.363-07:00annotationcede270e-33ab-41da-ad4c-e9f9dbbd0cedAcknowledged by message correlation. KEY: "ADSPI-ResponseTime_Bind_2K8+:slc-srv-bdc2008.glg.local:SLC-SRV-BDC2008:START:DCLDAPBindResponseTime:High". 
RELATION: "^ADSPI-ResponseTime_Bind_2K8+:slc-srv-bdc2008.glg.local:SLC-SRV-BDC2008:<*>".insertsystem6eb55346-9aa1-4e37-9457-f18d63c64cba23fefb20-9542-71e3-1d1f-0a6401300000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T04:00:05.317-07:002014-02-14T04:00:05.317-07:00om_userSYSTEMsysteme963b714-7c4a-437d-8fd7-1431fc2eccb623fefb20-9542-71e3-1d1f-0a6401300000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.closing.related.events2014-02-14T04:00:05.223-07:002014-02-14T04:00:05.223-07:00stateopenclosedsystemefe4cf2d-6ef0-43c5-a7d5-0b6dc10a8c34fed7ef60-59f9-71e3-1d74-0a6401380000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T04:00:00.883-07:002014-02-14T04:00:00.883-07:00duplicate_count18141815time_received2014-02-14T03:00:22.597-07:002014-02-14T04:00:00.840-07:00system61cf3922-823f-4b8d-985f-3661bc2a612431318b40-9511-71e3-1ec3-0a64016d0000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T04:00:00.523-07:002014-02-14T04:00:00.523-07:00time_received2014-02-14T03:30:02.940-07:002014-02-14T04:00:00.407-07:00duplicate_count2021systemb67f9326-2619-4951-868d-f454cca211e86b899580-6410-71e3-0d32-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T03:59:58.357-07:002014-02-14T03:59:58.357-07:00duplicate_count295194295195time_received2014-02-14T03:59:40.233-07:002014-02-14T03:59:58.317-07:00systembb3aab0c-fd58-48fa-8df4-80c4d88b02596b6f56c0-6410-71e3-0d32-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T03:59:58.200-07:002014-02-14T03:59:58.200-07:00time_received2014-02-14T03:59:40.427-07:002014-02-14T03:59:58.163-07:00duplicate_count295194295195system75981b0d-7b78-4bf4-b302-14337e8089366b6f56c0-6410-71e3-0d32-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhis
torylines.component.event.synchronization2014-02-14T03:59:40.467-07:002014-02-14T03:59:40.467-07:00duplicate_count295193295194time_received2014-02-14T03:59:21.780-07:002014-02-14T03:59:40.427-07:00system2e933f17-e627-4591-932f-e01e43c37c696b899580-6410-71e3-0d32-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T03:59:40.277-07:002014-02-14T03:59:40.277-07:00duplicate_count295193295194time_received2014-02-14T03:59:21.913-07:002014-02-14T03:59:40.233-07:00systemd0f5bed3-1958-4468-9163-18368bd2856e6b899580-6410-71e3-0d32-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T03:59:21.950-07:002014-02-14T03:59:21.950-07:00duplicate_count295192295193time_received2014-02-14T03:59:04.167-07:002014-02-14T03:59:21.913-07:00system0e650f27-58ba-46fc-ba5f-3139bf1fabba6b6f56c0-6410-71e3-0d32-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T03:59:21.810-07:002014-02-14T03:59:21.810-07:00duplicate_count295192295193time_received2014-02-14T03:59:04.013-07:002014-02-14T03:59:21.780-07:00system09339fae-97b6-4b09-bdff-2383ddf658af6b899580-6410-71e3-0d32-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T03:59:04.200-07:002014-02-14T03:59:04.200-07:00time_received2014-02-14T03:58:46.760-07:002014-02-14T03:59:04.167-07:00duplicate_count295191295192system

1 Solution

lguinn2
Legend

The following is better. Note that < is a special character in regular expressions, so that is one reason why it might not have worked.

[bsm_event_changes]
TIME_PREFIX = \<time_created\>
MAX_TIMESTAMP_LOOKAHEAD = 100
TRUNCATE = 0
MAX_EVENTS = 40
MUST_BREAK_AFTER=\</event_change\>
SHOULD_LINEMERGE = True
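One quick way to see how much the escaping changes is to compare the escaped and unescaped patterns against the same text. The sketch below does this with Python's `re` module, which is an assumption made for convenience: Splunk uses PCRE, not Python's engine, so treat this as an approximation. The sample string is hypothetical and only reuses the element names from the stanza above.

```python
import re

# Hypothetical one-line fragment built from the element names in the stanza above.
sample = (
    "<event_change><time_created>2014-02-14T04:00:36.190-07:00"
    "</time_created></event_change>"
)

# The same pattern, written without and with backslash escapes.
unescaped = re.search(r"</event_change>", sample)
escaped = re.search(r"\</event_change\>", sample)

# In Python's engine both forms match the same span of text.
same_match = unescaped.span() == escaped.span()
print(same_match)
```

If the two forms behave the same way in Splunk as well, then seeing no change after adding the escapes points toward the line-boundary issue rather than the pattern itself.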

However, you have another, more serious problem. When you use MUST_BREAK_AFTER or BREAK_ONLY_BEFORE, Splunk breaks on the line boundary, not in the middle of the line. It looks like your events should break in the middle of the line. So the following may work better for you (I haven't tried this, so I am not sure):

[bsm_event_changes]
TIME_PREFIX = \<time_created\>
MAX_TIMESTAMP_LOOKAHEAD = 100
TRUNCATE = 0
MAX_EVENTS = 40
LINE_BREAKER=\</event_change\>(.*?)\<event_change\>
SHOULD_LINEMERGE = False
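To make the difference between the two stanzas concrete, here is a rough Python simulation of the two stages: first the raw stream is cut into lines by LINE_BREAKER, then (only when SHOULD_LINEMERGE is true) whole lines are merged into events and MUST_BREAK_AFTER is tested per line. This is a simplified model, not Splunk's implementation; the function names and the sample input are hypothetical. The breaker pattern in the sketch stops at `<event_change` without the closing `>` on the assumption, suggested by the commented-out BREAK_ONLY_BEFORE line in the question, that the opening tag carries a type attribute.

```python
import re

def line_break(raw, line_breaker):
    """Stage 1: cut the raw stream wherever line_breaker matches.

    Text before the first capture group ends one line, text after it
    starts the next, and the captured text itself is discarded.
    """
    lines, start = [], 0
    for m in re.finditer(line_breaker, raw):
        lines.append(raw[start:m.start(1)])
        start = m.end(1)
    lines.append(raw[start:])
    return [ln for ln in lines if ln]

def merge_lines(lines, must_break_after):
    """Stage 2: merge whole lines, closing an event after any line that matches."""
    events, current = [], []
    for ln in lines:
        current.append(ln)
        if re.search(must_break_after, ln):
            events.append("\n".join(current))
            current = []
    if current:
        events.append("\n".join(current))
    return events

# Hypothetical input: two events on one physical line, as in the question.
raw = (
    '<event_change type="a">one</event_change>'
    '<event_change type="b">two</event_change>\n'
)

# Newline-based breaker plus MUST_BREAK_AFTER: the whole line is one unit.
merged = merge_lines(line_break(raw, r"([\r\n]+)"), r"</event_change>")

# A breaker that matches between the tags can cut in the middle of the line.
split = line_break(raw, r"</event_change>(\s*)<event_change")

print(len(merged), len(split))
```

In this model `merged` holds a single event, because the only line boundary is the trailing newline and MUST_BREAK_AFTER can only act at a boundary, while `split` holds two. Whether Splunk treats an empty capture group exactly this way is worth verifying against the props.conf documentation.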

Also, FWIW, MAX_TIMESTAMP_LOOKAHEAD counts from the TIME_PREFIX, not the beginning of the event, so I cut it back to a more reasonable size.
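As a rough illustration of that point, the Python sketch below returns the slice of an event that would be inspected for a timestamp: the characters that follow the end of the TIME_PREFIX match, up to the lookahead limit. It is a simplified model under stated assumptions, not Splunk's code; `timestamp_window` is a hypothetical helper and the event is made up, reusing one timestamp value from the sample output in the question.

```python
import re

def timestamp_window(event, time_prefix, lookahead):
    """Return up to `lookahead` characters following the prefix match.

    Returns None when the prefix does not occur in the event.
    """
    m = re.search(time_prefix, event)
    if m is None:
        return None
    return event[m.end():m.end() + lookahead]

# Hypothetical event; only the timestamp value comes from the question.
event = (
    '<event_change type="a"><id>fe4f346a</id>'
    "<time_created>2014-02-14T04:00:36.190-07:00</time_created>"
    "</event_change>"
)

# The window starts right after <time_created>, wherever that tag sits.
window = timestamp_window(event, r"<time_created>", 100)

# The timestamp itself is 29 characters long, so a much smaller limit still covers it.
exact = timestamp_window(event, r"<time_created>", 29)
print(exact)
```

Because the window is anchored at the prefix, the distance from the start of the event to the tag does not count against the limit, which is why a value of 100 is already generous here.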

Finally - you may already know this, but just a reminder: when you update props.conf, the new parsing rules will apply only to new data as it is received. Existing data will not be changed. So you might want to delete the old data or clean the index...

sdorich
Communicator

Actually, I think I figured out my error! thanks!

sdorich
Communicator

Thanks. I didn't know that < is a special character, so that helps. However, I'm still getting the same output. I have been cleaning out the index that is designated for these events every time I try something different in props.conf, so it's not that I'm just seeing "old" events. But can you please explain, maybe with examples, what the difference is between LINE_BREAKER and MUST_BREAK_AFTER? I'm just confused about why I couldn't use MUST_BREAK_AFTER=</event_change> to tell Splunk that I want a new event starting after the keyword "</event_change>".

bmacias84
Champion

try this regex ^\s+
