Getting Data In

Can anyone give me a hint about why my xml file might not be indexing properly?

sdorich
Communicator

I have an XML file that I've tried to index, but I've had a very difficult time with it. I just want a new event created after every </event_change> tag. In props.conf, I have the following:

[bsm_event_changes]
#KV_MODE=xml
TIME_PREFIX = <time_created>
MAX_TIMESTAMP_LOOKAHEAD = 1000
TRUNCATE = 0
MAX_EVENTS = 40
#BREAK_ONLY_BEFORE = (<event_change type)=([a-zA-Z0-9"-://#=_. ]*)> 
MUST_BREAK_AFTER=</event_change>
BREAK_ONLY_BEFORE_DATE = False
SHOULD_LINEMERGE = True

Note that I have tried setting KV_MODE=xml, but that made everything worse for some reason. I've also tried both my BREAK_ONLY_BEFORE and MUST_BREAK_AFTER lines, but neither has done the trick for me. With this in props.conf (and yes, I've restarted Splunk after editing props.conf), I'm still getting the following in Splunk (basically one long event that contains many actual events):

xml version="1.0" encoding="UTF-8" standalone="yes"?>fe4f346a-5ae5-4592-9de7-42c46d92e7f36b6f56c0-6410-71e3-0d32-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T04:00:36.190-07:002014-02-14T04:00:36.190-07:00time_received2014-02-14T04:00:17.547-07:002014-02-14T04:00:36.147-07:00duplicate_count295196295197system74012441-1c6e-4ead-8a77-44b33ba191896b899580-6410-71e3-0d32-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T04:00:36.003-07:002014-02-14T04:00:36.003-07:00time_received2014-02-14T04:00:17.153-07:002014-02-14T04:00:35.967-07:00duplicate_count295196295197system8c1e94e0-af74-4978-ab11-ff4231fef9a56b6f56c0-6410-71e3-0d32-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T04:00:17.583-07:002014-02-14T04:00:17.583-07:00time_received2014-02-14T03:59:58.163-07:002014-02-14T04:00:17.547-07:00duplicate_count295195295196system867acb4f-db2f-4e32-8887-ed8d2c0c7adc6b899580-6410-71e3-0d32-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T04:00:17.190-07:002014-02-14T04:00:17.190-07:00time_received2014-02-14T03:59:58.317-07:002014-02-14T04:00:17.153-07:00duplicate_count295195295196systeme13d2f33-3927-4c4f-be09-3227c21d88ed23330961-5eac-71e3-0dc5-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T04:00:17.067-07:002014-02-14T04:00:17.067-07:00duplicate_count28672868time_received2014-02-14T03:25:18.650-07:002014-02-14T04:00:17.013-07:00systemb20f2dd6-5987-4200-9b73-205f003cc4f414dd5590-950d-71e3-073d-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T04:00:17.020-07:002014-02-14T04:00:17.020-07:00duplicate_count128129time_received2014-02-14T03:55:20.883-07:002014-02-14T04:00:16.980-07:00system8
5e52a15-6740-4192-b683-4a7c68d685d51e9d7bb1-5eac-71e3-0dc5-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T04:00:09.597-07:002014-02-14T04:00:09.597-07:00time_received2014-02-14T03:25:11.467-07:002014-02-14T04:00:09.500-07:00duplicate_count28662867systemdce46f11-dbf7-45f0-ad69-3bacf8258a871072a870-950d-71e3-073d-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T04:00:09.503-07:002014-02-14T04:00:09.503-07:00duplicate_count128129time_received2014-02-14T03:55:11.673-07:002014-02-14T04:00:09.473-07:00systeme8c9a5f9-1d92-4ba3-b8ea-0de1aace955e23fefb20-9542-71e3-1d1f-0a6401300000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T04:00:05.363-07:002014-02-14T04:00:05.363-07:00annotationcede270e-33ab-41da-ad4c-e9f9dbbd0cedAcknowledged by message correlation. KEY: "ADSPI-ResponseTime_Bind_2K8+:slc-srv-bdc2008.glg.local:SLC-SRV-BDC2008:START:DCLDAPBindResponseTime:High". 
RELATION: "^ADSPI-ResponseTime_Bind_2K8+:slc-srv-bdc2008.glg.local:SLC-SRV-BDC2008:<*>".insertsystem6eb55346-9aa1-4e37-9457-f18d63c64cba23fefb20-9542-71e3-1d1f-0a6401300000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T04:00:05.317-07:002014-02-14T04:00:05.317-07:00om_userSYSTEMsysteme963b714-7c4a-437d-8fd7-1431fc2eccb623fefb20-9542-71e3-1d1f-0a6401300000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.closing.related.events2014-02-14T04:00:05.223-07:002014-02-14T04:00:05.223-07:00stateopenclosedsystemefe4cf2d-6ef0-43c5-a7d5-0b6dc10a8c34fed7ef60-59f9-71e3-1d74-0a6401380000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T04:00:00.883-07:002014-02-14T04:00:00.883-07:00duplicate_count18141815time_received2014-02-14T03:00:22.597-07:002014-02-14T04:00:00.840-07:00system61cf3922-823f-4b8d-985f-3661bc2a612431318b40-9511-71e3-1ec3-0a64016d0000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T04:00:00.523-07:002014-02-14T04:00:00.523-07:00time_received2014-02-14T03:30:02.940-07:002014-02-14T04:00:00.407-07:00duplicate_count2021systemb67f9326-2619-4951-868d-f454cca211e86b899580-6410-71e3-0d32-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T03:59:58.357-07:002014-02-14T03:59:58.357-07:00duplicate_count295194295195time_received2014-02-14T03:59:40.233-07:002014-02-14T03:59:58.317-07:00systembb3aab0c-fd58-48fa-8df4-80c4d88b02596b6f56c0-6410-71e3-0d32-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T03:59:58.200-07:002014-02-14T03:59:58.200-07:00time_received2014-02-14T03:59:40.427-07:002014-02-14T03:59:58.163-07:00duplicate_count295194295195system75981b0d-7b78-4bf4-b302-14337e8089366b6f56c0-6410-71e3-0d32-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhis
torylines.component.event.synchronization2014-02-14T03:59:40.467-07:002014-02-14T03:59:40.467-07:00duplicate_count295193295194time_received2014-02-14T03:59:21.780-07:002014-02-14T03:59:40.427-07:00system2e933f17-e627-4591-932f-e01e43c37c696b899580-6410-71e3-0d32-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T03:59:40.277-07:002014-02-14T03:59:40.277-07:00duplicate_count295193295194time_received2014-02-14T03:59:21.913-07:002014-02-14T03:59:40.233-07:00systemd0f5bed3-1958-4468-9163-18368bd2856e6b899580-6410-71e3-0d32-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T03:59:21.950-07:002014-02-14T03:59:21.950-07:00duplicate_count295192295193time_received2014-02-14T03:59:04.167-07:002014-02-14T03:59:21.913-07:00system0e650f27-58ba-46fc-ba5f-3139bf1fabba6b6f56c0-6410-71e3-0d32-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T03:59:21.810-07:002014-02-14T03:59:21.810-07:00duplicate_count295192295193time_received2014-02-14T03:59:04.013-07:002014-02-14T03:59:21.780-07:00system09339fae-97b6-4b09-bdff-2383ddf658af6b899580-6410-71e3-0d32-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T03:59:04.200-07:002014-02-14T03:59:04.200-07:00time_received2014-02-14T03:58:46.760-07:002014-02-14T03:59:04.167-07:00duplicate_count295191295192system

1 Solution

lguinn2
Legend

The following is better. Note that < can be a special character in regular expressions, so I have escaped it; that is one reason why it might not have worked.

[bsm_event_changes]
TIME_PREFIX = \<time_created\>
MAX_TIMESTAMP_LOOKAHEAD = 100
TRUNCATE = 0
MAX_EVENTS = 40
MUST_BREAK_AFTER=\</event_change\>
SHOULD_LINEMERGE = True

However, you have another, more serious problem. When you use MUST_BREAK_AFTER or BREAK_ONLY_BEFORE, Splunk breaks on the line boundary, not in the middle of the line. It looks like your events should break in the middle of the line. So the following may work better for you (I haven't tried this, so I am not sure):

[bsm_event_changes]
TIME_PREFIX = \<time_created\>
MAX_TIMESTAMP_LOOKAHEAD = 100
TRUNCATE = 0
MAX_EVENTS = 40
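# The text captured by the group is discarded; the opening tag is left open-ended so attributes such as type=... still match
LINE_BREAKER = \</event_change\>(\s*)\<event_change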
SHOULD_LINEMERGE = False
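
To illustrate the difference between the two approaches, here is a minimal Python sketch of the two stages: the raw stream is first cut into "lines" by LINE_BREAKER (by default at newlines), and only afterwards are whole lines merged into events, which is where MUST_BREAK_AFTER is consulted. This is a simplified model and not Splunk's implementation; the function names, the sample string, and the pattern tailored to it are all assumptions made for illustration.

```python
import re

# Toy model of the two stages; real Splunk has many more settings and edge cases.
DEFAULT_LINE_BREAKER = r"([\r\n]+)"


def split_lines(raw, line_breaker=DEFAULT_LINE_BREAKER):
    """Stage 1: cut the raw stream wherever line_breaker matches.

    Text captured by group 1 is discarded. Text matched before the group
    ends the previous chunk; text matched after it starts the next chunk.
    """
    chunks, start = [], 0
    for m in re.finditer(line_breaker, raw):
        chunks.append(raw[start:m.start(1)])
        start = m.end(1)
    chunks.append(raw[start:])
    return [c for c in chunks if c]


def merge_lines(lines, must_break_after):
    """Stage 2 (SHOULD_LINEMERGE = True): join whole lines into events,
    closing an event after any line that matches must_break_after."""
    events, current = [], []
    for line in lines:
        current.append(line)
        if re.search(must_break_after, line):
            events.append("\n".join(current))
            current = []
    if current:
        events.append("\n".join(current))
    return events


# Hypothetical sample: two events on one physical line, no newline between them.
raw = (
    '<event_change type="a"><id>1</id></event_change>'
    '<event_change type="b"><id>2</id></event_change>'
)

# Default LINE_BREAKER plus MUST_BREAK_AFTER: there is only one line to work with.
merged = merge_lines(split_lines(raw), r"</event_change>")

# Custom LINE_BREAKER with no merging (SHOULD_LINEMERGE = False).
broken = split_lines(raw, r"</event_change>(\s*)<event_change")

print(len(merged), len(broken))
```

In this model the merge stage can only cut between lines, so a closing tag in the middle of a line never produces a new event, whereas the custom LINE_BREAKER cuts at the tag itself.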

Also, FWIW, MAX_TIMESTAMP_LOOKAHEAD is counted from the end of the TIME_PREFIX match, not from the beginning of the event, so I cut it back to a more reasonable size.
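
As a rough illustration of that lookahead window, the sketch below searches for a timestamp only in the characters that follow the prefix match. It is a toy model: Splunk's real timestamp recognition works differently, and the function name, the hard-coded timestamp pattern, and the sample event are assumptions made for illustration.

```python
import re
from datetime import datetime

# Matches timestamps shaped like 2014-02-14T04:00:36.190-07:00 (29 characters).
TS_PATTERN = r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}[+-]\d{2}:\d{2}"


def find_timestamp(event, time_prefix, lookahead):
    """Look for a timestamp only within `lookahead` characters after the
    end of the first time_prefix match; return None if nothing is found."""
    m = re.search(time_prefix, event)
    if m is None:
        return None
    window = event[m.end():m.end() + lookahead]
    ts = re.match(TS_PATTERN, window)
    return datetime.fromisoformat(ts.group(0)) if ts else None


# Hypothetical event with 500 characters of padding before the prefix.
event = (
    '<event_change type="a">' + "x" * 500 +
    "<time_created>2014-02-14T04:00:36.190-07:00</time_created></event_change>"
)

found = find_timestamp(event, r"<time_created>", 100)   # window covers the timestamp
missed = find_timestamp(event, r"<time_created>", 10)   # window is too short
print(found, missed)
```

Because the window starts after the prefix, the 500 characters of padding do not matter; only a lookahead shorter than the timestamp itself causes a miss in this model.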

Finally - you may already know this, but just a reminder: when you update props.conf, the new parsing rules will apply only to new data as it is received. Existing data will not be changed. So you might want to delete the old data or clean the index...

sdorich
Communicator

Actually, I think I figured out my error! Thanks!

sdorich
Communicator

Thanks. I didn't know that < is a special character, so that helps. However, I'm still getting the same output. I have been cleaning out the index designated for these events every time I try something different in props.conf, so it's not that I'm just seeing "old" events. Could you please explain, maybe with examples, what the difference is between LINE_BREAKER and MUST_BREAK_AFTER? I'm confused about why I couldn't use MUST_BREAK_AFTER=</event_change> to tell Splunk that I want a new event to start after the keyword "</event_change>".

bmacias84
Champion

Try this regex: ^\s+
