Can anyone give me a hint about why my xml file might not be indexing properly?

sdorich
Communicator

I have an XML file that I've tried to index, but I've had a very difficult time with it. I just want a new event to start after every </event_change> tag. In props.conf, I have the following:

[bsm_event_changes]
#KV_MODE=xml
TIME_PREFIX = <time_created>
MAX_TIMESTAMP_LOOKAHEAD = 1000
TRUNCATE = 0
MAX_EVENTS = 40
#BREAK_ONLY_BEFORE = (<event_change type)=([a-zA-Z0-9"-://#=_. ]*)> 
MUST_BREAK_AFTER=</event_change>
BREAK_ONLY_BEFORE_DATE = False
SHOULD_LINEMERGE = True

Note that I have tried setting KV_MODE=xml, but that made everything worse for some reason. I've also tried both my BREAK_ONLY_BEFORE and MUST_BREAK_AFTER lines, but neither has done the trick. With this in props.conf (and yes, I've restarted Splunk after editing props.conf), I'm still getting the following in Splunk: basically one long event that contains many actual events.

xml version="1.0" encoding="UTF-8" standalone="yes"?>fe4f346a-5ae5-4592-9de7-42c46d92e7f36b6f56c0-6410-71e3-0d32-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T04:00:36.190-07:002014-02-14T04:00:36.190-07:00time_received2014-02-14T04:00:17.547-07:002014-02-14T04:00:36.147-07:00duplicate_count295196295197system74012441-1c6e-4ead-8a77-44b33ba191896b899580-6410-71e3-0d32-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T04:00:36.003-07:002014-02-14T04:00:36.003-07:00time_received2014-02-14T04:00:17.153-07:002014-02-14T04:00:35.967-07:00duplicate_count295196295197system8c1e94e0-af74-4978-ab11-ff4231fef9a56b6f56c0-6410-71e3-0d32-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T04:00:17.583-07:002014-02-14T04:00:17.583-07:00time_received2014-02-14T03:59:58.163-07:002014-02-14T04:00:17.547-07:00duplicate_count295195295196system867acb4f-db2f-4e32-8887-ed8d2c0c7adc6b899580-6410-71e3-0d32-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T04:00:17.190-07:002014-02-14T04:00:17.190-07:00time_received2014-02-14T03:59:58.317-07:002014-02-14T04:00:17.153-07:00duplicate_count295195295196systeme13d2f33-3927-4c4f-be09-3227c21d88ed23330961-5eac-71e3-0dc5-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T04:00:17.067-07:002014-02-14T04:00:17.067-07:00duplicate_count28672868time_received2014-02-14T03:25:18.650-07:002014-02-14T04:00:17.013-07:00systemb20f2dd6-5987-4200-9b73-205f003cc4f414dd5590-950d-71e3-073d-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T04:00:17.020-07:002014-02-14T04:00:17.020-07:00duplicate_count128129time_received2014-02-14T03:55:20.883-07:002014-02-14T04:00:16.980-07:00system8
5e52a15-6740-4192-b683-4a7c68d685d51e9d7bb1-5eac-71e3-0dc5-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T04:00:09.597-07:002014-02-14T04:00:09.597-07:00time_received2014-02-14T03:25:11.467-07:002014-02-14T04:00:09.500-07:00duplicate_count28662867systemdce46f11-dbf7-45f0-ad69-3bacf8258a871072a870-950d-71e3-073d-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T04:00:09.503-07:002014-02-14T04:00:09.503-07:00duplicate_count128129time_received2014-02-14T03:55:11.673-07:002014-02-14T04:00:09.473-07:00systeme8c9a5f9-1d92-4ba3-b8ea-0de1aace955e23fefb20-9542-71e3-1d1f-0a6401300000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T04:00:05.363-07:002014-02-14T04:00:05.363-07:00annotationcede270e-33ab-41da-ad4c-e9f9dbbd0cedAcknowledged by message correlation. KEY: "ADSPI-ResponseTime_Bind_2K8+:slc-srv-bdc2008.glg.local:SLC-SRV-BDC2008:START:DCLDAPBindResponseTime:High". 
RELATION: "^ADSPI-ResponseTime_Bind_2K8+:slc-srv-bdc2008.glg.local:SLC-SRV-BDC2008:<*>".insertsystem6eb55346-9aa1-4e37-9457-f18d63c64cba23fefb20-9542-71e3-1d1f-0a6401300000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T04:00:05.317-07:002014-02-14T04:00:05.317-07:00om_userSYSTEMsysteme963b714-7c4a-437d-8fd7-1431fc2eccb623fefb20-9542-71e3-1d1f-0a6401300000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.closing.related.events2014-02-14T04:00:05.223-07:002014-02-14T04:00:05.223-07:00stateopenclosedsystemefe4cf2d-6ef0-43c5-a7d5-0b6dc10a8c34fed7ef60-59f9-71e3-1d74-0a6401380000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T04:00:00.883-07:002014-02-14T04:00:00.883-07:00duplicate_count18141815time_received2014-02-14T03:00:22.597-07:002014-02-14T04:00:00.840-07:00system61cf3922-823f-4b8d-985f-3661bc2a612431318b40-9511-71e3-1ec3-0a64016d0000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T04:00:00.523-07:002014-02-14T04:00:00.523-07:00time_received2014-02-14T03:30:02.940-07:002014-02-14T04:00:00.407-07:00duplicate_count2021systemb67f9326-2619-4951-868d-f454cca211e86b899580-6410-71e3-0d32-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T03:59:58.357-07:002014-02-14T03:59:58.357-07:00duplicate_count295194295195time_received2014-02-14T03:59:40.233-07:002014-02-14T03:59:58.317-07:00systembb3aab0c-fd58-48fa-8df4-80c4d88b02596b6f56c0-6410-71e3-0d32-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T03:59:58.200-07:002014-02-14T03:59:58.200-07:00time_received2014-02-14T03:59:40.427-07:002014-02-14T03:59:58.163-07:00duplicate_count295194295195system75981b0d-7b78-4bf4-b302-14337e8089366b6f56c0-6410-71e3-0d32-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhis
torylines.component.event.synchronization2014-02-14T03:59:40.467-07:002014-02-14T03:59:40.467-07:00duplicate_count295193295194time_received2014-02-14T03:59:21.780-07:002014-02-14T03:59:40.427-07:00system2e933f17-e627-4591-932f-e01e43c37c696b899580-6410-71e3-0d32-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T03:59:40.277-07:002014-02-14T03:59:40.277-07:00duplicate_count295193295194time_received2014-02-14T03:59:21.913-07:002014-02-14T03:59:40.233-07:00systemd0f5bed3-1958-4468-9163-18368bd2856e6b899580-6410-71e3-0d32-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T03:59:21.950-07:002014-02-14T03:59:21.950-07:00duplicate_count295192295193time_received2014-02-14T03:59:04.167-07:002014-02-14T03:59:21.913-07:00system0e650f27-58ba-46fc-ba5f-3139bf1fabba6b6f56c0-6410-71e3-0d32-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T03:59:21.810-07:002014-02-14T03:59:21.810-07:00duplicate_count295192295193time_received2014-02-14T03:59:04.013-07:002014-02-14T03:59:21.780-07:00system09339fae-97b6-4b09-bdff-2383ddf658af6b899580-6410-71e3-0d32-0a6401460000urn:x-hp:2009:software:data_model:opr:type:eventhistorylines.component.event.synchronization2014-02-14T03:59:04.200-07:002014-02-14T03:59:04.200-07:00time_received2014-02-14T03:58:46.760-07:002014-02-14T03:59:04.167-07:00duplicate_count295191295192system

1 Solution

lguinn2
Legend

The following is better. Note that < is a special character in regular expressions, which is one reason why it might not have worked.

[bsm_event_changes]
TIME_PREFIX = \<time_created\>
MAX_TIMESTAMP_LOOKAHEAD = 100
TRUNCATE = 0
MAX_EVENTS = 40
MUST_BREAK_AFTER=\</event_change\>
SHOULD_LINEMERGE = True

However, you have another, more serious problem. When you use MUST_BREAK_AFTER or BREAK_ONLY_BEFORE, Splunk breaks on the line boundary, not in the middle of the line. It looks like your events should break in the middle of the line. So the following may work better for you (I haven't tried this, so I am not sure):

[bsm_event_changes]
TIME_PREFIX = \<time_created\>
MAX_TIMESTAMP_LOOKAHEAD = 100
TRUNCATE = 0
MAX_EVENTS = 40
LINE_BREAKER=\</event_change\>(.*?)\<event_change\>
SHOULD_LINEMERGE = False
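
To make the difference concrete, here is a rough Python model of the two behaviors described above. It is only a sketch, not how Splunk is implemented: `merge_lines` and `line_breaker` are hypothetical helper names, the sample data is invented (only the element names come from this thread), and the pattern `</event_change>(\s*)<event_change` is an assumed variant that leaves the opening tag unclosed, since the commented-out BREAK_ONLY_BEFORE in the question suggests the tag carries a type attribute.

```python
import re

# Invented sample: two events on ONE physical line, no newline between them.
RAW = (
    '<event_change type="a"><time_created>2014-02-14T04:00:36</time_created></event_change>'
    '<event_change type="b"><time_created>2014-02-14T04:00:17</time_created></event_change>\n'
)

def merge_lines(raw, must_break_after):
    """Rough model of SHOULD_LINEMERGE = True with MUST_BREAK_AFTER:
    the data is split into lines first, then whole lines are merged,
    and an event is closed after any line that matches the pattern."""
    events, current = [], []
    for line in raw.splitlines():
        current.append(line)
        if re.search(must_break_after, line):
            events.append("\n".join(current))
            current = []
    if current:
        events.append("\n".join(current))
    return events

def line_breaker(raw, pattern):
    """Rough model of LINE_BREAKER with SHOULD_LINEMERGE = False:
    the text matched by the first capture group marks the boundary
    between two events and is discarded."""
    events, start = [], 0
    for m in re.finditer(pattern, raw):
        events.append(raw[start:m.start(1)])
        start = m.end(1)
    tail = raw[start:]
    if tail.strip():
        events.append(tail)
    return events

merged = merge_lines(RAW, r"</event_change>")
broken = line_breaker(RAW, r"</event_change>(\s*)<event_change")

print(len(merged))  # the single physical line stays one event
print(len(broken))  # the stream is cut between the two elements
```

In this model the merge step can only close an event at the end of a line, so a closing tag in the middle of a line never produces a break, while the capture-group approach can cut anywhere in the stream.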

Also, FWIW, MAX_TIMESTAMP_LOOKAHEAD counts from the TIME_PREFIX, not the beginning of the event, so I cut it back to a more reasonable size.
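
As an illustration of that point, the sketch below shows which part of an event falls inside the lookahead window when a prefix is set. It is a simplified model under stated assumptions: `timestamp_window` is a hypothetical helper, the padding in the sample event is invented, and real timestamp extraction does far more than slice a string.

```python
import re

def timestamp_window(event, time_prefix, lookahead):
    """Return the slice of the event that would be scanned for a timestamp:
    `lookahead` characters starting right after the TIME_PREFIX match.
    Returns None when the prefix does not occur in the event."""
    m = re.search(time_prefix, event)
    if m is None:
        return None
    return event[m.end():m.end() + lookahead]

# Invented event: 500 characters of filler before the timestamp element.
EVENT = (
    '<event_change type="a">' + "x" * 500 +
    "<time_created>2014-02-14T04:00:36.190-07:00</time_created></event_change>"
)

window = timestamp_window(EVENT, r"<time_created>", 100)
print(window[:29])
```

Even though the timestamp sits more than 500 characters into the event, a window of 100 characters is enough here because counting starts after the prefix rather than at the start of the event.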

Finally - you may already know this, but just a reminder: when you update props.conf, the new parsing rules will apply only to new data as it is received. Existing data will not be changed. So you might want to delete the old data or clean the index...

sdorich
Communicator

Actually, I think I figured out my error! thanks!

sdorich
Communicator

Thanks. I didn't know that < is a special character, so that helps. However, I'm still getting the same output. I have been cleaning out the index designated for these events every time I try something different in props.conf, so it's not that I'm just seeing "old" events. Can you please explain, maybe with examples, the difference between LINE_BREAKER and MUST_BREAK_AFTER? I'm confused about why I couldn't use MUST_BREAK_AFTER=</event_change> to tell Splunk that I want a new event to start after the keyword "</event_change>".

bmacias84
Champion

Try this regex: ^\s+
