I am having problems parsing XML files into fields that can easily be searched by non-expert users. I am also having problems posting this question, as I cannot seem to post XML code without this page interpreting it. For the XML tags, I've added spaces so that it looks somewhat right. Sorry for not figuring this out either. This is frustrating on so many levels...
General observations: with some minor exceptions, nothing I put into props.conf and transforms.conf seems to have any effect on the data. Setting "KV_MODE = xml" in props.conf appears to have no effect.
I can manipulate the line breaks in props.conf, but the EXTRACT, REPORT, and transforms settings do not seem to work. I say this because, no matter what is written into these files, the fields only show up in the fields list when I add " | xmlkv " to a search. At this point I cannot figure out how to search on those fields. The spath command, with its input and output arguments, does not make sense to me and will be too confusing for the end users of the system.
I am trying to have the fields extracted automatically, in the same way the REGEX in transforms.conf (shown below) presents them.
I am using a single-instance setup (search head and indexer on the same machine) for testing, so a distributed architecture is not a contributing factor.
First, the data set:
< record >
< DATE_TIME >2013/06/26 23:47:14.695 < /DATE_TIME >
< CC_NAME/ >
< EVENT_ID >89922919730537392 < /EVENT_ID >
< NODE_ID >PH_SBI_QA < /NODE_ID >
< MSG_ID >CCNS005E < /MSG_ID >
< RET_CODE/ >
< PROC_ID/ >
< PROC_NAME/ >
< SUBMITTER/ >
< REMOTE_NODE/ >
< SHORT_MSG >CCNS005E License obtained. < /SHORT_MSG >
< FILE_SIZE >-1 < /FILE_SIZE >
< ACTIONS_COMPLETED >1372290434696 < /ACTIONS_COMPLETED >
< SOURCE_FILE/ >
< DEST_FILE/ >
< FROM_NODE/ >
< ORIG_NODE/ >
< /record >
The data is displayed as:
< DATE_TIME >2013\/06\/26 23:47:14.695 < /DATE_TIME >
< CC_NAME/ >
< EVENT_ID >89922919730537392 < /EVENT_ID >
< NODE_ID >PH_SBI_QA < /NODE_ID >
< MSG_ID >CCNS005E < /MSG_ID >
< RET_CODE/ >
< PROC_ID/ >
< PROC_NAME/ >
< SUBMITTER/ >
< REMOTE_NODE/ >
< SHORT_MSG >CCNS005E License obtained. < /SHORT_MSG >
< FILE_SIZE >-1 < /FILE_SIZE >
< ACTIONS_COMPLETED >1372290434696 < /ACTIONS_COMPLETED >
< SOURCE_FILE/ >
< DEST_FILE/ >
< FROM_NODE/ >
< ORIG_NODE/ >
< /record >
My props.conf is as follows:
[sourcetype::sterling_events]
LINE_BREAKER = (< /record>\r)
TIME_PREFIX = \DATE_TIME\ >
TIME_FORMAT = \d{4}\d{2}\d{2}\s{2}\d{2}:\d{2}:\d{2}.\d{3}
SHOULD_LINEMERGE = true
KV_MODE = xml
REPORT-eventextract = event-extract
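For reference, Splunk's TIME_FORMAT setting takes strptime-style directives rather than a regular expression. A minimal Python sketch of what such a format looks like for the DATE_TIME value above follows. The names `SAMPLE_TIMESTAMP` and `STRPTIME_FORMAT` are made up for illustration, and Python's `%f` stands in for the subsecond part (Splunk writes milliseconds as `%3N`, which Python does not support).

```python
from datetime import datetime

# DATE_TIME value copied from the sample record in this post.
SAMPLE_TIMESTAMP = "2013/06/26 23:47:14.695"

# strptime-style description of that value (illustrative only).
# Python uses %f for fractional seconds; Splunk's equivalent for
# three-digit milliseconds is %3N.
STRPTIME_FORMAT = "%Y/%m/%d %H:%M:%S.%f"

parsed = datetime.strptime(SAMPLE_TIMESTAMP, STRPTIME_FORMAT)
print(parsed.isoformat())
```

If the format string does not describe the value, `strptime` raises a `ValueError`, which makes this a quick way to sanity-check a candidate format before putting it into props.conf.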
My transforms.conf is as follows:
[event-extract]
REGEX = \ (\S*)\ >(.*?)\
FORMAT = $1::$2
MV_ADD = true
REPEAT_MATCH = true
Testing this REGEX on rubular.com against the above data set pulls the key and value from the data. I get the following results:
Match 1
DATE_TIME
2013/06/26 23:47:14.695
Match 2
EVENT_ID
89922919730537392
Match 3
NODE_ID
PH_SBI_QA
Match 4
MSG_ID
CCNS005E
Match 5
SHORT_MSG
CCNS005E License obtained.
Match 6
FILE_SIZE
-1
Match 7
ACTIONS_COMPLETED
1372290434696
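The matches above can be reproduced outside of rubular with a short Python sketch. Because the REGEX was mangled when this post was rendered, the pattern below is an assumption (an opening tag, text content, and the matching closing tag), not the exact expression from transforms.conf. The sample is a shortened copy of the record without the extra spaces added for posting, and `extract_pairs` is a hypothetical helper name.

```python
import re

# Shortened copy of the sample record, without the spaces added for posting.
SAMPLE_RECORD = """<record>
<DATE_TIME>2013/06/26 23:47:14.695</DATE_TIME>
<CC_NAME/>
<EVENT_ID>89922919730537392</EVENT_ID>
<NODE_ID>PH_SBI_QA</NODE_ID>
<MSG_ID>CCNS005E</MSG_ID>
<RET_CODE/>
<SHORT_MSG>CCNS005E License obtained.</SHORT_MSG>
<FILE_SIZE>-1</FILE_SIZE>
<ACTIONS_COMPLETED>1372290434696</ACTIONS_COMPLETED>
<SOURCE_FILE/>
</record>"""

# Assumed pattern: <NAME>value</NAME>, where the value contains no nested tag.
# Self-closing elements such as <CC_NAME/> do not match and are skipped.
KEY_VALUE_PATTERN = re.compile(r"<(\w+)>([^<]*)</\1>")


def extract_pairs(text):
    """Return a list of (tag name, value) tuples for non-empty elements."""
    return KEY_VALUE_PATTERN.findall(text)


pairs = extract_pairs(SAMPLE_RECORD)
for key, value in pairs:
    print(f"{key} = {value}")
```

The empty, self-closing elements produce no match here, which is consistent with the seven matches listed above.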
When I add " | xmlkv " to the search, I see the fields "DATE_TIME", "EVENT_ID", etc. Why must I pipe to xmlkv to see these fields?
Thank you in advance...