I am new to Splunk and am trying to figure out how to parse an XML file. This is a generic XML file generated by Microsoft Storage Reports.
The XML
<?xml version="1.0"?><StorageReport version="2.0"><ReportHeader ReportTitle="Quota Usage Report" GeneratedAt="1/30/2014 1:05:06 AM" MachineName="Server1" ReportTypeDescription="Lists the quotas that exceed a certain disk space usage level. Use this report to quickly identify quotas that may soon be exceeded so that you can take the appropriate action." ReportDescription="" TimestampRenderingPhaseStart="1/30/2014 1:05:08 AM" TimestampGenerationEnding="1/30/2014 1:05:08 AM" MaxDisplayItems="1000" Type="FsrmReportType_QuotaUsage" FilesCount="189" IsStandardFileList="False"><ReportNamespaces><Namespace ID="0">K:\</Namespace></ReportNamespaces><ReportFilters><Filter Name="Minimum Quota used percent" Value="0%" /></ReportFilters><ReportWarnings /></ReportHeader><ReportData><Item><Folder>k:\fs_Folder1</Folder><FolderURL>\\Server1\k$\fs_Folder1</FolderURL><RemotePath><Path>\\Server1\STORE1$\fs_Folder1</Path></RemotePath><Owner>BUILTIN\Administrators</Owner><Limit>1073741824</Limit><Used>4598678528</Used><PercentUsed>428.29</PercentUsed><PeakUsage>4602409984</PeakUsage><PeakUsageTime>1/29/2014 11:49:57 AM</PeakUsageTime><Description></Description></Item><Item><Folder>k:\fs_Folder2</Folder><FolderURL>\\Server1\k$\fs_Folder2</FolderURL><RemotePath><Path>\\Server1\STORE1$\fs_Folder2</Path></RemotePath><Owner>BUILTIN\Administrators</Owner><Limit>1073741824</Limit><Used>3881845760</Used><PercentUsed>361.53</PercentUsed><PeakUsage>3922589696</PeakUsage><PeakUsageTime>1/28/2014 3:41:41 PM</PeakUsageTime><Description></Description></Item></ReportData><ReportSummary><ReportTotals QuotaCount="189" Used="94984972288" /><NamespaceTotals QuotaCount="189" Used="94984972288" /></ReportSummary></StorageReport>
The props.conf
SHOULD_LINEMERGE = true
KV_MODE=xml
BREAK_ONLY_BEFORE=<Item
LINE_BREAK =<Item
NO_BINARY_CHECK =1
TRUNCATE=100000000
MV_ADD = true
DATETIME_CONFIG = CURRENT
I will not mark my own answer as correct because that seems self-serving, but for anyone who is looking to parse Microsoft's Storage Reports, this is the props.conf I ended up with:
BREAK_ONLY_BEFORE_DATE = true
KV_MODE = XML
LINE_BREAKER = (<item>)
MUST_NOT_BREAK_AFTER = (ReportData/>)
MUST_NOT_BREAK_BEFORE = (<ReportData)
NO_BINARY_CHECK = 1
SHOULD_LINEMERGE = false
TIME_PREFIX = (<PeakUsag)
pulldown_type = 1
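For readers who want to see what a LINE_BREAKER-style pattern does to a single-line report, here is a minimal Python sketch. It is a simulation, not Splunk itself: it assumes the documented behaviour that the previous event ends where the first capture group starts, the next event begins where that group ends, and the captured text is discarded. The helper name break_events, the trimmed sample, and the pattern ([\r\n]*)<Item> are illustrative assumptions, not part of the config above.

```python
import re
from typing import List


def break_events(raw: str, line_breaker: str) -> List[str]:
    """Split raw text on a LINE_BREAKER-style regex (simulation only).

    The text matched by capture group 1 is dropped; everything before it
    closes the previous event and everything after it opens the next one.
    """
    events, start = [], 0
    for match in re.finditer(line_breaker, raw):
        events.append(raw[start:match.start(1)])
        start = match.end(1)
    events.append(raw[start:])
    return [event for event in events if event]


# Trimmed, single-line stand-in for the report in the question.
sample = (
    "<ReportData>"
    "<Item><Folder>k:\\fs_Folder1</Folder><PercentUsed>428.29</PercentUsed></Item>"
    "<Item><Folder>k:\\fs_Folder2</Folder><PercentUsed>361.53</PercentUsed></Item>"
    "</ReportData>"
)

# Hypothetical pattern: an empty-capable group in front of <Item>, so the tag is kept.
kept_tag = break_events(sample, r"([\r\n]*)<Item>")

# Capturing the tag itself breaks in the same places but drops the tag text.
dropped_tag = break_events(sample, r"(<Item>)")

# The match is case-sensitive here, so a lowercase tag finds nothing to break on.
lowercase = break_events(sample, r"(<item>)")

for event in kept_tag:
    print(event)
```

In this simulation the first two patterns each yield three events (the header fragment plus one per Item), while the lowercase pattern leaves the report as a single event. If your Splunk instance behaves the same way, the case of the tag in the pattern has to follow the data.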
You should mark the answer correct. Don't worry about self-serving - it helps the community if you answer your own question AND mark it correct!
1 - people quit checking to see if this question needs to be answered
2 - people who have similar questions can easily see that this question has an answer
A couple of things:
The < is a special character in some regular-expression contexts, so it is safest to specify it as \<
Also, you should not set both LINE_BREAK and BREAK_ONLY_BEFORE. I suggest that you remove the line for LINE_BREAK.
Otherwise, it should work.
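A quick way to check that escaping the < is harmless is to try both forms. This sketch uses Python's re module as a stand-in for the PCRE engine that props.conf patterns go through, which is an assumption; the sample string is a trimmed piece of the report from the question.

```python
import re

# Trimmed stand-in for the start of the report data.
text = "<ReportData><Item><Folder>k:\\fs_Folder1</Folder></Item>"

# Unescaped and escaped forms of the same pattern.
plain = re.search(r"<Item", text)
escaped = re.search(r"\<Item", text)

# Both searches land on the same span of the input.
print(plain.span(), escaped.span())
```

Here both patterns match the same text, so adding the backslash costs nothing even where it is not strictly required.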
I have figured it out.
BREAK_ONLY_BEFORE_DATE = true
KV_MODE = XML
LINE_BREAKER = (
MUST_NOT_BREAK_AFTER = (ReportData\/>)
MUST_NOT_BREAK_BEFORE = (<ReportData)
NO_BINARY_CHECK = 1
SHOULD_LINEMERGE = false
TIME_PREFIX = (<PeakUsag)
pulldown_type = 1
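To sanity-check what each event should contain once the report is broken per Item, it can help to parse a sample outside Splunk. The sketch below is only an offline approximation of what KV_MODE and the timestamp settings aim for: it reads a trimmed copy of the report with Python's xml.etree.ElementTree, turns each Item's child tags into fields, and parses PeakUsageTime. The function name items and the "_time" key are illustrative choices, and the time format string is an assumption based on the sample values.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Trimmed copy of the report from the question (two items, fewer fields).
REPORT = (
    r'<StorageReport version="2.0"><ReportData>'
    r'<Item><Folder>k:\fs_Folder1</Folder><Limit>1073741824</Limit>'
    r'<Used>4598678528</Used><PercentUsed>428.29</PercentUsed>'
    r'<PeakUsageTime>1/29/2014 11:49:57 AM</PeakUsageTime></Item>'
    r'<Item><Folder>k:\fs_Folder2</Folder><Limit>1073741824</Limit>'
    r'<Used>3881845760</Used><PercentUsed>361.53</PercentUsed>'
    r'<PeakUsageTime>1/28/2014 3:41:41 PM</PeakUsageTime></Item>'
    r'</ReportData></StorageReport>'
)

# Assumed format of PeakUsageTime, inferred from the sample values.
TIME_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def items(xml_text):
    """Yield one dict per <Item>, keyed by child tag name."""
    root = ET.fromstring(xml_text)
    for item in root.iter("Item"):
        fields = {child.tag: (child.text or "") for child in item}
        # Use the peak-usage time as the event timestamp for illustration.
        fields["_time"] = datetime.strptime(fields["PeakUsageTime"], TIME_FORMAT)
        yield fields


rows = list(items(REPORT))
for row in rows:
    print(row["Folder"], row["PercentUsed"], row["_time"].isoformat())
```

Comparing these rows with what Splunk shows for the same file is a quick way to spot events that were merged or split in the wrong place.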