I have some HL7 Version 3 records (XML) that I want to parse, but I'm having trouble figuring out how to do it. Some of the information is in table format inside the record, like this (taken from a sample file):
<component>
<section>
<templateId root="2.16.840.1.113883.10.20.22.2.10"/>
<!-- **** Plan of Care section template **** -->
<code code="18776-5" codeSystem="2.16.840.1.113883.6.1" codeSystemName="LOINC" displayName="Treatment plan"/>
<title>Plan of Care</title>
<text>
<table border="1" width="100%">
<thead>
<tr>
<th>Planned Activity</th>
<th>Planned Date</th>
</tr>
</thead>
<tbody>
<tr>
<td>Colonoscopy</td>
<td>April 21, 2000</td>
</tr>
</tbody>
</table>
</text>
I'd like to get the Plan of Care info and display it. Any way to do that?
I've tried using xpath, but it seems to not work at all.
I found the problem I was having with xpath. It turns out that one of the items in the header was preventing xpath from working:
<ClinicalDocument xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:hl7-org:v3 http://xreg2.nist.gov:8080/hitspValidation/schema/cdar2c32/infrastructure/cda/C32_CDA.xsd" xmlns="urn:hl7-org:v3" xmlns:mif="urn:hl7-org:v3/mif">
The next-to-last item, the default namespace declaration xmlns="urn:hl7-org:v3", caused the problem. If I take that out, xpath works as advertised.
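For anyone wondering why a default namespace has this effect, here is a minimal sketch outside Splunk, using Python's standard xml.etree.ElementTree. It only illustrates general XPath behavior, not what the Splunk xpath command does internally. The trimmed document and the prefix name "hl7" are made up for the illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed-down stand-in for the record; only the default namespace matters here.
DOC = """<ClinicalDocument xmlns="urn:hl7-org:v3">
  <component><section><title>Plan of Care</title></section></component>
</ClinicalDocument>"""

root = ET.fromstring(DOC)

# Unqualified names look for elements in no namespace, so nothing matches.
plain = root.findall(".//section/title")

# Binding a prefix (the name "hl7" is arbitrary) to the namespace URI
# lets the same path match the namespaced elements.
ns = {"hl7": "urn:hl7-org:v3"}
qualified = root.findall(".//hl7:section/hl7:title", ns)

print(len(plain), [t.text for t in qualified])
```

With the default namespace in place, every element belongs to urn:hl7-org:v3, so a path written with bare names finds nothing. Removing the declaration, as described above, is the other way to make bare names match.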
This fakes your data:
| makeresults | eval _raw="
<component>
<section>
<templateId root=\"2.16.840.1.113883.10.20.22.2.10\"/>
<!-- **** Plan of Care section template **** -->
<code code=\"18776-5\" codeSystem=\"2.16.840.1.113883.6.1\" codeSystemName=\"LOINC\" displayName=\"Treatment plan\"/>
<title>Plan of Care</title>
<text>
<table border=\"1\" width=\"100%\">
<thead>
<tr>
<th>Planned Activity</th>
<th>Planned Date</th>
</tr>
</thead>
<tbody>
<tr>
<td>Colonoscopy</td>
<td>April 21, 2000</td>
</tr>
</tbody>
</table>
</text>
</section>
</component>"
| fields _time _raw
This is your solution:
| streamstats count AS _serial
| xpath "//component/section/text/table/thead/tr" outfield=thead
| xpath "//component/section/text/table/tbody/tr" outfield=tbody
| rex max_match=0 field=thead "(?ms)<th>(?<kvp_keys>[^\r\n]+)</th>"
| rex max_match=0 field=tbody "(?ms)<td>(?<kvp_values>[^\r\n]+)</td>"
| eval KVP=mvzip(kvp_keys, kvp_values, ":=:")
| fields _time KVP _serial
| mvexpand KVP
| rex field=KVP "^(?<kvp_key>.*):=:(?<kvp_value>.*)$"
| eval {kvp_key}=kvp_value
| fields - KVP _raw kvp_key kvp_value
| stats first(_time) AS _time values(*) AS * BY _serial
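The idea behind the search above is to pair each header cell with the value cell in the same position. As a rough sketch of that pairing logic outside Splunk (Python standard library only, not a replacement for the SPL, with the sample table trimmed to its rows):

```python
import xml.etree.ElementTree as ET

# Same table as in the sample record, with the attributes dropped for brevity.
TABLE = """<table>
  <thead><tr><th>Planned Activity</th><th>Planned Date</th></tr></thead>
  <tbody><tr><td>Colonoscopy</td><td>April 21, 2000</td></tr></tbody>
</table>"""

table = ET.fromstring(TABLE)

# Header cells become the field names.
headers = [th.text for th in table.findall("./thead/tr/th")]

# Each body row is zipped against the headers, one dict per row.
rows = [
    dict(zip(headers, (td.text for td in tr.findall("./td"))))
    for tr in table.findall("./tbody/tr")
]

print(rows)
```

This mirrors what mvzip and the {kvp_key} eval do in the search: positions line up, so the first header names the first cell, and so on.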
Using xpath is the way to go; I would start by stripping off layers and retrying. Also, you can extract fields from it on the way in using INDEXED_EXTRACTIONS=XML in the forwarder's props.conf, which I would definitely try.
Is the INDEXED_EXTRACTIONS=XML setting still valid? I got an error for this setting in 7.0.
I have tried those things, and I'm getting the same results. xpath seems to be failing silently; I haven't seen anything in the logs (even at DEBUG).
xpath is simply not working, even on tiny test files. Does anyone have any pointers on where I can look for config issues or something?