Re-labeling and breaking up xml stream?
Hello,
I have an incoming TCP stream of XML Call Data Records (CDRs). An example record is included at the end of this post.
The CDR contains information about the caller and destination phone. There are several SEDCMDs in the props.conf to take lines like <party type="orig"... and convert them into <orig_party...
The problem is with lines that differ only in their position in the data structure. For the XML lines that start with "<RTCPstats>", I need to rewrite the fields: the first line should be ingested as "<orig_PS>31367</orig_PS> <orig_OS>6273400</orig_OS>..." and the second line as "<term_PS>31366</term_PS>, <term_OS>6273200</term_OS>...".
<flowinfo> <RTCPstats>PS=31367, OS=6273400,..> </flowinfo>
<flowinfo> <RTCPstats>PS=31366, OS=6273200,...> </flowinfo>
Running sed itself, as in "sed -r -e '0,/ ?([A-Z_]*)=([0-9]*)/s//<ORIG_\1>\2<\/ORIG_\1>/g'", will do this, but the same expression as a SEDCMD entry will not.
Altering a line of input is easy: altering only the FIRST instance in a record with embedded newlines is not.
What are my options?
- SEDCMD in props.conf?
- strip out all newlines so the SEDCMD treats the record as one line? (can't really do that with sed...)
- regex in transforms.conf?
- pass input through a script that CAN do this?
Some details:
- The linebreaker is two carriage returns (literally \r\r, or 0d0d).
- There are embedded newlines in each record so any SEDCMD will be applied to each line, not the entire record all at once.
- I get 60,000 records per minute: this transformation needs to be fast.
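For the "pass input through a script" option, here is a minimal Python sketch of the relabeling, not a tested Splunk input. It assumes records are separated by \r\r as described above, that the first <RTCPstats> block in a record is the orig side and the second is the term side, that the <RTCPstats> wrapper should be kept, and that "/" in keys such as PC/RPS becomes "_" so the tag names stay valid XML. The names relabel_record and relabel_stream are hypothetical.

```python
import re

# Matches one <RTCPstats>...</RTCPstats> element, even across embedded newlines.
RTCP_RE = re.compile(r"<RTCPstats>(.*?)</RTCPstats>", re.DOTALL)
# Matches one KEY=VALUE pair such as "PS=31367" or "PC/RPS=31165".
PAIR_RE = re.compile(r"([A-Za-z_/]+)=(\d+)")

# Assumption: first block is the originating side, second the terminating side.
PREFIXES = ("orig", "term")


def relabel_record(record):
    """Rewrite the RTCPstats blocks of one CDR according to their position."""
    position = 0

    def rewrite(match):
        nonlocal position
        if position >= len(PREFIXES):
            # More blocks than expected: leave the extras untouched.
            return match.group(0)
        prefix = PREFIXES[position]
        position += 1
        tags = []
        for key, value in PAIR_RE.findall(match.group(1)):
            # Assumption: "/" is replaced so the result is a valid tag name.
            name = f"{prefix}_{key.replace('/', '_')}"
            tags.append(f"<{name}>{value}</{name}>")
        return "<RTCPstats>" + " ".join(tags) + "</RTCPstats>"

    return RTCP_RE.sub(rewrite, record)


def relabel_stream(data, separator="\r\r"):
    """Split on the record separator, relabel each record, and rejoin."""
    return separator.join(relabel_record(r) for r in data.split(separator))
```

Because the position counter is reset for every record, embedded newlines inside a record do not matter here. Whether this keeps up with 60,000 records per minute would need to be measured.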
Sample record: (spaced out neatly)
<e>
<st>Perimeta XML CDR</st>
<h>the perimeta hostname</h>
<t>1664838107186</t>
<tid>2814754955435820</tid>
<sid>2082</sid>
<eid>CDR</eid>
<call starttime="1664837476918" starttime_local="2022-10-03T15:51:16-0700" endtime="1664838107179" endtime_local="2022-10-03T16:01:47-0700" duration="630261" release_side="term" bcid="a big string of numbers and letters">
<party type="orig" phone="caller phonenumber" domain="ipaddr1" sig_address="ipaddr1" sig_port="5060" sig_transport="udp" trunk_group="trunkgroupname" trunk_context="nap" sip_call_id="anumber@adestination"/>
<party type="term" phone="destination phonenumber" domain="0.0.0.0" routing_number="a routing number" sig_address="<an ip addr>" sig_port="5060" sig_transport="udp" trunk_group="6444" trunk_context="itg" edit_trunk_group="" edit_trunk_context="" sip_call_id="adifferentnumber@adifferentdestination"/>
<adjacency type="orig" name="orig_adjacency_system" account="" vpnid="0X00000001" mediarealm="CoreMedia1"/>
<adjacency type="term" name="dest_adjacency_system" account="" vpnid="0X00000004" mediarealm="CoreMedia1"/>
<category name="cat.sbc.redirected"/>
<connect time="1664837483144" time_local="2022-10-03T15:51:23-0700"/>
<firstendrequest time="1664838107158" time_local="2022-10-03T16:01:47-0700"/>
<disconnect time="1664838107179" time_local="2022-10-03T16:01:47-0700" reason="0"/>
<redirector bcid="another string of letters and numbers" editphone="a phone number"/>
<post_dial_delay duration="2895"/>
<QoS stream_id="1" instance="0" reservetime="1664837476918" reservetime_local="2022-10-03T15:51:16-0700" committime="1664837483144" committime_local="2022-10-03T15:51:23-0700" releasetime="1664838107184" releasetime_local="2022-10-03T16:01:47-0700">
<gate>
<flowinfo>
<local address="an ip address" port="63130"/>
<remote address="another ip address" port="36214"/>
<sd>m=audio 0 RTP/AVP 0
a=rtpmap:0 PCMU/8000
a=ptime:20
</sd>
<RTCPstats>PS=31367, OS=6273400, PR=31366, OR=6273200, PD=0, OD=0, PL=0, JI=0, TOS=0, TOR=0, LA=0, PC/RPS=31165, PC/ROS=4986400, PC/RPR=31367, PC/RPL=0, PC/RJI=0, PC/RLA=0, RF=91, MOS=43, PC/RAJ=0, PC/RML=0</RTCPstats>
</flowinfo>
<flowinfo>
<local address="an ip address" port="19648"/>
<remote address="a different ip address" port="26046"/>
<sd>m=audio 0 RTP/AVP 0
a=rtpmap:0 PCMU/8000
a=ptime:20
</sd>
<RTCPstats>PS=31366, OS=6273200, PR=31367, OR=6273400, PD=0, OD=0, PL=0, JI=0, TOS=0, TOR=0, LA=0, PC/RPS=0, PC/ROS=0, PC/RPR=0, PC/RPL=0, PC/RJI=0, PC/RLA=0, RF=82, MOS=41, PC/RAJ=0, PC/RML=0</RTCPstats>
</flowinfo>
</gate>
</QoS>
</call>
</e>
To change only the first match, remove the 'g' flag from the end of the sed command.
If this reply helps you, Karma would be appreciated.
Hello,
I have tried that, but the SEDCMD still appears to work line by line, splitting on newlines: the second line is treated as a separate record and still gets changed.
--jason
SEDCMD is event-oriented rather than line-oriented. If you see it changing each line, that implies each line is being treated as one event. Either the line-breaking settings should be changed, or another means should be found to change the data, such as INGEST_EVAL.
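The difference between the two cases can be illustrated with a small Python analogy. This is not Splunk's implementation, only a sketch of why a first-match-only substitution changes one line when the whole record is a single event but changes every line when each line is its own event. The two-line record and the helper name first_only are made up for the illustration.

```python
import re

PAIR = re.compile(r"(PS)=(\d+)")

# A shortened two-line record with an embedded newline.
record = "<RTCPstats>PS=31367</RTCPstats>\n<RTCPstats>PS=31366</RTCPstats>"


def first_only(text):
    # count=1 plays the role of a sed substitution without the 'g' flag.
    return PAIR.sub(r"<orig_\1>\2</orig_\1>", text, count=1)


# Whole record handled as one event: only the first match is rewritten.
whole_event = first_only(record)

# Each line handled as its own event: every line has a "first" match.
per_line = "\n".join(first_only(line) for line in record.split("\n"))

print(whole_event)
print(per_line)
```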
If this reply helps you, Karma would be appreciated.
