
Issues with XML Linebreaker

avoelk
Communicator

Hello fellow splunkers,

Right now I'm working through the 7 labs for SE II, which have to be completed before I can take the final accreditation quiz. I've finished 5 of them so far but am totally lost on lab 6. The instructions are:

- Events should begin with <Interceptor> and end with </Interceptor> (so line breaking is needed)

- Extract (at search time) all fields and values between the Interceptor lines, and throw away any header lines before the first <Interceptor> and the line after the very last </Interceptor>

- Use the ActionDate and ActionTime fields as the timestamp

- Have Splunk auto-extract the fields and values

 

How they say I'll know I've done it:

- I'll have x number of events, with the fields broken out using SPATH notation

- The timestamp is correct

- There is no text before the first and after the last Interceptor

 

What I have so far:

- I'm able to extract ActionDate and ActionTime to create a new timestamp

- I'm able to line-break with LINE_BREAKER = \<Interceptor\>()

 

My Issue:

- After line breaking, I save the new sourcetype and then try to edit it to handle the remaining tasks, like extracting the timestamp or deleting the header text. But as soon as I change ANYTHING, it disregards the LINE_BREAKER setting and goes back to one huge event, and I can't do anything about it.

- Even if I could line-break and extract everything as stated, I don't really understand what they mean by "broken out using SPATH notation". Do they mean via SPL? Because they clearly stated that Splunk should "auto extract the fields and values".

How the data looks:

<?xml version="1.0" encoding="UTF-8" ?><dataroot><Interceptor><AttackCoords>-80.33100097073213,25.10742916222947</AttackCoords><Outcome>Interdiction</Outcome><Infiltrators>23</Infiltrators><Enforcer>Ironwood</Enforcer><ActionDate>2013-04-24</ActionDate><ActionTime>00:07:00</ActionTime><RecordNotes></RecordNotes><NumEscaped>0</NumEscaped><LaunchCoords>-80.23429525620114,24.08680387475695</LaunchCoords><AttackVessel>Rustic</AttackVessel></Interceptor><Interceptor><AttackCoords>-80.14622349209523,24.53605142362535</AttackCoords><Outcome>Interdiction</Outcome><Infiltrators>6</Infiltrators><Enforcer>Cunningham</Enforcer><ActionDate>2013-04-26</ActionDate><ActionTime>00:23:00</ActionTime><RecordNotes></RecordNotes><NumEscaped>0</NumEscaped><LaunchCoords></LaunchCoords><AttackVessel>Raft</AttackVessel></Interceptor><Interceptor><AttackCoords>-80.75496221688965,24.72483828554483</AttackCoords><Outcome>Interdiction</Outcome><Infiltrators>11</Infiltrators><Enforcer>Forthright</Enforcer><ActionDate>2013-05-15</ActionDate><ActionTime>23:35:00</ActionTime><RecordNotes></RecordNotes><NumEscaped>0</NumEscaped><LaunchCoords>-79.65932674368925,23.70743135623052</LaunchCoords><AttackVessel>Rustic</AttackVessel></Interceptor><Interceptor><AttackCoords>-80.32020594311533,25.02156920297054</AttackCoords><Outcome>Interdiction</Outcome><Infiltrators>6</Infiltrators><Enforcer>Pompano</Enforcer><ActionDate>2013-02-25</ActionDate><ActionTime>15:35:00</ActionTime><RecordNotes></RecordNotes><NumEscaped>0</NumEscaped><LaunchCoords></LaunchCoords><AttackVessel>Raft</AttackVessel></Interceptor><Interceptor><AttackCoords>-80.15149489716094,24.57412215015249</AttackCoords><Outcome>Interdiction</Outcome><Infiltrators>6</Infiltrators><Enforcer>Tripoteur</Enforcer><ActionDate>2013-04-13</ActionDate><ActionTime>15:40:00</ActionTime><RecordNotes></RecordNotes><NumEscaped>0</NumEscaped><LaunchCoords>-79.65999190070923,23.73619147168514</LaunchCoords><AttackVessel>Raft</AttackVessel></Interceptor></dataroot>

I hope someone can help me understand how to proceed here.

EDIT:

In lab 4 the input data was almost the same; the only difference is that in lab 6 it has no line breaks whatsoever. Here is my props.conf from lab 4:

[dreamcrusher]
DATETIME_CONFIG = 
FIELD_HEADER_REGEX = <Interceptor>
LINE_BREAKER = \<Interceptor\>
MAX_DAYS_AGO = 4000
NO_BINARY_CHECK = true
category = Custom
disabled = false
pulldown_type = true
REPORT-actiondate = actiondate
EVAL-_time = strptime(ActionDate +" " + ActionTime,"%Y-%m-%d %H:%M:%S")

and my transforms.conf:

#[actiondate]
#REGEX = \<ActionDate\>(?P<ActionDate>\d+-\d+-\d+)\<\/ActionDate\>\s*\<ActionTime\>(?P<ActionTime>\d+:\d+:\d+)
#FORMAT = $1::$2

 


avoelk
Communicator

Alright, so apparently the GUI is sometimes buggy when you try to change a sourcetype. To do more than just the line break, especially deleting the header, I had to take a different route.

Since the data is one huge single-line event with no line breaks, FIELD_HEADER_REGEX doesn't work here. What I did was:

props.conf

TRANSFORMS-t1 = extraction

transforms.conf

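[extraction]
# Match the XML declaration plus the opening <dataroot> tag (dot in the version escaped)
REGEX = \<\?xml\sversion="\d\.\d"\sencoding="UTF-8"\s\?\>\<dataroot\>
# Send matching events to the nullQueue, i.e. discard them
DEST_KEY = queue
FORMAT = nullQueue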
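This matches the XML declaration and the opening <dataroot> tag, i.e. everything before the first actual event. Keep in mind that routing to nullQueue discards the whole event that matches, so it only works once the header has been broken off into its own event. As written, the regex only matches the opening header; the closing </dataroot> at the very end is not covered by this pattern.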
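To make the effect of the nullQueue transform easier to picture, here is a rough Python sketch. It is not how Splunk is implemented; it only mimics the idea. It assumes the stream has already been broken at every Interceptor boundary, and the helpers `split_events` and `drop_header` as well as the shortened sample data are made up for illustration.

```python
import re

# Shortened, made-up sample in the same shape as the lab data (one long line).
RAW = (
    '<?xml version="1.0" encoding="UTF-8" ?><dataroot>'
    '<Interceptor><Outcome>Interdiction</Outcome><ActionDate>2013-04-24</ActionDate>'
    '<ActionTime>00:07:00</ActionTime></Interceptor>'
    '<Interceptor><Outcome>Interdiction</Outcome><ActionDate>2013-04-26</ActionDate>'
    '<ActionTime>00:23:00</ActionTime></Interceptor>'
    '</dataroot>'
)

# Same pattern as the [extraction] stanza, written as a Python regex.
HEADER = re.compile(r'<\?xml\sversion="\d\.\d"\sencoding="UTF-8"\s\?><dataroot>')


def split_events(raw):
    """Break before every <Interceptor> and after every </Interceptor>."""
    parts = re.split(r'(?=<Interceptor>)|(?<=</Interceptor>)', raw)
    return [p for p in parts if p]


def drop_header(events):
    """Mimic nullQueue routing: discard each whole event the regex matches."""
    return [e for e in events if not HEADER.search(e)]


events = split_events(RAW)
kept = drop_header(events)
for event in kept:
    print(event)
```

In this sketch the header is dropped, but the trailing `</dataroot>` fragment is not matched by the header pattern and survives as its own piece. Whether the same happens in the lab depends on where the line breaker leaves that text, so it is worth checking the last event.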

To extract the necessary ActionDate and ActionTime and combine them into a new timestamp, I did the following:

props.conf

REPORT-actiondate = actiondate
EVAL-_time = strptime(ActionDate +" " + ActionTime,"%Y-%m-%d %H:%M:%S")

transforms.conf

[actiondate]
# The named capture groups already name the fields, so no FORMAT line is needed
REGEX = \<ActionDate\>(?P<ActionDate>\d+-\d+-\d+)\<\/ActionDate\>\s*\<ActionTime\>(?P<ActionTime>\d+:\d+:\d+)
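The timestamp logic above can be checked outside Splunk with a small Python sketch. It reuses the named groups from the [actiondate] REGEX and the format string from the EVAL-_time expression. The sample event is shortened, and `event_time` is a hypothetical helper name. Note that Splunk's strptime returns epoch seconds, while this sketch returns a Python datetime.

```python
import re
from datetime import datetime

# One event in the shape of the lab data (shortened for illustration).
EVENT = (
    '<Interceptor><Outcome>Interdiction</Outcome>'
    '<ActionDate>2013-04-24</ActionDate><ActionTime>00:07:00</ActionTime>'
    '<AttackVessel>Rustic</AttackVessel></Interceptor>'
)

# Python equivalent of the [actiondate] REGEX; the named groups carry the field names.
ACTION = re.compile(
    r'<ActionDate>(?P<ActionDate>\d+-\d+-\d+)</ActionDate>\s*'
    r'<ActionTime>(?P<ActionTime>\d+:\d+:\d+)'
)


def event_time(event):
    """Return the event's timestamp, or None if the two fields are missing."""
    match = ACTION.search(event)
    if match is None:
        return None
    fields = match.groupdict()
    # Same concatenation and format string as the EVAL-_time expression.
    return datetime.strptime(
        fields["ActionDate"] + " " + fields["ActionTime"], "%Y-%m-%d %H:%M:%S"
    )


print(event_time(EVENT))  # prints 2013-04-24 00:07:00
```

The regex only works because ActionTime directly follows ActionDate in the data; if the element order ever changed, the two fields would have to be matched separately.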

 

I still don't understand the part about "broken out using SPATH notation", so I figured I'd do it with, well, spath via SPL:

index=myindex sourcetype=mysourcetype |spath input=_raw path=

 

That did it for me.
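For anyone wondering what "broken out using SPATH notation" looks like, the following Python sketch flattens one event into dotted field names such as Interceptor.Outcome. That is the naming spath is generally understood to produce for XML, but treat it as an assumption. The `flatten` helper and the shortened sample are hypothetical, and repeated sibling tags would simply overwrite each other here.

```python
import xml.etree.ElementTree as ET

# One event in the shape of the lab data (shortened for illustration).
EVENT = (
    '<Interceptor><Outcome>Interdiction</Outcome><Infiltrators>23</Infiltrators>'
    '<ActionDate>2013-04-24</ActionDate><ActionTime>00:07:00</ActionTime>'
    '<RecordNotes></RecordNotes><AttackVessel>Rustic</AttackVessel></Interceptor>'
)


def flatten(element, prefix=""):
    """Map each leaf element to a dotted path, e.g. Interceptor.Outcome."""
    path = f"{prefix}.{element.tag}" if prefix else element.tag
    children = list(element)
    if not children:
        # Empty elements such as <RecordNotes></RecordNotes> become empty strings.
        return {path: element.text or ""}
    fields = {}
    for child in children:
        fields.update(flatten(child, path))
    return fields


fields = flatten(ET.fromstring(EVENT))
for name, value in fields.items():
    print(f"{name}={value}")
```

If the lab wants this to happen without any SPL, `KV_MODE = xml` in props.conf is the search-time setting for automatic XML field extraction and may be what the instructions mean by having Splunk auto extract the fields, though that has not been confirmed for this lab.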

 

