Issues with XML Linebreaker

avoelk
Communicator

Hello fellow Splunkers,

Right now I'm working through the 7 labs for SE II, which have to be completed before I can start the final accreditation quiz. I've finished 5 of them so far but am totally lost with lab 6. The instructions are:

- Events should begin with <Interceptor> and end with </Interceptor> (so line breaking is needed)

- Extract (at search time) all fields and values between the Interceptor lines, and throw away any header lines before the first <Interceptor> and the line after the very last </Interceptor>

- Use the ActionDate and ActionTime fields as the timestamp

- Have Splunk auto extract the fields and values

 

How they say I'll know I've done it:

- I'll have x amount of events and the fields broken out using SPATH notation

- the correct timestamp

- no text before the first and after the last Interceptor

 

What I have so far:

- I'm able to extract ActionDate and ActionTime to create a new timestamp

- I'm able to break the events with LINE_BREAKER = \<Interceptor\>()
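
For what it's worth, here is a minimal Python sketch of how the LINE_BREAKER capture group behaves as I read the props.conf documentation: the text before the first capture group closes one event, the text after it opens the next, and the group's own content is thrown away. The break_events helper and the shortened sample are made up for illustration; this only imitates the rule and is not Splunk's actual code.

```python
import re

def break_events(raw, line_breaker):
    """Split raw text following the documented LINE_BREAKER rule:
    the text before capture group 1 ends one event, the text after it
    starts the next, and the group's own content is discarded."""
    events, start = [], 0
    for m in re.finditer(line_breaker, raw):
        events.append(raw[start:m.start(1)])
        start = m.end(1)
    events.append(raw[start:])
    return [e for e in events if e]  # skip empty chunks

# Shortened stand-in for the lab data (two records, one field each).
raw = ('<?xml version="1.0" encoding="UTF-8" ?><dataroot>'
       '<Interceptor><Outcome>Interdiction</Outcome></Interceptor>'
       '<Interceptor><Outcome>Interdiction</Outcome></Interceptor>'
       '</dataroot>')

after_tag = break_events(raw, r'<Interceptor>()')   # empty group after the tag
before_tag = break_events(raw, r'()<Interceptor>')  # empty group before the tag

for event in before_tag:
    print(event)
```

In this sketch the pattern with the empty group after the tag leaves <Interceptor> at the end of the previous event, while the pattern with the empty group before the tag keeps it at the start of the next one.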

 

My Issue:

- When I apply the line break, I save the new sourcetype and then try to modify it for the remaining tasks, such as extracting the timestamp or deleting the header text. But as soon as I change ANYTHING, it disregards the line breaker setting and the data goes back to being one huge event, and I can't do anything about it.

- Even if I could line break and extract everything as stated, I don't really understand what they mean by "broken out using SPATH notation". Do they mean via SPL? They clearly stated that Splunk should "auto extract the fields and values".

How the data looks:

<?xml version="1.0" encoding="UTF-8" ?><dataroot><Interceptor><AttackCoords>-80.33100097073213,25.10742916222947</AttackCoords><Outcome>Interdiction</Outcome><Infiltrators>23</Infiltrators><Enforcer>Ironwood</Enforcer><ActionDate>2013-04-24</ActionDate><ActionTime>00:07:00</ActionTime><RecordNotes></RecordNotes><NumEscaped>0</NumEscaped><LaunchCoords>-80.23429525620114,24.08680387475695</LaunchCoords><AttackVessel>Rustic</AttackVessel></Interceptor><Interceptor><AttackCoords>-80.14622349209523,24.53605142362535</AttackCoords><Outcome>Interdiction</Outcome><Infiltrators>6</Infiltrators><Enforcer>Cunningham</Enforcer><ActionDate>2013-04-26</ActionDate><ActionTime>00:23:00</ActionTime><RecordNotes></RecordNotes><NumEscaped>0</NumEscaped><LaunchCoords></LaunchCoords><AttackVessel>Raft</AttackVessel></Interceptor><Interceptor><AttackCoords>-80.75496221688965,24.72483828554483</AttackCoords><Outcome>Interdiction</Outcome><Infiltrators>11</Infiltrators><Enforcer>Forthright</Enforcer><ActionDate>2013-05-15</ActionDate><ActionTime>23:35:00</ActionTime><RecordNotes></RecordNotes><NumEscaped>0</NumEscaped><LaunchCoords>-79.65932674368925,23.70743135623052</LaunchCoords><AttackVessel>Rustic</AttackVessel></Interceptor><Interceptor><AttackCoords>-80.32020594311533,25.02156920297054</AttackCoords><Outcome>Interdiction</Outcome><Infiltrators>6</Infiltrators><Enforcer>Pompano</Enforcer><ActionDate>2013-02-25</ActionDate><ActionTime>15:35:00</ActionTime><RecordNotes></RecordNotes><NumEscaped>0</NumEscaped><LaunchCoords></LaunchCoords><AttackVessel>Raft</AttackVessel></Interceptor><Interceptor><AttackCoords>-80.15149489716094,24.57412215015249</AttackCoords><Outcome>Interdiction</Outcome><Infiltrators>6</Infiltrators><Enforcer>Tripoteur</Enforcer><ActionDate>2013-04-13</ActionDate><ActionTime>15:40:00</ActionTime><RecordNotes></RecordNotes><NumEscaped>0</NumEscaped><LaunchCoords>-79.65999190070923,23.73619147168514</LaunchCoords><AttackVessel>Raft</AttackVessel></Interceptor></dataroot>

I hope someone can help me understand how to proceed here.

EDIT:

In lab 4 the input data was almost the same; the only difference is that in lab 6 it has no line breaks whatsoever. Here is my props.conf from lab 4:

[dreamcrusher]
DATETIME_CONFIG = 
FIELD_HEADER_REGEX = <Interceptor>
LINE_BREAKER = \<Interceptor\>
MAX_DAYS_AGO = 4000
NO_BINARY_CHECK = true
category = Custom
disabled = false
pulldown_type = true
REPORT-actiondate = actiondate
EVAL-_time = strptime(ActionDate +" " + ActionTime,"%Y-%m-%d %H:%M:%S")

and my transforms.conf:

#[actiondate]
#REGEX = \<ActionDate\>(?P<ActionDate>\d+-\d+-\d+)\<\/ActionDate\>\s*\<ActionTime\>(?P<ActionTime>\d+:\d+:\d+)
#FORMAT = $1::$2

 


avoelk
Communicator

Alright, so apparently the GUI is sometimes buggy when you try to change a sourcetype. To do more than just the line break, especially deleting the header, I did this:

Since it's one huge single-line event with no line breaks, FIELD_HEADER_REGEX doesn't work here. What I did was:

props.conf

TRANSFORMS-t1 = extraction

transforms.conf

[extraction]
REGEX = \<\?xml\sversion="\d.\d"\sencoding="UTF-8"\s\?\>\<dataroot\>
DEST_KEY = queue
FORMAT = nullQueue

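This matches the XML declaration and the opening <dataroot> tag that precede the first actual event, so that header event is sent to the nullQueue. Since <dataroot> appears at the beginning AND the end, my understanding was that this removes both.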
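
One caveat on the transform above, as a minimal Python sketch: a nullQueue transform drops every event the REGEX matches as a whole; it does not cut text out of an event. The route helper and the three sample events are hypothetical, and they assume the line break falls right before each <Interceptor>.

```python
import re

# The REGEX from the [extraction] stanza, minus the optional escapes on < and >.
HEADER = re.compile(r'<\?xml\sversion="\d.\d"\sencoding="UTF-8"\s\?><dataroot>')

def route(events):
    """Keep events that do not match HEADER; matching events are dropped whole."""
    return [e for e in events if not HEADER.search(e)]

# Hypothetical events, assuming the break falls right before each <Interceptor>.
events = [
    '<?xml version="1.0" encoding="UTF-8" ?><dataroot>',
    '<Interceptor><Outcome>Interdiction</Outcome></Interceptor>',
    '<Interceptor><Outcome>Interdiction</Outcome></Interceptor></dataroot>',
]
kept = route(events)

for event in kept:
    print(event)
```

In this sketch the closing </dataroot> stays on the last event, because the regex needs the <?xml ... ?> prefix to match. If Splunk shows the same thing, a SEDCMD in props.conf might be one way to strip the tail, though I haven't verified that for this lab. Also note that if the data is still one huge event, that single event matches and everything would be dropped.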

To extract the necessary ActionDate and ActionTime and combine them into a new timestamp, I did the following:

props.conf

REPORT-actiondate = actiondate
EVAL-_time = strptime(ActionDate +" " + ActionTime,"%Y-%m-%d %H:%M:%S")

transforms.conf

[actiondate]
REGEX = \<ActionDate\>(?P<ActionDate>\d+-\d+-\d+)\<\/ActionDate\>\s*\<ActionTime\>(?P<ActionTime>\d+:\d+:\d+)
FORMAT = $1::$2
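
To sanity-check the regex and the time format outside Splunk, here is a minimal Python sketch of the same two steps: pull ActionDate and ActionTime out of one event, then parse them with the same format string as the EVAL. The event_time helper is hypothetical, and Python returns a datetime object here, whereas Splunk's strptime returns epoch seconds.

```python
import re
from datetime import datetime

# Same pattern as the [actiondate] REGEX, with the optional escapes removed.
ACTION = re.compile(
    r'<ActionDate>(?P<ActionDate>\d+-\d+-\d+)</ActionDate>\s*'
    r'<ActionTime>(?P<ActionTime>\d+:\d+:\d+)'
)

def event_time(event):
    """Return the datetime built from ActionDate and ActionTime, or None."""
    m = ACTION.search(event)
    if m is None:
        return None
    stamp = m.group('ActionDate') + ' ' + m.group('ActionTime')
    return datetime.strptime(stamp, '%Y-%m-%d %H:%M:%S')

# First record from the sample data, trimmed to the two time fields.
event = ('<Interceptor><ActionDate>2013-04-24</ActionDate>'
         '<ActionTime>00:07:00</ActionTime></Interceptor>')
ts = event_time(event)
print(ts)
```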

 

I still don't understand the part "broken out using SPATH notation", so I figured I'd do it with, well, spath via SPL:

index=myindex sourcetype=mysourcetype |spath input=_raw path=

 

That did it for me.
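
My guess is that "SPATH notation" refers to dotted field names such as Interceptor.ActionDate. As a rough illustration of that idea outside Splunk, here is a minimal Python sketch that turns one event into path-style field names. The extract_fields helper is hypothetical and only handles one level of nesting, which is all this data has; it is not how spath is implemented.

```python
import xml.etree.ElementTree as ET

def extract_fields(event):
    """Map each child tag of one <Interceptor> event to its text ('' if empty)."""
    root = ET.fromstring(event)
    return {f'{root.tag}.{child.tag}': (child.text or '') for child in root}

# Second record from the sample data, shortened to four fields.
event = ('<Interceptor><Outcome>Interdiction</Outcome>'
         '<Infiltrators>6</Infiltrators><Enforcer>Cunningham</Enforcer>'
         '<LaunchCoords></LaunchCoords></Interceptor>')
fields = extract_fields(event)

for name, value in fields.items():
    print(name, '=', value)
```

As far as I know, KV_MODE = xml in props.conf is the setting for automatic search-time XML extraction, so that may be what the lab means by "auto extract"; I haven't confirmed this against the lab solution.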

 

