Hi,
I want to extract fields from the XML sample below using REGEX and FORMAT settings, without using KV_MODE=xml.
I have tried several regexes to parse the fields, but none of them work.
Please find the sample log below for reference.
<Interceptor>
<AttackCoords>-80.03107887624853,25.351308629611</AttackCoords>
<Outcome>Interdiction</Outcome>
<Infiltrators>6</Infiltrators>
<Enforcer>Assured</Enforcer>
<ActionDate>2013-11-03</ActionDate>
<ActionTime>04:40:00</ActionTime>
<RecordNotes>Infiltrators:
Savanna Carrera,
Gregoria Farías,
Julina Abeyta,
Mariquita Alonso,
Urbano Briseño,
Victoro Montano </RecordNotes>
<NumEscaped>3</NumEscaped>
<LaunchCoords></LaunchCoords>
<AttackVessel>Raft</AttackVessel>
</Interceptor>
<Interceptor>
<AttackCoords>-80.33045250710296,24.93574264936793</AttackCoords>
<Outcome>Interdiction</Outcome>
<Infiltrators>9</Infiltrators>
<Enforcer>Pompano</Enforcer>
<ActionDate>2013-05-04</ActionDate>
<ActionTime>04:22:00</ActionTime>
<RecordNotes></RecordNotes>
<NumEscaped>0</NumEscaped>
<LaunchCoords>-80.30497342463124,24.07890526980327</LaunchCoords>
<AttackVessel>Rustic</AttackVessel>
</Interceptor>
<Interceptor>
<AttackCoords>-79.94720757796837,24.82172611548247</AttackCoords>
<Outcome>Interdiction</Outcome>
<Infiltrators>12</Infiltrators>
<Enforcer>Barracuda</Enforcer>
<ActionDate>2013-01-01</ActionDate>
<ActionTime>05:22:00</ActionTime>
<RecordNotes>Infiltrators:
Cristian Caballero,
Vicenta Olivares,
Leonides Cintrón,
Ascencion Betancourt,
Alanzo Arenas,
Primeiro Sánchez,
Serena Monroy,
Madina Mojica,
Consolacion Cordero,
Faqueza Serrano,
Grazia Quesada,
Ivette Partida </RecordNotes>
<NumEscaped>0</NumEscaped>
<LaunchCoords></LaunchCoords>
<AttackVessel>Rustic</AttackVessel>
</Interceptor>
Props.conf
[dreamcrusher]
LINE_BREAKER = (\<Interceptor\>)
TIME_PREFIX = <ActionDate>
TIME_FORMAT = %Y-%m-%d<\/ActionDate>[\r\n]\t+<ActionTime>%H:%M:%S
SHOULD_LINEMERGE = false
MAX_DAYS_AGO = 2500
SEDCMD-aremoveheader = s/\<\?xml.*\s*\<dataroot\>\s*//g
SEDCMD-bremovefooter = s/\<\/dataroot\>//g
REPORT-f = dream_attack
KV_MODE = none
transforms.conf
[dream_attack]
REGEX = (?m)^[^<]+.(.*?)\>([\S\s]*?)\<(?=[^\s])
FORMAT = $1::$2
Can you suggest why this is failing?
Thanks
Use this transforms.conf instead
[dream_attack]
REGEX = \>\s+\<([^\>]+)\>([^\<]+)\<
FORMAT = $1::$2
REPEAT_MATCH = true
WRITE_META = true
Hello there,
Try adding "| spath" at the end of your search.
Hi nittala_surya,
I still get the same error.
Search string used:
index=* sourcetype="dreamcrusher" | rex field=_raw "^\s*<([^>])>([^<\/])" | spath
Error string:
Error in 'rex' command: The regex '^\s*<([^>])>([^<\/])' does not extract anything. It should specify at least one named group. Format: (?<name>...).
The search job has failed due to an error. You may be able view the job in the Job Inspector.
Get rid of rex:
index=* sourcetype="dreamcrusher" | spath
You can find more info about spath in the Splunk documentation.
On a side note: your regex doesn't have a named capturing group, hence the error.
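To illustrate the side note about named groups, here is a minimal sketch in Python rather than SPL. It assumes Python's re module as a stand-in for the regex engine behind rex, and the one-line event is cut from the sample log. Note that Python spells a named group (?P<name>...), while rex expects (?<name>...), for example rex "<Outcome>(?<Outcome>[^<]+)".

```python
import re

# Two lines taken from the sample log in the question.
event = "<Outcome>Interdiction</Outcome>\n<Infiltrators>6</Infiltrators>"

# Unnamed group: the text is captured, but nothing says what to call it.
unnamed = re.search(r"<Outcome>([^<]+)", event)

# Named group: the group name plays the role of the field name.
named = re.search(r"<Outcome>(?P<Outcome>[^<]+)", event)

print(unnamed.groupdict())  # {}
print(named.groupdict())    # {'Outcome': 'Interdiction'}
```

The empty dictionary in the first case is the Python counterpart of the "does not extract anything" complaint: there is a capture, but no name to attach it to.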
Thanks nittala_surya,
It worked 🙂
However, just for my knowledge: is it mandatory to use "| spath" to extract the fields when we are already using a transform (REGEX and FORMAT) in the configuration files? Or should props and transforms alone be enough to parse the _raw events? Please suggest.
No. spath works only for search-time field extractions. To use props and transforms, the settings in your configuration files need to be adjusted a little.
Give this a try:
Props.conf:
[dreamcrusher]
## Optional: your current LINE_BREAKER discards <Interceptor> from your events. To keep it, use the setting below
LINE_BREAKER = ([\r\n])\<Interceptor\>
## Escape angle brackets in TIME_PREFIX
TIME_PREFIX = \<ActionDate\>
## TIME_FORMAT takes strptime directives, not regex syntax such as [\r\n]\t+, so use this instead
TIME_FORMAT = %Y-%m-%d</ActionDate>%n<ActionTime>%H:%M:%S
SHOULD_LINEMERGE = false
## Use this to improve efficiency while extracting timestamps
MAX_TIMESTAMP_LOOKAHEAD = 50
MAX_DAYS_AGO = 2500
SEDCMD-aremoveheader = s/\<\?xml.*\s*\<dataroot\>\s*//g
SEDCMD-bremovefooter = s/\<\/dataroot\>//g
REPORT-f = dream_attack
KV_MODE = none
Transforms.conf:
[dream_attack]
REGEX = (?m)^[^<]+\<+(.*?)\>([\S\s]*?)\<(?=[^\s])
FORMAT = $1::$2
MV_ADD = true
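If it helps to see what the TIME_PREFIX / TIME_FORMAT pair above has to cover, here is a rough sketch in Python. It is only an analogy: it assumes Python's datetime.strptime behaves closely enough to Splunk's timestamp parser for this purpose, it assumes the lines are separated by a single newline with no indentation, and because Python has no %n directive it uses a space in the format, which Python treats as "any run of whitespace".

```python
import re
from datetime import datetime

# Trimmed copy of the first sample event (single "\n" line endings assumed).
event = (
    "<Interceptor>\n"
    "<Enforcer>Assured</Enforcer>\n"
    "<ActionDate>2013-11-03</ActionDate>\n"
    "<ActionTime>04:40:00</ActionTime>\n"
    "</Interceptor>"
)

# Stand-in for TIME_PREFIX: locate the point where the timestamp text begins.
prefix = re.search(r"<ActionDate>", event)

# Everything from the date value through the end of the time value.
rest = event[prefix.end():]
stamp = re.match(r"[^<]+</ActionDate>\s*<ActionTime>[^<]+", rest).group(0)

# The literal tag text sits inside the format, between the date and time parts.
parsed = datetime.strptime(stamp, "%Y-%m-%d</ActionDate> <ActionTime>%H:%M:%S")

print(len(stamp))          # characters the parser has to read after the prefix
print(parsed.isoformat())  # 2013-11-03T04:40:00
```

The length printed first is the span that a lookahead limit such as MAX_TIMESTAMP_LOOKAHEAD has to be large enough to cover; with Windows line endings or tab indentation the real span would be a little longer.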
Use this regex instead
REGEX = ^\s*\<([^\>]*)\>([^\<\/]*)
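One thing worth checking with this pattern is the leading ^. The sketch below uses Python's re module as a stand-in for Splunk's regex engine, which is an assumption, and a shortened event built from the sample. Without a multiline flag, ^ only matches at the very start of the event, so only the wrapper tag is found.

```python
import re

# Shortened event built from the sample log.
event = (
    "<Interceptor>\n"
    "<Outcome>Interdiction</Outcome>\n"
    "<Infiltrators>6</Infiltrators>\n"
    "</Interceptor>"
)

pattern = r"^\s*\<([^\>]*)\>([^\<\/]*)"

# Default mode: ^ matches only at the start of the whole event.
anchored = re.findall(pattern, event)

# Multiline mode, the equivalent of a leading (?m): ^ matches at every line start.
per_line = re.findall(pattern, event, flags=re.MULTILINE)

print(anchored)
print([name for name, _ in per_line])
```

In default mode the only pair is Interceptor with a bare newline as its value. In multiline mode every opening tag is picked up, but so is the closing </Interceptor> line, which comes out as a field named "/Interceptor" with an empty value, so the name group would still need tightening (for example to \w+).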
Thanks, I tried it, but no fields were extracted.
For your information, I am using Splunk Enterprise on Windows 10.
Can you try using it in a search?
your index|rex "^\s*<([^>])>([^<\/])"
index=* sourcetype="dream" | rex field=_raw "^\s*<([^>])>([^<\/])"
I get the error below in the search:
Error in 'rex' command: The regex '^\s*<([^>])>([^<\/])' does not extract anything. It should specify at least one named group. Format: (?<name>...).
The search job has failed due to an error. You may be able view the job in the Job Inspector.
What is failing? Extracting all the fields? Extracting the fields with multiple values (e.g. RecordNotes)?
extracting all the fields using multivalues
[REPORT-dreamcrusher_extractions]
REGEX = <(\w+)>([^<]+)
FORMAT = $1::$2
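Before putting this into transforms.conf, the regex can be tried outside Splunk. The sketch below is a hedged check in Python: it assumes Python's re module is close enough to Splunk's engine for this pattern and that the pattern is applied repeatedly across the event. The event is a trimmed copy of the first sample record, and the strip/skip step is my own addition for readability, not something Splunk does.

```python
import re

# Trimmed copy of the first sample event (RecordNotes shortened to two names).
event = """<Interceptor>
<AttackCoords>-80.03107887624853,25.351308629611</AttackCoords>
<Outcome>Interdiction</Outcome>
<RecordNotes>Infiltrators:
Savanna Carrera,
Victoro Montano </RecordNotes>
<LaunchCoords></LaunchCoords>
<AttackVessel>Raft</AttackVessel>
</Interceptor>"""

fields = {}
for name, value in re.findall(r"<(\w+)>([^<]+)", event):
    value = value.strip()  # illustrative cleanup only
    if value:              # skips the wrapper tag, whose "value" is just a newline
        fields[name] = value

for name, value in fields.items():
    print(f"{name}::{value!r}")
```

Two things stand out. The multi-line RecordNotes value is captured whole, because [^<]+ happily crosses line breaks. Empty tags such as <LaunchCoords></LaunchCoords> produce no pair at all, because [^<]+ needs at least one character, and closing tags never match since \w+ does not allow the slash.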