Splunk Search

How to break a single event into multiple?

pench2k19
Explorer

Hi team,

I have the following as a single event in Splunk.

)V 2019-03-11 msp raw  utility_extract13L hdfs:/datalake/consumer/msp/raw/tmp/MSP_DELTA_PR936_UTILITY_EXTRACT13_190311" consumer_msp_raw.utility_extract13 utility_extract13_DELTA_9362019-03-12 06:10:21.9803272019-03-12 06:26:09.014586: warning - 35% data volume threshold reached, expected 2000 )V 2019-03-11 msp raw  utility_extract13L hdfs:/datalake/consumer/msp/raw/tmp/MSP_DELTA_PR936_UTILITY_EXTRACT13_190311" consumer_msp_raw.utility_extract13 utility_extract13_DELTA_9362019-03-12 06:10:21.9803272019-03-12 06:26:09.014586 success )V 2019-03-11 msp sanitized  utility_extract13L hdfs:/datalake/consumer/msp/raw/tmp/MSP_DELTA_PR936_UTILITY_EXTRACT13_190311( consumer_msp_sanitized.utility_extract13 utility_extract13_DELTA_9362019-03-12 06:10:21.9803272019-03-12 06:26:09.014586: warning - 35% data volume threshold reached, expected 2000 

I want to break it into three separate events, as follows:

)V 2019-03-11 msp raw  utility_extract13L hdfs:/datalake/consumer/msp/raw/tmp/MSP_DELTA_PR936_UTILITY_EXTRACT13_190311" consumer_msp_raw.utility_extract13 utility_extract13_DELTA_9362019-03-12 06:10:21.9803272019-03-12 06:26:09.014586: warning - 35% data volume threshold reached, expected 2000    

)V 2019-03-11 msp raw  utility_extract13L hdfs:/datalake/consumer/msp/raw/tmp/MSP_DELTA_PR936_UTILITY_EXTRACT13_190311" consumer_msp_raw.utility_extract13 utility_extract13_DELTA_9362019-03-12 06:10:21.9803272019-03-12 06:26:09.014586 success   

)V 2019-03-11 msp sanitized  utility_extract13L hdfs:/datalake/consumer/msp/raw/tmp/MSP_DELTA_PR936_UTILITY_EXTRACT13_190311( consumer_msp_sanitized.utility_extract13 utility_extract13_DELTA_9362019-03-12 06:10:21.9803272019-03-12 06:26:09.014586: warning - 35% data volume threshold reached, expected 2000 

Can you please suggest a way to accomplish this?

@jkat54


harsmarvania57
Ultra Champion

Hi @pench2k19,

You can use the props.conf below on your indexer or heavy forwarder to break the data at every )V, so that events are split correctly at index time. Unfortunately, the leading date in each event (2019-03-11) has no hour, minute, or second, so I suggest using DATETIME_CONFIG = CURRENT with the props.conf below, which indexes each event with the indexer's current time instead of a timestamp taken from the event. If you would rather use a time from the event itself, such as 2019-03-12 06:10:21.980327, as the event time, let me know and I'll provide a props.conf with the TIME_PREFIX and TIME_FORMAT parameters.

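[yoursourcetype]
SHOULD_LINEMERGE=false
# Empty capture group: break before each ")V " without discarding any text
LINE_BREAKER=()\)V\s
NO_BINARY_CHECK=true
# Use the indexer's current time as the event timestamp
DATETIME_CONFIG=CURRENT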
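
To sanity-check the break pattern outside Splunk, here is a rough Python emulation. It is only a sketch: it mimics the effect of an empty first capture group (cut at the start of each match, discard nothing) and is not how Splunk itself processes events. The names split_events and event_times are hypothetical helpers, and the sample is an abbreviated copy of the event from the question. The second helper pulls out the two fused start/end timestamps, which is the part a TIME_PREFIX / TIME_FORMAT setup would have to target.

```python
import re
from datetime import datetime

# Same regex as the LINE_BREAKER above. The capture group is empty, so
# ")V " stays at the start of each resulting event.
LINE_BREAKER = re.compile(r"()\)V\s")

# Start/end times as they appear in the sample (microsecond precision).
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{6}")


def split_events(raw):
    """Cut raw text at the start of every LINE_BREAKER match."""
    starts = [m.start() for m in LINE_BREAKER.finditer(raw)]
    if not starts or starts[0] != 0:
        starts.insert(0, 0)  # keep any text before the first marker
    bounds = starts + [len(raw)]
    pieces = (raw[a:b].strip() for a, b in zip(bounds, bounds[1:]))
    return [p for p in pieces if p]


def event_times(event):
    """Return the timestamps found in one event as datetime objects."""
    return [datetime.strptime(t, "%Y-%m-%d %H:%M:%S.%f")
            for t in TIMESTAMP.findall(event)]


# Abbreviated copy of the sample event; "[...]" marks text omitted here.
sample = (
    ")V 2019-03-11 msp raw  utility_extract13L [...] "
    "utility_extract13_DELTA_9362019-03-12 06:10:21.9803272019-03-12 06:26:09.014586"
    ": warning - 35% data volume threshold reached, expected 2000 "
    ")V 2019-03-11 msp raw  utility_extract13L [...] "
    "utility_extract13_DELTA_9362019-03-12 06:10:21.9803272019-03-12 06:26:09.014586"
    " success "
    ")V 2019-03-11 msp sanitized  utility_extract13L [...] "
    "utility_extract13_DELTA_9362019-03-12 06:10:21.9803272019-03-12 06:26:09.014586"
    ": warning - 35% data volume threshold reached, expected 2000 "
)

events = split_events(sample)
for number, event in enumerate(events, start=1):
    print(number, event_times(event), event[:40])
```

On this sample the pattern only matches at the three )V markers, so the text splits into three events. Whether the same happens in Splunk depends on the bytes that actually reach the indexer.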

pench2k19
Explorer

Hi @harsmarvania57,

Thank you so much for the detailed information.

The data is actually in the following format:

\x00\x00\xFFV\x00\x00\x00\x00\x00\x002019-03-12\x00\x00\x00msp\x00\x00\x00raw\x00\x00\x00utility_extract13L\x00\x00\x00hdfs:/datalake/consumer/msp/raw/tmp/MSP_DELTA_PR936_UTILITY_EXTRACT13_190312"\x00\x00\x00consumer_msp_raw.utility_extract13\x00\x00\x00utility_extract13_DELTA_9362019-03-13 06:08:52.6833482019-03-13 06:24:40.252295:\x00\x00\x00warning - 35% data volume threshold reached, expected 2000\x00\x00\xFFV\x00\x00\x00\x00\x00\x002019-03-12\x00\x00\x00msp\x00\x00\x00raw\x00\x00\x00utility_extract13L\x00\x00\x00hdfs:/datalake/consumer/msp/raw/tmp/MSP_DELTA_PR936_UTILITY_EXTRACT13_190312"\x00\x00\x00consumer_msp_raw.utility_extract13\x00\x00\x00utility_extract13_DELTA_9362019-03-13 06:08:52.6833482019-03-13 06:24:40.252295\x00\x00\x00success\x00\x00\xFFV\x00\x00\x00\x00\x00\x002019-03-12\x00\x00\x00msp \x00\x00\x00sanitized\x00\x00\x00utility_extract13L\x00\x00\x00hdfs:/datalake/consumer/msp/raw/tmp/MSP_DELTA_PR936_UTILITY_EXTRACT13_190312(\x00\x00\x00consumer_msp_sanitized.utility_extract13\x00\x00\x00utility_extract13_DELTA_9362019-03-13 06:08:52.6833482019-03-13 06:24:40.252295:\x00\x00\x00warning - 35% data volume threshold reached, expected 2000

When I apply rex mode=sed "s/\\x00/ /g", it comes out like this:

)V 2019-03-11 msp raw utility_extract13L hdfs:/datalake/consumer/msp/raw/tmp/MSP_DELTA_PR936_UTILITY_EXTRACT13_190311" consumer_msp_raw.utility_extract13 utility_extract13_DELTA_9362019-03-12 06:10:21.9803272019-03-12 06:26:09.014586: warning - 35% data volume threshold reached, expected 2000 )V 2019-03-11 msp raw utility_extract13L hdfs:/datalake/consumer/msp/raw/tmp/MSP_DELTA_PR936_UTILITY_EXTRACT13_190311" consumer_msp_raw.utility_extract13 utility_extract13_DELTA_9362019-03-12 06:10:21.9803272019-03-12 06:26:09.014586 success )V 2019-03-11 msp sanitized utility_extract13L hdfs:/datalake/consumer/msp/raw/tmp/MSP_DELTA_PR936_UTILITY_EXTRACT13_190311( consumer_msp_sanitized.utility_extract13 utility_extract13_DELTA_9362019-03-12 06:10:21.9803272019-03-12 06:26:09.014586: warning - 35% data volume threshold reached, expected 2000

I'm not sure how to get the data into the proper format. Do you have any better ideas?
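
One thing worth checking in the raw sample above, offered as an observation rather than anything confirmed in this thread: the stray characters L, ", ( and : each sit directly before a run of \x00 bytes, and their byte values appear to equal the length of the field that follows. The Python sketch below tests that for the four fields where the character is visible in the paste; looks_like_length_prefix is a hypothetical helper name.

```python
# Each pair is (character seen just before a run of \x00 bytes, the field
# that follows it), copied from the raw sample above.
observed = [
    ("L", "hdfs:/datalake/consumer/msp/raw/tmp/"
          "MSP_DELTA_PR936_UTILITY_EXTRACT13_190312"),
    ('"', "consumer_msp_raw.utility_extract13"),
    ("(", "consumer_msp_sanitized.utility_extract13"),
    (":", "warning - 35% data volume threshold reached, expected 2000"),
]


def looks_like_length_prefix(marker, field):
    """True if the marker's byte value equals the field's length."""
    return ord(marker) == len(field)


for marker, field in observed:
    print(f"{marker!r}: byte value {ord(marker)}, field length {len(field)}, "
          f"match = {looks_like_length_prefix(marker, field)}")
```

If this pattern holds, it would be consistent with a binary format that stores a 4-byte little-endian length before each string, in which case the odd characters are length bytes rather than punctuation. Shorter fields such as msp or success would carry non-printable length bytes that do not show up in the paste, so the sample alone cannot settle this.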


pench2k19
Explorer

@harsmarvania57 any thoughts?


harsmarvania57
Ultra Champion

Not yet. I was trying to reproduce the output you provided in my lab, but when I ingest the data with SEDCMD-null = s/\\x00//g in props.conf, it still keeps \xFF. I am not sure how it is being converted to ) in your case.


nickhills
Ultra Champion

This looks like you have some bad event breaking going on, not to mention some unpaired "s.
Are you able to post the relevant stanza configuration from props.conf?

If my comment helps, please give it a thumbs up!