Splunk Search

How to break a single event into multiple events?

pench2k19
Explorer

Hi team,

I have the following as a single event in Splunk.

)V 2019-03-11 msp raw  utility_extract13L hdfs:/datalake/consumer/msp/raw/tmp/MSP_DELTA_PR936_UTILITY_EXTRACT13_190311" consumer_msp_raw.utility_extract13 utility_extract13_DELTA_9362019-03-12 06:10:21.9803272019-03-12 06:26:09.014586: warning - 35% data volume threshold reached, expected 2000 )V 2019-03-11 msp raw  utility_extract13L hdfs:/datalake/consumer/msp/raw/tmp/MSP_DELTA_PR936_UTILITY_EXTRACT13_190311" consumer_msp_raw.utility_extract13 utility_extract13_DELTA_9362019-03-12 06:10:21.9803272019-03-12 06:26:09.014586 success )V 2019-03-11 msp sanitized  utility_extract13L hdfs:/datalake/consumer/msp/raw/tmp/MSP_DELTA_PR936_UTILITY_EXTRACT13_190311( consumer_msp_sanitized.utility_extract13 utility_extract13_DELTA_9362019-03-12 06:10:21.9803272019-03-12 06:26:09.014586: warning - 35% data volume threshold reached, expected 2000 

I want to break it into three different events, as follows:

)V 2019-03-11 msp raw  utility_extract13L hdfs:/datalake/consumer/msp/raw/tmp/MSP_DELTA_PR936_UTILITY_EXTRACT13_190311" consumer_msp_raw.utility_extract13 utility_extract13_DELTA_9362019-03-12 06:10:21.9803272019-03-12 06:26:09.014586: warning - 35% data volume threshold reached, expected 2000    

)V 2019-03-11 msp raw  utility_extract13L hdfs:/datalake/consumer/msp/raw/tmp/MSP_DELTA_PR936_UTILITY_EXTRACT13_190311" consumer_msp_raw.utility_extract13 utility_extract13_DELTA_9362019-03-12 06:10:21.9803272019-03-12 06:26:09.014586 success   

)V 2019-03-11 msp sanitized  utility_extract13L hdfs:/datalake/consumer/msp/raw/tmp/MSP_DELTA_PR936_UTILITY_EXTRACT13_190311( consumer_msp_sanitized.utility_extract13 utility_extract13_DELTA_9362019-03-12 06:10:21.9803272019-03-12 06:26:09.014586: warning - 35% data volume threshold reached, expected 2000 

Can you please suggest a way to accomplish this?

@jkat54

harsmarvania57
Ultra Champion

Hi @pench2k19,

You can use the props.conf below to break the data at every )V on your Indexer/Heavy Forwarder, so the data will break properly at index time. Unfortunately, the events only contain a date like 2019-03-11 with no hour, minute, or second, so I suggest using DATETIME_CONFIG = CURRENT in the props.conf below to index events with the Indexer's server time instead of a timestamp from the actual events. If you want to use a time from the events, such as 2019-03-12 06:10:21.980327, as the event time, let us know and I'll provide a props.conf with the TIME_PREFIX and TIME_FORMAT parameters.

[yoursourcetype]
SHOULD_LINEMERGE=false
LINE_BREAKER=()\)V\s
NO_BINARY_CHECK=true
DATETIME_CONFIG=CURRENT
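
If you later prefer to use the timestamp embedded in each event (e.g. 2019-03-12 06:10:21.980327) as the event time, the stanza might look roughly like the sketch below. The TIME_PREFIX regex is only a guess from your sample (it assumes the id after _DELTA_ is always three digits) and would need validating against the real feed.

[yoursourcetype]
SHOULD_LINEMERGE=false
LINE_BREAKER=()\)V\s
NO_BINARY_CHECK=true
# Guess: the event time starts right after the _DELTA_<3-digit id> token
TIME_PREFIX=_DELTA_\d{3}
TIME_FORMAT=%Y-%m-%d %H:%M:%S.%6N
MAX_TIMESTAMP_LOOKAHEAD=30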

pench2k19
Explorer

Hi @harsmarvania57,

Thank you so much for the detailed information.

The data is actually in a format like the one below:

\x00\x00\xFFV\x00\x00\x00\x00\x00\x002019-03-12\x00\x00\x00msp\x00\x00\x00raw\x00\x00\x00utility_extract13L\x00\x00\x00hdfs:/datalake/consumer/msp/raw/tmp/MSP_DELTA_PR936_UTILITY_EXTRACT13_190312"\x00\x00\x00consumer_msp_raw.utility_extract13\x00\x00\x00utility_extract13_DELTA_9362019-03-13 06:08:52.6833482019-03-13 06:24:40.252295:\x00\x00\x00warning - 35% data volume threshold reached, expected 2000\x00\x00\xFFV\x00\x00\x00\x00\x00\x002019-03-12\x00\x00\x00msp\x00\x00\x00raw\x00\x00\x00utility_extract13L\x00\x00\x00hdfs:/datalake/consumer/msp/raw/tmp/MSP_DELTA_PR936_UTILITY_EXTRACT13_190312"\x00\x00\x00consumer_msp_raw.utility_extract13\x00\x00\x00utility_extract13_DELTA_9362019-03-13 06:08:52.6833482019-03-13 06:24:40.252295\x00\x00\x00success\x00\x00\xFFV\x00\x00\x00\x00\x00\x002019-03-12\x00\x00\x00msp \x00\x00\x00sanitized\x00\x00\x00utility_extract13L\x00\x00\x00hdfs:/datalake/consumer/msp/raw/tmp/MSP_DELTA_PR936_UTILITY_EXTRACT13_190312(\x00\x00\x00consumer_msp_sanitized.utility_extract13\x00\x00\x00utility_extract13_DELTA_9362019-03-13 06:08:52.6833482019-03-13 06:24:40.252295:\x00\x00\x00warning - 35% data volume threshold reached, expected 2000

When I apply rex mode=sed "s/\\x00/ /g", the result looks like this:

)V 2019-03-11 msp raw utility_extract13L hdfs:/datalake/consumer/msp/raw/tmp/MSP_DELTA_PR936_UTILITY_EXTRACT13_190311" consumer_msp_raw.utility_extract13 utility_extract13_DELTA_9362019-03-12 06:10:21.9803272019-03-12 06:26:09.014586: warning - 35% data volume threshold reached, expected 2000 )V 2019-03-11 msp raw utility_extract13L hdfs:/datalake/consumer/msp/raw/tmp/MSP_DELTA_PR936_UTILITY_EXTRACT13_190311" consumer_msp_raw.utility_extract13 utility_extract13_DELTA_9362019-03-12 06:10:21.9803272019-03-12 06:26:09.014586 success )V 2019-03-11 msp sanitized utility_extract13L hdfs:/datalake/consumer/msp/raw/tmp/MSP_DELTA_PR936_UTILITY_EXTRACT13_190311( consumer_msp_sanitized.utility_extract13 utility_extract13_DELTA_9362019-03-12 06:10:21.9803272019-03-12 06:26:09.014586: warning - 35% data volume threshold reached, expected 2000
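
For reference, the substitution in context looks roughly like this (index and sourcetype names are placeholders). The makemv/mvexpand lines are only an untested sketch of how the merged event might be split into separate rows at search time:

index=your_index sourcetype=yoursourcetype
| rex mode=sed "s/\\x00/ /g"
| makemv delim=")V" _raw
| mvexpand _raw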

I'm not sure how to get the data into the proper format. Do you have any better thoughts?

pench2k19
Explorer

@harsmarvania57 any thoughts?

harsmarvania57
Ultra Champion

Not yet. I was trying to reproduce the output you provided in my lab, but when I ingest the data with SEDCMD-null = s/\\x00//g in props.conf, it still keeps the \xFF bytes. I am not sure how they are being converted to ) in your case.
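
For reference, what I tested was roughly the stanza below (sourcetype name is a placeholder); the second SEDCMD is only an untested guess at stripping the remaining 0xFF bytes the same way:

[yoursourcetype]
# Tested: strip the null bytes
SEDCMD-null = s/\\x00//g
# Untested guess: strip the 0xFF bytes as well
SEDCMD-ff = s/\\xFF//g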

nickhills
Ultra Champion

It looks like you have some bad event breaking going on, not to mention some unpaired " characters.
Are you able to post the relevant stanza configuration from props.conf?
