Hi Team,
I’m stuck parsing some data; please advise how to handle it.
In one of our application logs, a particular field contains a huge amount of data, so its value is broken across multiple lines, and Splunk is treating every new line as a separate event.
Sample data:
Please have a look at the attached image for the sample data; it contains tags I'm not able to post here.
The data is comma-separated, and its field names are:
AUD_SEQ_NO PACKAGE_NAME SERVICE_NAME AUDIT_TIME EVENT_NAME SHORT_TEXT LONG_TEXT AUDIT_DATA CONSUMER_ID MESSAGE_ID CONTEXT_ID USER_NAME USER_ID USER_CONTEXT COMPANY_ID VERSION SESSION_ID CHANNEL_ID BUSINESSUNIT_ID SERVER_NAME
I have added a snap for better understanding.
Generally speaking, you need to define the event boundaries (event beginning and end) and date/time configurations in your props.conf file. Since this is not a single-line event, you may need to use the more expensive (in processing) line break definitions such as "BREAK_ONLY_BEFORE" or "MUST_BREAK_AFTER" along with the regex required to identify the event boundaries. You also need to define the date/time location in your data (TIME_PREFIX and/or MAX_TIMESTAMP_LOOKAHEAD), and then the date/time format (TIME_FORMAT). You may also need to raise the TRUNCATE or MAX_EVENTS defaults if the event can be large (the defaults are 10000 bytes for TRUNCATE and 256 lines for MAX_EVENTS).
Going through the props.conf specification in the Admin guide should help you along this process.
http://docs.splunk.com/Documentation/Splunk/7.1.1/Admin/Propsconf
This is just a starting point, not much more can be done without a data sample.
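As a rough sketch only (the stanza name, break regex, and timestamp format below are assumptions based on your field list, not your actual data), a stanza combining the settings above might look like:

```ini
# Hypothetical sourcetype name - replace with your own
[audit_csv]
SHOULD_LINEMERGE = true
# Assumes each event begins with the numeric AUD_SEQ_NO followed by a comma
BREAK_ONLY_BEFORE = ^\d+,
# Skip the first three fields (AUD_SEQ_NO, PACKAGE_NAME, SERVICE_NAME)
# to reach AUDIT_TIME
TIME_PREFIX = ^\d+,[^,]+,[^,]+,
# Guess at the timestamp format - adjust to match your data
TIME_FORMAT = %Y-%m-%d %H:%M:%S
MAX_TIMESTAMP_LOOKAHEAD = 30
# Raised from the 10000-byte / 256-line defaults for large events
TRUNCATE = 100000
MAX_EVENTS = 1000
```

Every value here would need to be verified against a real data sample before use.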
You need to adjust the props.conf settings for that sourcetype so multiple lines are treated as a single event. Post the relevant props.conf stanza and we can help you make the necessary changes.
Hi Richgalloway,
Thanks for the effort 🙂 Can you please help me understand the changes I need to make in the props.conf file?
Regards,
Vigneshprasanna R
@Vigneshprasanna, as stated by @richgalloway, the configuration depends strictly on your data pattern (e.g., which pattern identifies the timestamp, what the timestamp format is, which pattern marks event breaks, etc.).
So, in order to assist you better, please provide us with sample data (mask/anonymize any sensitive information before posting). You should also provide us with your existing props.conf.
Here are a couple of resources that should help you figure out the correct props.conf settings:
https://docs.splunk.com/Documentation/Splunk/latest/Data/Configuretimestamprecognition
http://docs.splunk.com/Documentation/Splunk/latest/Data/Configureeventlinebreaking
You should also take advantage of Data Preview mode: add a sample file there and apply/test your props.conf settings for timestamp recognition and event breaks before indexing the data. Once data is indexed incorrectly, it cannot be (easily) corrected.
https://docs.splunk.com/Documentation/Splunk/latest/Data/Setsourcetype
Hi,
You can define the line breaker while manually indexing data from a file. There you can define the identifier for a new event using a regex.
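As an alternative to line merging, a LINE_BREAKER-based approach (generally cheaper at index time) might look like the following sketch; the lookahead regex assumes each event starts with the numeric AUD_SEQ_NO followed by a comma, which you would need to confirm against your data:

```ini
# Hypothetical sourcetype name - replace with your own
[audit_csv]
SHOULD_LINEMERGE = false
# Break on newlines only when the next line looks like the start of a new
# record (digits then a comma); the lookahead keeps the digits in the event
LINE_BREAKER = ([\r\n]+)(?=\d+,)
```

Because SHOULD_LINEMERGE is false, the timestamp settings (TIME_PREFIX, TIME_FORMAT) from the earlier answer would still be needed alongside this.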