Hi Splunkers,
I'm collecting logs from S3 through a heavy forwarder; the logs are in JSON format. After indexing, I see them in the format below. I want the fields inside the message field to be extracted into individual fields.
{ [-]
@timestamp: 2021-03-08T12:55:42.959Z
@version: 1
host: XX.XXX.XXX.XXX
message: <171>Mar 08 13:09:22 LOGSTASH[-]: {"@version":"1","facility_label":"zyx","program":"CRON","logtype":"syslog-prod","priority":86,"tags":["_grokparsefailure"],"pid":"1234","vmd_name":"abc","host":"XX.XXX.XXX.XXX","severity":6,"facility":10,"beat":{"name":"zxz"},"@timestamp":"","type":"xyz","timestamp":"Mar 8 13:09:22","logsource":"abc","severity_label":"Informational","message":"abc: session closed for user root\n"}
port: 1234
}
I have tried the following transforms config on the HF, and it didn't work:
props.conf
[aws:s3]
TRANSFORMS-xyz = s3-trans
transforms.conf
[s3-trans]
REGEX = [\"|\@](\w+)\":[\s]*([^\,\}]+)
FORMAT = $1::$2
Hi @spl_unker,
Can you please try adding WRITE_META = true?
props.conf
[aws:s3]
TRANSFORMS-xyz = s3-trans
transforms.conf
[s3-trans]
REGEX = [\"|\@](\w+)\":[\s]*([^\,\}]+)
FORMAT = $1::$2
WRITE_META = true
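As a quick sanity check outside Splunk, you can see what that REGEX captures by running it against the sample nested JSON from your question (a Python sketch; only the conf-file escaping differs):

```python
import re

# Same pattern as the REGEX line above, minus the conf-file escaping
pattern = re.compile(r'["|@](\w+)":\s*([^,}]+)')

# A shortened version of the nested JSON inside the message field
sample = '{"@version":"1","facility_label":"zyx","program":"CRON","priority":86}'

pairs = pattern.findall(sample)
for name, value in pairs:
    print(f'{name}::{value}')  # mirrors FORMAT = $1::$2
```

Note that the surrounding quotes are captured as part of each string value, so the indexed field values would include them.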
Hi @scelikok, it didn't work. Fields are not getting extracted.
Hi @scelikok @isoutamo As an alternative, I want to remove the starting characters (marked in red) from the log sample below.
This prefix gets appended to all the logs, and I want to remove it before indexing. Could you please help with the props and transforms config?
Raw Event :
{"message":"<100>Mar 11 15:58:48 XX.XXX.XXX.XXX LOGSTASH[-]: {\"@version\":\"1\",\"facility_label\":\"user-level\",\"program\":\"audispd\",\"logtype\":\"syslog\",\"priority\":14,\"tags\":[\"_grokparsefailure\"],\"vmd_name\":\"abc\",\"host\":\"XX.XXX.XXX.XXX\",\"severity\":6,\"facility\":1,\"Hostname\":\"abc\",\"beat\":{\"name\":\"abc\"},\"@timestamp\":\"2021-03-11T15:58:48.000Z\",\"type\":\"abc\",\"timestamp\":\"Mar 11 15:58:48\",\"logsource\":\"abc\",\"severity_label\":\"Informational\",\"message\":\"node=abc type=SOCKADDR msg=audit(1615478328.279:1722168): saddr=000000000000000000000000\\n\"}","@timestamp":"2021-03-11T15:50:46.242Z","host":"XX.XXX.XXX.XXX","@version":"1","port":00000}
Hi @spl_unker,
To remove the starting part of the string, use SEDCMD in props.conf:
props.conf
SEDCMD-removingmessagestring = s/{"message":"\<\d+\>//g
I am not sure exactly what your data looks like, so if the one above doesn't work, use the one below.
SEDCMD-string2=s/message:\s+\<\d+\>//g
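If you want to check which expression matches before restarting the HF, the substitutions can be simulated with Python's re module (a sketch; SEDCMD uses sed-style syntax, so the escaping differs slightly):

```python
import re

# Shortened raw event from the question
raw = '{"message":"<100>Mar 11 15:58:48 XX.XXX.XXX.XXX LOGSTASH[-]: {...}"}'

# Equivalent of s/{"message":"\<\d+\>//g
first = re.sub(r'\{"message":"<\d+>', '', raw)

# Equivalent of s/message:\s+\<\d+\>//g
second = re.sub(r'message:\s+<\d+>', '', raw)

print(first)   # prefix stripped; the event now starts at the timestamp
print(second)  # unchanged: the raw event contains message":"<100>, not message: <100>
```

This shows the first expression strips the prefix, while the second never matches this raw format because there is no whitespace after message.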
@Vardhan Your first regex looks good; it captures the intended characters. But after adding it to props.conf on the HF, I'm still seeing the characters; they are not removed.
@spl_unker what about the second regex?
I didn't try the second regex, as it does not capture the intended characters. If you look at the raw log, I just want to parse out and remove the {"message":"<100> prefix from the raw log and start with the timestamp instead.
@spl_unker Can you share a screenshot of the logs in the index and a sample log in text format?
And after placing the first regex, did you restart the HF? And did you wait for new data to come into the index? The props.conf settings only apply to new data, not to the existing data in the index.
@Vardhan There are two things I'm looking at:
1. Auto-extraction of fields at index time. Check the formatted log snapshot; it contains nested JSON. (All the fields inside the message field need to be extracted.)
2. Parse the incoming raw logs and remove the {"message":"<100> prefix at the beginning of each event. Check the raw log snapshot for reference.
If option 1 is not possible, I need help with option 2.
A sample log can be seen in the other comments.
Thanks in advance!
@spl_unker Can you try the second regex I gave, restart the HF, and see how it works for new data?
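Also, for your option 1, the fields inside message can usually be pulled out at search time instead of index time, by isolating the inner JSON with rex and feeding it to spath (a sketch; the index name and the inner_json field name are placeholders, and the LOGSTASH[-]: marker is taken from your samples):

```
index=your_index sourcetype=aws:s3
| rex field=message "LOGSTASH\[-\]: (?<inner_json>\{.+\})"
| spath input=inner_json
```

This avoids index-time transforms entirely, at the cost of doing the extraction on every search.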
Can you paste the _raw event data here? Just open the event -> Event actions -> Show Source.
Raw Event :
{"message":"<171>Mar 11 15:58:48 XX.XXX.XXX.XXX LOGSTASH[-]: {\"@version\":\"1\",\"facility_label\":\"user-level\",\"program\":\"audispd\",\"logtype\":\"syslog\",\"priority\":14,\"tags\":[\"_grokparsefailure\"],\"vmd_name\":\"abc\",\"host\":\"XX.XXX.XXX.XXX\",\"severity\":6,\"facility\":1,\"Hostname\":\"abc\",\"beat\":{\"name\":\"abc\"},\"@timestamp\":\"2021-03-11T15:58:48.000Z\",\"type\":\"abc\",\"timestamp\":\"Mar 11 15:58:48\",\"logsource\":\"abc\",\"severity_label\":\"Informational\",\"message\":\"node=abc type=SOCKADDR msg=audit(1615478328.279:1722168): saddr=000000000000000000000000\\n\"}","@timestamp":"2021-03-11T15:50:46.242Z","host":"XX.XXX.XXX.XXX","@version":"1","port":00000}
And this is how it looks in the formatted view:
{ [-]
@timestamp: 2021-03-11T15:50:46.242Z
@version: 1
host: XX.XXX.XXX.XXX
message: <171>Mar 11 15:58:48 XX.XXX.XXX.XXX LOGSTASH[-]: {"@version":"1","facility_label":"user-level","program":"audispd","logtype":"syslog","priority":14,"tags":["_grokparsefailure"],"vmd_name":"abc","host":"XX.XXX.XXX.XXX","severity":6,"facility":1,"Hostname":"abc","beat":{"name":"abc"},"@timestamp":"2021-03-11T15:58:48.000Z","type":"abc","timestamp":"Mar 11 15:58:48","logsource":"abc","severity_label":"Informational","message":"node=abc type=SOCKADDR msg=audit(1615478328.279:1722168): saddr=000000000000000000000000\n"}
port: 0000
}
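Given that raw event, a combined props.conf sketch for the HF could look like the following (the SEDCMD names are arbitrary, and the suffix regex assumes the trailing wrapper always has exactly this shape, so treat it as a starting point rather than a tested config):

```
# props.conf on the HF
[aws:s3]
# strip the leading {"message":"<NNN> so the event starts at the timestamp
SEDCMD-strip_prefix = s/^{"message":"<\d+>//
# strip the trailing ","@timestamp":"...","host":"...","@version":"...","port":NNNNN}
SEDCMD-strip_suffix = s/","@timestamp":"[^"]*","host":"[^"]*","@version":"[^"]*","port":\d+}$//
# unescape the inner JSON quotes so the remaining payload is valid JSON again
SEDCMD-unescape_quotes = s/\\"/"/g
```

With the wrapper and escaping gone, the nested JSON after the syslog header becomes much easier to extract, whether at index time or with spath at search time.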