Getting Data In

Index-time extraction for JSON logs

spl_unker
Explorer

Hi Splunkers,

I'm collecting JSON-format logs from S3 through a heavy forwarder. After indexing, I see the logs in the format below. I want the fields inside the message field to be extracted into individual fields.

{ [-]
@timestamp: 2021-03-08T12:55:42.959Z
@version: 1
host: XX.XXX.XXX.XXX
message: <171>Mar 08 13:09:22 LOGSTASH[-]: {"@version":"1","facility_label":"zyx","program":"CRON","logtype":"syslog-prod","priority":86,"tags":["_grokparsefailure"],"pid":"1234","vmd_name":"abc","host":"XX.XXX.XXX.XXX","severity":6,"facility":10,"beat":{"name":"zxz"},"@timestamp":"","type":"xyz","timestamp":"Mar 8 13:09:22","logsource":"abc","severity_label":"Informational","message":"abc: session closed for user root\n"}
port: 1234
}

 

I have tried the following transforms configuration on the HF, and it didn't work:

 

props.conf

[aws:s3]
TRANSFORMS-xyz= s3-trans

transforms.conf

[s3-trans]
REGEX = [\"|\@](\w+)\":[\s]*([^\,\}]+)
FORMAT = $1::$2


scelikok
Champion

Hi @spl_unker,

Can you please try adding WRITE_META = true?

props.conf

[aws:s3]
TRANSFORMS-xyz= s3-trans

transforms.conf

[s3-trans]
REGEX = [\"|\@](\w+)\":[\s]*([^\,\}]+)
FORMAT = $1::$2
WRITE_META = true

 

If this reply helps you, an upvote is appreciated.

spl_unker
Explorer

Hi @scelikok, it didn't work. The fields are not getting extracted.


spl_unker
Explorer

 

 

Hi @scelikok @soutamo, as an alternative, I want to remove the leading characters {"message":"<100> from the log sample below.

 

This prefix appears at the start of every log, and I want to remove it before indexing. Could you please help with the props and transforms configuration?

 

Raw Event :

{"message":"<100>Mar 11 15:58:48 XX.XXX.XXX.XXX LOGSTASH[-]: {\"@version\":\"1\",\"facility_label\":\"user-level\",\"program\":\"audispd\",\"logtype\":\"syslog\",\"priority\":14,\"tags\":[\"_grokparsefailure\"],\"vmd_name\":\"abc\",\"host\":\"XX.XXX.XXX.XXX\",\"severity\":6,\"facility\":1,\"Hostname\":\"abc\",\"beat\":{\"name\":\"abc\"},\"@timestamp\":\"2021-03-11T15:58:48.000Z\",\"type\":\"abc\",\"timestamp\":\"Mar 11 15:58:48\",\"logsource\":\"abc\",\"severity_label\":\"Informational\",\"message\":\"node=abc type=SOCKADDR msg=audit(1615478328.279:1722168): saddr=000000000000000000000000\\n\"}","@timestamp":"2021-03-11T15:50:46.242Z","host":"XX.XXX.XXX.XXX","@version":"1","port":00000}

 


Vardhan
Contributor

Hi @spl_unker ,

To remove the leading part of the string, use SEDCMD in props.conf, under the stanza for your sourcetype:

props.conf

[aws:s3]
SEDCMD-removingmessagestring = s/{"message":"\<\d+\>//g

I am not sure what your data looks like, so if the one above doesn't work, use the one below in the same stanza:

SEDCMD-string2 = s/message:\s+\<\d+\>//g
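As a quick sanity check outside Splunk, the two search patterns can be tried against the raw event posted in this thread. The sketch below uses Python's re module as a stand-in for the regex engine Splunk applies to SEDCMD, which is an assumption that holds only for simple patterns like these. The event is shortened, and the variable names are illustrative.

```python
import re

# Shortened copy of the raw event from this thread.
# In _raw, the quotes of the inner JSON are backslash-escaped.
raw_event = r'{"message":"<100>Mar 11 15:58:48 XX.XXX.XXX.XXX LOGSTASH[-]: {\"@version\":\"1\",\"program\":\"audispd\"}","@timestamp":"2021-03-11T15:50:46.242Z","host":"XX.XXX.XXX.XXX","@version":"1","port":00000}'

# Search patterns copied from the two SEDCMD suggestions.
first_pattern = r'{"message":"\<\d+\>'
second_pattern = r'message:\s+\<\d+\>'

# Emulate s/<pattern>//g by replacing every match with an empty string.
after_first = re.sub(first_pattern, "", raw_event)
after_second = re.sub(second_pattern, "", raw_event)

print(after_first[:40])           # beginning of the event after the first substitution
print(after_second == raw_event)  # True means the second pattern matched nothing
```

Note that even when the prefix is stripped, the rest of the event still contains the escaped quotes and the trailing outer fields, so the result is not plain JSON on its own.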


spl_unker
Explorer

@Vardhan Your first regex looks good, as it captures the intended characters. But after adding it to props.conf on the HF, I'm still seeing the characters; they are not removed.


Vardhan
Contributor

@spl_unker What about the second regex?


spl_unker
Explorer

I didn't try the second regex, as it does not capture the intended characters. If you look at the raw log, I just want to remove the {"message":"<100> from the raw log so that the event starts with the timestamp instead.


Vardhan
Contributor

@spl_unker Can you share a screenshot of the logs in the index and a sample log in text format?

Also, after placing the first regex, did you restart the HF? And did you wait for new data to come into the index? The props.conf settings apply only to new data, not to data already in the index.


spl_unker
Explorer

@Vardhan There are two things I'm looking at.

1. Automatic extraction of fields at index time. See the formatted log snapshot. It contains nested JSON, and all the fields inside the message field need to be extracted.

If option 1 is not possible, I need help with option 2.

2. Parse the incoming raw logs and remove the {"message":"<100> at the beginning of each event. See the raw log snapshot for reference.

 

A sample log can be seen in my other comments.

 

Thanks in Advance
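To make the nested structure behind option 1 concrete, here is a minimal sketch, outside Splunk, of how the inner fields relate to the outer event: the outer JSON is decoded first, and the inner JSON is whatever follows the syslog header inside message. This is an illustration in Python, not Splunk configuration. The function name extract_inner_fields is hypothetical, the event is shortened, and the port value 1234 is an assumption, because the masked 00000 in the posted sample is not valid JSON.

```python
import json

def extract_inner_fields(raw_event: str) -> dict:
    """Decode the outer JSON, then decode the JSON embedded in its message field."""
    outer = json.loads(raw_event)
    message = outer["message"]
    # The syslog header ("<171>Mar 11 ... LOGSTASH[-]: ") contains no "{",
    # so the first "{" marks the start of the inner JSON object.
    start = message.index("{")
    return json.loads(message[start:])

# Shortened event modelled on the raw sample in this thread (port value assumed).
raw_event = r'{"message":"<171>Mar 11 15:58:48 XX.XXX.XXX.XXX LOGSTASH[-]: {\"@version\":\"1\",\"program\":\"audispd\",\"severity\":6,\"beat\":{\"name\":\"abc\"},\"message\":\"node=abc type=SOCKADDR\\n\"}","@timestamp":"2021-03-11T15:50:46.242Z","host":"XX.XXX.XXX.XXX","@version":"1","port":1234}'

inner = extract_inner_fields(raw_event)
for key, value in inner.items():
    print(key, "=", repr(value))
```

The sketch relies on the header never containing a "{" character, which is true for the samples posted here but may not hold for other sources.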

 


Vardhan
Contributor

@spl_unker Can you try the second regex I gave, restart the HF, and see how it works for new data?


spl_unker
Explorer

Raw log format attached


spl_unker
Explorer

@scelikok, @koshyk Any help on this topic, please?


soutamo
SplunkTrust

Can you paste the _raw event data here? Just open the event -> Event actions -> Show Source.


spl_unker
Explorer

Raw Event :

{"message":"<171>Mar 11 15:58:48 XX.XXX.XXX.XXX LOGSTASH[-]: {\"@version\":\"1\",\"facility_label\":\"user-level\",\"program\":\"audispd\",\"logtype\":\"syslog\",\"priority\":14,\"tags\":[\"_grokparsefailure\"],\"vmd_name\":\"abc\",\"host\":\"XX.XXX.XXX.XXX\",\"severity\":6,\"facility\":1,\"Hostname\":\"abc\",\"beat\":{\"name\":\"abc\"},\"@timestamp\":\"2021-03-11T15:58:48.000Z\",\"type\":\"abc\",\"timestamp\":\"Mar 11 15:58:48\",\"logsource\":\"abc\",\"severity_label\":\"Informational\",\"message\":\"node=abc type=SOCKADDR msg=audit(1615478328.279:1722168): saddr=000000000000000000000000\\n\"}","@timestamp":"2021-03-11T15:50:46.242Z","host":"XX.XXX.XXX.XXX","@version":"1","port":00000}

 

 

And this is how it looks in the syntax-highlighted format:

 

{ [-]
@timestamp: 2021-03-11T15:50:46.242Z
@version: 1
host: XX.XXX.XXX.XXX
message: <171>Mar 11 15:58:48 XX.XXX.XXX.XXX LOGSTASH[-]: {"@version":"1","facility_label":"user-level","program":"audispd","logtype":"syslog","priority":14,"tags":["_grokparsefailure"],"vmd_name":"abc","host":"XX.XXX.XXX.XXX","severity":6,"facility":1,"Hostname":"abc","beat":{"name":"abc"},"@timestamp":"2021-03-11T15:58:48.000Z","type":"abc","timestamp":"Mar 11 15:58:48","logsource":"abc","severity_label":"Informational","message":"node=abc type=SOCKADDR msg=audit(1615478328.279:1722168): saddr=000000000000000000000000\n"}
port: 0000
}
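Comparing the raw and formatted views suggests one reason the REGEX from the first post does not reach the inner fields: in _raw, the inner quotes are backslash-escaped, so a key name is followed by \" rather than ". The sketch below runs that REGEX over a shortened copy of the raw event with Python's re module, used here as an assumed stand-in for Splunk's regex engine, and lists the key names it captures.

```python
import re

# REGEX copied from the [s3-trans] stanza in the first post.
transform_regex = re.compile(r'[\"|\@](\w+)\":[\s]*([^\,\}]+)')

# Shortened copy of the raw event from this thread.
raw_event = r'{"message":"<100>Mar 11 15:58:48 XX.XXX.XXX.XXX LOGSTASH[-]: {\"@version\":\"1\",\"program\":\"audispd\"}","@timestamp":"2021-03-11T15:50:46.242Z","host":"XX.XXX.XXX.XXX","@version":"1","port":00000}'

# Each match is a (key, value) pair, mirroring FORMAT = $1::$2.
pairs = transform_regex.findall(raw_event)
names = [key for key, _ in pairs]

print(names)
print("program" in names)  # False means the inner key was never captured
```

In this shortened sample only keys of the outer object are captured, and the captured value of message is cut off at the first comma.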
