Getting Data In

Index-time extraction for JSON logs

spl_unker
Explorer

Hi Splunkers,

I'm collecting logs from S3 through a heavy forwarder; the logs are in JSON format. After indexing, I see the logs in the format below. I want the fields inside the message field to be extracted into individual fields.

{ [-]
@timestamp: 2021-03-08T12:55:42.959Z
@version: 1
host: XX.XXX.XXX.XXX
message: <171>Mar 08 13:09:22 LOGSTASH[-]: {"@version":"1","facility_label":"zyx","program":"CRON","logtype":"syslog-prod","priority":86,"tags":["_grokparsefailure"],"pid":"1234","vmd_name":"abc","host":"XX.XXX.XXX.XXX","severity":6,"facility":10,"beat":{"name":"zxz"},"@timestamp":"","type":"xyz","timestamp":"Mar 8 13:09:22","logsource":"abc","severity_label":"Informational","message":"abc: session closed for user root\n"}
port: 1234
}
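For anyone trying to reproduce this outside Splunk, here is a minimal Python sketch (values abridged from the sample above) of why automatic JSON extraction stops at the outer keys: the outer event is JSON, but the message field carries a syslog header followed by a second JSON object.

```python
import json

# Simplified reconstruction of the indexed event (values abridged from the post).
inner = {"@version": "1", "program": "CRON", "severity": 6,
         "message": "abc: session closed for user root\n"}
syslog_line = "<171>Mar 08 13:09:22 LOGSTASH[-]: " + json.dumps(inner)
raw = json.dumps({"message": syslog_line,
                  "@timestamp": "2021-03-08T12:55:42.959Z",
                  "@version": "1",
                  "host": "XX.XXX.XXX.XXX",
                  "port": 1234})

# Automatic JSON extraction sees only the outer keys; the nested fields
# have to be dug out of the message value past the syslog header.
outer = json.loads(raw)
embedded = outer["message"]
parsed = json.loads(embedded[embedded.index("{"):])
print(parsed["program"], parsed["severity"])  # CRON 6
```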

 

I have tried the following props and transforms config on the heavy forwarder (HF), and it didn't work:

 

props.conf

[aws:s3]
TRANSFORMS-xyz= s3-trans

transforms.conf

[s3-trans]
REGEX = [\"|\@](\w+)\":[\s]*([^\,\}]+)
FORMAT = $1::$2
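As a side note, this REGEX can be sanity-checked outside Splunk. Below is a Python re-creation of the same pattern run against an abridged copy of the raw event. In _raw, the inner JSON's quotes are escaped as \", so the pattern, which expects a bare quote right after the key name, can only ever match the outer keys; that is a likely reason the nested fields never get extracted.

```python
import re

# Abridged copy of the raw event; note the inner JSON quotes are escaped (\").
raw = (r'{"message":"<171>Mar 11 15:58:48 XX.XXX.XXX.XXX LOGSTASH[-]: '
       r'{\"@version\":\"1\",\"program\":\"audispd\",\"severity\":6,'
       r'\"message\":\"node=abc type=SOCKADDR\\n\"}",'
       r'"@timestamp":"2021-03-11T15:50:46.242Z","host":"XX.XXX.XXX.XXX",'
       r'"@version":"1","port":1234}')

# Same pattern as the transforms.conf REGEX above.
pairs = re.findall(r'["|@](\w+)":\s*([^,}]+)', raw)
print([name for name, _ in pairs])
# ['message', 'timestamp', 'host', 'version', 'port'] -- outer keys only;
# 'program' and 'severity' inside the escaped inner JSON never match.
```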


scelikok
SplunkTrust

Hi @spl_unker,

Can you please try adding WRITE_META = true?

props.conf

[aws:s3]
TRANSFORMS-xyz= s3-trans

transforms.conf

[s3-trans]
REGEX = [\"|\@](\w+)\":[\s]*([^\,\}]+)
FORMAT = $1::$2
WRITE_META = true

 

If this reply helps you, an upvote and "Accept as Solution" is appreciated.

spl_unker
Explorer

Hi @scelikok, it didn't work. Fields are not getting extracted.

spl_unker
Explorer

 

 

Hi @scelikok, @isoutamo. As an alternative option, I want to remove the starting characters (marked in red) from the log sample below.

 

This prefix gets appended to all the logs, and I want to remove it before indexing. Could you please help with the props and transforms config?

 

Raw Event :

{"message":"<100>Mar 11 15:58:48 XX.XXX.XXX.XXX LOGSTASH[-]: {\"@version\":\"1\",\"facility_label\":\"user-level\",\"program\":\"audispd\",\"logtype\":\"syslog\",\"priority\":14,\"tags\":[\"_grokparsefailure\"],\"vmd_name\":\"abc\",\"host\":\"XX.XXX.XXX.XXX\",\"severity\":6,\"facility\":1,\"Hostname\":\"abc\",\"beat\":{\"name\":\"abc\"},\"@timestamp\":\"2021-03-11T15:58:48.000Z\",\"type\":\"abc\",\"timestamp\":\"Mar 11 15:58:48\",\"logsource\":\"abc\",\"severity_label\":\"Informational\",\"message\":\"node=abc type=SOCKADDR msg=audit(1615478328.279:1722168): saddr=000000000000000000000000\\n\"}","@timestamp":"2021-03-11T15:50:46.242Z","host":"XX.XXX.XXX.XXX","@version":"1","port":00000}

 


Vardhan
Contributor

Hi @spl_unker ,

To remove the starting part of the string, use SEDCMD in props.conf:

props.conf

SEDCMD-removingmessagestring = s/{"message":"\<\d+\>//g

I am not sure exactly what your data looks like, so if the one above doesn't work, use the one below:

SEDCMD-string2=s/message:\s+\<\d+\>//g
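To sanity-check what the first substitution would do before restarting the forwarder, here it is re-created in Python against an abridged raw event (the \< and \> in the conf are just escaped literal angle brackets):

```python
import re

# Abridged raw event from the thread.
raw = (r'{"message":"<100>Mar 11 15:58:48 XX.XXX.XXX.XXX LOGSTASH[-]: '
       r'{\"@version\":\"1\",\"program\":\"audispd\"}",'
       r'"@timestamp":"2021-03-11T15:50:46.242Z","port":1234}')

# Python equivalent of: SEDCMD-removingmessagestring = s/{"message":"\<\d+\>//g
stripped = re.sub(r'\{"message":"<\d+>', '', raw)
print(stripped)  # event now starts at the timestamp "Mar 11 15:58:48 ..."
```

Note that this removes only the leading wrapper; the trailing part of the outer JSON (the closing quote, @timestamp, host, and port keys) remains, so a second SEDCMD may be needed if a clean syslog payload is the goal.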


spl_unker
Explorer

@Vardhan Your first regex looks good, as it captures the intended characters. But after adding it to props.conf on the HF, I'm still seeing the characters; they are not removed.


Vardhan
Contributor

@spl_unker what about the second regex?


spl_unker
Explorer

I didn't try the second regex, as it does not capture the intended characters. If you look at the raw log, I just want to parse out and remove the {"message":"<100> at the beginning of the raw log and have the event start with the timestamp instead.


Vardhan
Contributor

@spl_unker Can you share a screenshot of the logs in the index and a sample log in text format?

Also, after placing the first regex, did you restart the HF? And did you wait for new data to come into the index? The props.conf settings apply only to new data, not to data already in the index.


spl_unker
Explorer

@Vardhan There are two things I'm looking at:

1. Automatic extraction of fields at index time. Check the formatted log snapshot: it contains nested JSON, and all the fields inside the message field need to be extracted.

2. If option 1 is not possible, parse the incoming raw logs and remove the {"message":"<100> at the beginning of each event. Check the raw log snapshot for reference.

Sample logs can be seen in the other comments.

 

Thanks in advance.

 


Vardhan
Contributor

@spl_unker Can you try the second regex I gave, restart the HF, and see how it works for new data?


spl_unker
Explorer

Raw log format attached


spl_unker
Explorer

@scelikok, @koshyk, any help on this topic, please?


isoutamo
SplunkTrust

Can you paste _raw event data here? Just open event -> Event actions -> Show Source

[screenshot attachment: soutamo_0-1615475644533.png]

 


spl_unker
Explorer

Raw Event :

{"message":"<171>Mar 11 15:58:48 XX.XXX.XXX.XXX LOGSTASH[-]: {\"@version\":\"1\",\"facility_label\":\"user-level\",\"program\":\"audispd\",\"logtype\":\"syslog\",\"priority\":14,\"tags\":[\"_grokparsefailure\"],\"vmd_name\":\"abc\",\"host\":\"XX.XXX.XXX.XXX\",\"severity\":6,\"facility\":1,\"Hostname\":\"abc\",\"beat\":{\"name\":\"abc\"},\"@timestamp\":\"2021-03-11T15:58:48.000Z\",\"type\":\"abc\",\"timestamp\":\"Mar 11 15:58:48\",\"logsource\":\"abc\",\"severity_label\":\"Informational\",\"message\":\"node=abc type=SOCKADDR msg=audit(1615478328.279:1722168): saddr=000000000000000000000000\\n\"}","@timestamp":"2021-03-11T15:50:46.242Z","host":"XX.XXX.XXX.XXX","@version":"1","port":00000}

 

 

and this is how it looks in the formatted view:

 

{ [-]
@timestamp: 2021-03-11T15:50:46.242Z
@version: 1
host: XX.XXX.XXX.XXX
message: <171>Mar 11 15:58:48 XX.XXX.XXX.XXX LOGSTASH[-]: {"@version":"1","facility_label":"user-level","program":"audispd","logtype":"syslog","priority":14,"tags":["_grokparsefailure"],"vmd_name":"abc","host":"XX.XXX.XXX.XXX","severity":6,"facility":1,"Hostname":"abc","beat":{"name":"abc"},"@timestamp":"2021-03-11T15:58:48.000Z","type":"abc","timestamp":"Mar 11 15:58:48","logsource":"abc","severity_label":"Informational","message":"node=abc type=SOCKADDR msg=audit(1615478328.279:1722168): saddr=000000000000000000000000\n"}
port: 0000
}
