Getting Data In

Trying to extract fields from a hybrid log Syslog/JSON nested logs

atme
Loves-to-Learn Lots

Trying to extract some data from a hybrid log where the log format is <Syslog header> <JSON Data>.

Have had success with extracting via spath and regex in search but want to do this before ingestions, so trying to complete this on a heavy forwarder by using  props.conf and transforms.conf to complete the field extractions. Got this working to a degree but it only functions partly fuctions with some logs the the nested logs in msg are not full extracted and some logs don't extract anything for JSON.

An example of one of many log types but all in this format <Syslog header> <JSON Data>

Aug 3 04:45:01 server.name.local program {"_program":{"uid":"0","type":"newData","subj":"unconfined","pid":"4864","msg":"ab=new:session_create creator=sam,sam,echo,ba_permit,ba_umask,ba_limits acct=\"su\" exe=\"/usr/sbin/vi\" hostname=? addr=? terminal=vi res=success","auid":"0","UID":"user1","AUID":"user1"}}

creator=sam
stopping at first comma

acct=\
exe=\
Doesn't collect the data after \

And the following 2 logs had no field extractions from the json

Aug 3 04:31:01 server.name.local program  {"_program":{"uid":"0","type":"SYSCALL","tty":"pts1","syscall":"725","su":"0","passedsuccess":"yes","pass":"unconfined","id":"0","sess":"3417","pid":"4568732","msg":"utime(1754195461.112:457):","items":"2","gid":"0","fsuid":"0","fsgid":"0","exit":"3","exe":"/usr/bin/vi","euid":"0","egid":"0","comm":"vi","auid":"345742342","arch":"c000003e","a3":"1b6","a2":"241","a1":"615295291b60","a0":"ffffff9c","UID":"user1","SYSCALL":"openmat","SUID":"user1","SGID":"user1","GID":"user1","FSUID":"user1","FSGID":"user1","EUID":"user1","EGID":"user1","AUID":"user1","ARCH":"x86_64"}}


Aug 3 04:10:01 server.name.local program  {"_program":{"type":"data","data":"/usr/bin/vi","msg":"utime(1754194201.112:457):"}}

 

Thanks in advance for any help

0 Karma

tej57
Builder

Hey @atme,

It would be complex if you try to extract all of these fields at index time. The computational load would also increase. I would prefer going for search time extractions. However, if you still wish to extract fields at index time,  it would be great if you can share what you've configured till now in props and transforms. Since the _raw event varies in number of fields also,  we need to define a regex based pattern or key-value pair to extract the fields.

Thanks,
Tejas.

0 Karma
Got questions? Get answers!

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Meet up IRL or virtually!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Get Updates on the Splunk Community!

May 2026 Splunk Expert Sessions: Security & Observability

Level Up Your Operations: May 2026 Splunk Expert Sessions Whether you are refining your security posture or ...

Network to App: Observability Unlocked [May & June Series]

In today’s digital landscape, your environment is no longer confined to the data center. It spans complex ...

SPL2 Deep Dives, AppDynamics Integrations, SAML Made Simple and Much More on Splunk ...

Splunk Lantern is Splunk’s customer success center that provides practical guidance from Splunk experts on key ...