I'm trying to create a custom source type for a TSV log file whose third column is a JSON payload wrapped in quotes. I can't figure out how to get the source type to parse that third column as JSON in Splunk. Here's an example line:
6680 "2020-03-06 13:50:13.254" "{"date":"3/6/2020 1:50:13 PM","received":"from FooServer (Unknown [172.20.36.5]) by smtp-dev.foo.com with ESMTP ; Fri, 6 Mar 2020 13:50:13 -0500","message-id":"id@message.com","from":"foo@thisMachine.com","recipients":"John.Smith@example.com","cc":"","subject":"Test Email"}"
Any advice would be helpful, thank you.
Add the content below to transforms.conf. It extracts the key-value pairs and indexes them.
transforms.conf
[extract_json_fields]
REGEX = \"([\w-]+)\":\"([^"]*)\"
FORMAT = $1::$2
Then reference extract_json_fields in props.conf under the sourcetype stanza.
props.conf
[sourcetype_name]
TRANSFORMS-extract_fields = extract_json_fields
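To sanity-check the pattern before deploying it, here is a minimal Python sketch that applies the same regex to the sample line from the question. It only approximates what Splunk does (Python's re module stands in for Splunk's PCRE engine, and the names PAIR, SAMPLE_LINE and fields are illustrative, not Splunk settings). The tab separators are assumed from the "TSV" description.

```python
import re

# Same pattern as REGEX in the transforms.conf stanza above.
PAIR = re.compile(r'\"([\w-]+)\":\"([^"]*)\"')

# Sample event from the question; tabs between columns are an assumption.
SAMPLE_LINE = (
    '6680\t"2020-03-06 13:50:13.254"\t'
    '"{"date":"3/6/2020 1:50:13 PM",'
    '"received":"from FooServer (Unknown [172.20.36.5]) by smtp-dev.foo.com '
    'with ESMTP ; Fri, 6 Mar 2020 13:50:13 -0500",'
    '"message-id":"id@message.com","from":"foo@thisMachine.com",'
    '"recipients":"John.Smith@example.com","cc":"","subject":"Test Email"}"'
)

# Group 1 is the key and group 2 the value, mirroring FORMAT = $1::$2.
fields = dict(PAIR.findall(SAMPLE_LINE))

for key, value in fields.items():
    print(f"{key} = {value!r}")
```

On this flat payload the pattern picks up every key, including the empty cc value. It would not cope with values that contain a double quote.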
The documentation is EXTREMELY misleading in this regard. There is no such thing as creating a sourcetype. All possible sourcetype values "already exist" in any meaningful sense of the word exist. There is NOTHING that must be done to create one before using it. Just use it and poof, it now exists and works.
This is not the right way to do it. However, assuming there is no nesting and no arrays in your JSON, you can get away with doing this:
Settings -> Fields -> Field Transformations -> New Field Transformation, then:
Set Name to something like <YourSourcetype>_JSONpayload.
Set Type to regex-based.
Set Regular expression to (?<=[,{])"([^"]+)":"([^"]*)
Set Format to $1::$2
Set Create multivalued fields to checked (e.g. true / yes).
Click the green Save button.
Settings -> Fields -> Field Extractions -> New Field Extraction, then:
Set Name to <YourSourcetype>_JSONpayload.
Set Apply to to Sourcetype, named = <YourSourcetype>.
Set Type to Uses transform.
Set Extraction/Transform to <YourSourcetype>_JSONpayload.
Click the green Save button.
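The "no nesting and no arrays" caveat can be seen by running the same lookbehind regex outside Splunk. This is a rough Python sketch, not Splunk itself. The helper extract_pairs and both payloads are made up for illustration.

```python
import re

# Same regular expression as in the Field Transformation above.
# The lookbehind requires the key's opening quote to follow ',' or '{'.
PAIR = re.compile(r'(?<=[,{])"([^"]+)":"([^"]*)')

def extract_pairs(text):
    """Return (key, value) tuples in the order the regex finds them."""
    return PAIR.findall(text)

# Hypothetical payloads, not taken from the original log.
flat = '{"from":"foo@thisMachine.com","cc":"","subject":"Test Email"}'
nested = '{"a":"1","b":{"c":"2"},"d":["x","y"]}'

print(extract_pairs(flat))
print(extract_pairs(nested))
```

With the nested payload, only a and c come back: b and d are skipped because their values do not start with a quote, and c loses its parent. That is why this approach is only safe for flat payloads of string values.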
| makeresults
| eval _raw="6680 \"2020-03-06 13:50:13.254\" \"{\"date\":\"3/6/2020 1:50:13 PM\",\"received\":\"from FooServer (Unknown [172.20.36.5]) by smtp-dev.foo.com with ESMTP ; Fri, 6 Mar 2020 13:50:13 -0500\",\"message-id\":\" \",\"from\":\" foo@thisMachine.com\",\"recipients\":\" John.Smith@example.com\",\"cc\":\"\",\"subject\":\"Test Email\"}\""
| rex mode=sed "s/^.*(?={)(.*)\"/\1/g"
| spath
This works.
In props.conf, or under Create source types >> Advanced:
[your sourcetype]
TIME_FORMAT = %F %T.%3Q
SEDCMD-trim_json = s/^.*(?={)(.*)\"/\1/g
KV_MODE = json
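To see what the sed expression leaves behind, here is a small Python sketch of the same trim followed by a JSON parse. It approximates SEDCMD plus KV_MODE = json with re.sub and json.loads. The function name payload_from_line is illustrative, and the tab separators in the sample are an assumption.

```python
import json
import re

def payload_from_line(line):
    """Drop everything before the '{' and the final closing quote,
    then parse what remains as JSON."""
    # Same expression as the SEDCMD: s/^.*(?={)(.*)\"/\1/g
    trimmed = re.sub(r'^.*(?={)(.*)"', r'\1', line)
    return json.loads(trimmed)

# Sample event from the question; tabs between columns are an assumption.
sample = (
    '6680\t"2020-03-06 13:50:13.254"\t'
    '"{"date":"3/6/2020 1:50:13 PM",'
    '"received":"from FooServer (Unknown [172.20.36.5]) by smtp-dev.foo.com '
    'with ESMTP ; Fri, 6 Mar 2020 13:50:13 -0500",'
    '"message-id":"id@message.com","from":"foo@thisMachine.com",'
    '"recipients":"John.Smith@example.com","cc":"","subject":"Test Email"}"'
)

payload = payload_from_line(sample)
print(payload["subject"])
```

One thing to watch: the leading `^.*` is greedy, so it stops at the last `{` in the event. That is fine for a flat payload like this one, but a nested object would be cut down to its innermost `{`.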
How would I go about this if I am using the UI to create the Source Type?
Add a new setting named EXTRACT-fields and set it to the regex below. Save, then search the sourcetype.
"date":"(?<date>[^\"]*)","received":"(?<received>[^\"]*)","message-id":"(?<messageid>[^\"]*)","from":"(?<from>[^\"]*)","recipients":"(?<recipients>[^\"]*)","cc":"(?<cc>[^\"]*)","subject":"(?<subject>[^\"]*)"