Getting Data In

JSON data selection and meta removal

pck_npluyaud
Engager

Hi.

The situation: a standalone Splunk instance (test platform) and a UF.
The data: a multi-level JSON stream.
The problem: because the data volume is large, we would like to reduce _raw to a single field. But all the JSON fields are saved as _meta.

We have succeeded in updating source, sourcetype and host from the JSON data.

But it is impossible to omit the _meta fields (they always appear in the Search Head).

IN : 

{
"input":{
     "type":"log"},
"log":{
     "file":"c:\log.josn"},
"@metadata":{
     "beat":"filebeat",
     "version":"7.10.2"},
"message":"bla bla bla",
"fields":{
     "type":"bdc",
     "host":"VLCR03",
     "type2":"back"}
}

OUT : 

_raw  : "bla bla bla" <= OK
meta "input.***" <= to suppress
meta "log.***" <= to suppress
meta "@metadata.beat" <= to keep
meta "@metadata.version"<= to suppress
meta "message"<= to suppress
meta "fields.***" <= to suppress

props.conf on the UF

SHOULD_LINEMERGE = false
NO_BINARY_CHECK = true
CHARSET = AUTO
KV_MODE = none
AUTO_KV_JSON = false
INDEXED_EXTRACTIONS = JSON
TRANSFORMS-x = set_host set_source set_sourcetype
TRANSFORMS-y = extract_message
TRANSFORMS-z = remove_metadata

transforms.conf on the UF

[extract_message]
SOURCE_KEY = field:message
REGEX = (.*)
FORMAT = $1
DEST_KEY = _raw

[set_host]
SOURCE_KEY = field:fields.host
REGEX = (.*)
FORMAT = host::$1
DEST_KEY = MetaData:Host

[set_source]
SOURCE_KEY = field:log.file
REGEX = (.*)
FORMAT = source::$1
DEST_KEY = MetaData:Source

[set_sourcetype]
SOURCE_KEY = fields:fields.type,fields.type2
REGEX = (.*)\s(.*)
FORMAT = sourcetype::$1:$2
DEST_KEY = MetaData:Sourcetype

[remove_message]
SOURCE_KEY = _meta:message
REGEX = (.*)
DEST_KEY = queue
FORMAT = nullQueue


richgalloway
SplunkTrust

These props.conf settings MUST go on the first full Splunk Enterprise instance (HF or indexer) that sees the data.  The UF will ignore all of them.  Perhaps this is why the original props.conf settings didn't work.

To make the comma optional, insert a ? after the comma in the SEDCMD.
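
As a rough illustration of the optional comma, the sketch below uses Python's `re` module as a stand-in for the SEDCMD regex engine. That is an assumption: SEDCMD uses PCRE, which should treat this simple pattern the same way. The two one-line events and the `sed` helper are hypothetical and exist only for this example.

```python
import re

# Hypothetical one-line events; only the position of "input" differs.
input_first = '{"input":{"type":"log"},"message":"bla bla bla"}'
input_last = '{"message":"bla bla bla","input":{"type":"log"}}'

# Pattern from the SEDCMD, without and with "?" after the comma.
comma_required = r'"input":\{[\s\S]+?},'
comma_optional = r'"input":\{[\s\S]+?},?'

def sed(pattern, event):
    # s/.../.../ without the g flag replaces only the first match.
    return re.sub(pattern, "", event, count=1)

print(sed(comma_required, input_first))  # {"message":"bla bla bla"}
print(sed(comma_required, input_last))   # unchanged: no "}," follows "input"
print(sed(comma_optional, input_last))   # {"message":"bla bla bla",}
```

Note that when the "input" object comes last, the comma that preceded it is left behind, so a fully general rule may also need to handle a leading comma.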

---
If this reply helps you, Karma would be appreciated.

pck_npluyaud
Engager

I put the rule in the props.conf on the UF, but got the same result: the _meta "input.type" is not removed.

Another point: the SEDCMD should be general, because the JSON schema is not always the same (the comma is not always in the same place).

SHOULD_LINEMERGE = false
NO_BINARY_CHECK = true
CHARSET = AUTO
KV_MODE = none
AUTO_KV_JSON = false
INDEXED_EXTRACTIONS = JSON
TRANSFORMS-x = remove_events set_host set_source set_sourcetype
TRANSFORMS-y = extract_message
SEDCMD-noinput = s/"input":\{[\s\S]+?},//

richgalloway
SplunkTrust

The FORMAT = nullQueue setting is for entire events, not individual pieces of text.  To remove text, I recommend using SEDCMD.  I also recommend retaining keywords to make parsing easier at search time.

Try these settings in props.conf:

SEDCMD-noinput = s/"input":\{[\s\S]+?},//
SEDCMD-nolog = s/"log":\{[\s\S]+?},//
SEDCMD-nofields = s/"fields":\{[\s\S]+?},?//
SEDCMD-noversion = s/,\s+"version":"[\s\S]+?"//
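
To see what these four substitutions would do to the sample event from the question, here is a small sketch that replays them with Python's `re` module. It only approximates Splunk's behaviour: it assumes each SEDCMD acts as an ordinary PCRE-style substitution on _raw, replaces the first match only, and runs in the order listed above. The variable names are illustrative.

```python
import re

# Sample event from the question, kept verbatim in a raw string.
event = r'''{
"input":{
     "type":"log"},
"log":{
     "file":"c:\log.josn"},
"@metadata":{
     "beat":"filebeat",
     "version":"7.10.2"},
"message":"bla bla bla",
"fields":{
     "type":"bdc",
     "host":"VLCR03",
     "type2":"back"}
}'''

# The search halves of the four SEDCMD expressions above.
sedcmds = [
    r'"input":\{[\s\S]+?},',
    r'"log":\{[\s\S]+?},',
    r'"fields":\{[\s\S]+?},?',
    r',\s+"version":"[\s\S]+?"',
]

cleaned = event
for pattern in sedcmds:
    # No g flag in the SEDCMD, so replace the first match only.
    cleaned = re.sub(pattern, "", cleaned, count=1)

print(cleaned)
```

In this replay only "@metadata.beat" and "message" survive. A trailing comma is left after the "message" entry because "fields" was the last object, so the result may need one more substitution if it has to stay parseable as JSON.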