Getting Data In

Remove first part of string before creating a JSON source type

Motivator

Hi,

I used the answer below to get 95% of the way to a full solution, but I can't get the last bit working.
https://answers.splunk.com/answers/567087/how-to-split-data-into-separate-sourcetypes-with-t.html

I take in one file containing multiple JSON formats and split it into multiple source types.
However, I have a sub-issue: one of the source types is text followed by JSON, as in the trace below.

2018-01-10 15:52:03 [metrics-application-1-thread-1] INFO  METRIC:41 - {"v":"1.0","t":"MTR","ts":"2018-01-10T15:52:03.700Z","h":"mx7654vm","pid"

I want to keep only the JSON and remove the other data at the start of the line.

So I think I need a SEDCMD in props.conf, but I'm not sure. I am trying to avoid using a heavy forwarder.

props.conf
[AMBERRAW]
SHOULD_LINEMERGE=false
NO_BINARY_CHECK=true
TRANSFORMS-sourcetyerouting = AMBERRAWjsonMETRIC

[AMBERRAW:METRIC]
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%3N
TIME_PREFIX = \"ts\":\"
INDEXED_EXTRACTIONS = JSON
SEDCMD-REGEX_ONLY = s/^.({"v".).*$/\1/

transforms.conf
[AMBERRAWjsonMETRIC]
DEST_KEY = MetaData:Sourcetype
REGEX = {"v":"1.0\"
FORMAT = sourcetype::AMBER_RAW:METRIC

Thanks in advance.

Re: Remove first part of string before creating a JSON source type

Splunk Employee

Something like this should work in props.conf to remove the header text:

SEDCMD-remove_header = s/.*?\{/{/g

This matches everything up to (and including) the first {. Then, it replaces it all with just a {.

Note: SEDCMD is applied at index time, so it changes the raw event before it is written to the index.
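To see what that pattern actually captures, here is a small Python sketch. It is only an approximation: Python's re module stands in for Splunk's regex engine (an assumption that should hold for a pattern this simple), and the sample line is a shortened version of the event from the question.

```python
import re

# Shortened version of the sample event from the question (JSON body trimmed).
line = (
    '2018-01-10 15:52:03 [metrics-application-1-thread-1] INFO  METRIC:41 - '
    '{"v":"1.0","t":"MTR","pid":12483}'
)

# Same pattern as in the SEDCMD: lazily consume characters up to and
# including the first "{".
match = re.match(r'.*?\{', line)
header = match.group(0)

# Replacing the matched header with a single "{" leaves only the JSON.
json_only = '{' + line[match.end():]

print(repr(header))
print(json_only)
```

The lazy quantifier is what stops the match at the first brace; a greedy .* would run on to the last { in the line.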


Re: Remove first part of string before creating a JSON source type

Motivator

Hi,

Thanks for your help here; however, I can't get this to work.
I tried the following instead, since the JSON is more complex:

[AMBERRAW:METRIC]
SEDCMD-remove_header = s/.*?{"v/{"v/g
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%3N
TIME_PREFIX = \"ts\":\"
INDEXED_EXTRACTIONS = JSON

2018-01-10 15:52:03 [metrics-application-1-thread-1] INFO  METRIC:41 - {"v":"1.0","t":"MTR","ts":"2018-01-10T15:52:03.700Z","h":"mx7654vm","pid":12483,"src":{"c":"authn-app","d":"auth"},"mtr":{"counters":{"process":{"cpu":{"time_cumulated_s":36},"memory":{"gc":{"ps_marksweep":{"total_duration_ms":814},"ps_scavenge":{"total_duration_ms":539}}}}},"gauges":{"com.murex.serviceframework.rest.datalayer.DataSourceMetrics.datasources.authn-authn-app-1":{"availableConnectionCount":1,"borrowedConnectionCount":0,"currPoolSize":1,"maxPoolSize":50,"poolName":"authn-authn-app-1"},"process":{"cpu":{"percentage":0.04184450581638631},"files":{"open_files":37},"memory":{"jvm":{"heap":{"committed_kb":195072,"used_kb":115080},"nonheap":{"committed_kb":91456,"used_kb":89860}},"rss_kb":32880864,"vsz_kb":2301276}}},"histograms":{},"meters":{},"timers":{"process":{"memory":{"gc":{"ps_marksweep":{"events":{"count":1,"rate_1m":0.0014267037570722622,"rate_5m":0.002038710931305469,"rate_15m":9.431526926661993E-4,"rate_mean":0.006158257067925256},"duration_ms":{"max":620.0,"mean":620.0,"median":620.0,"min":620.0,"percentile_75":620.0,"percentile_95":620.0,"percentile_98":620.0,"percentile_99":620.0,"percentile_999":620.0,"standard_deviation":0.0}},"ps_scavenge":{"events":{"count":32,"rate_1m":0.18094822353052323,"rate_5m":1.237424759817615,"rate_15m":1.7042656064654065,"rate_mean":0.19706273351906517},"duration_ms":{"max":18.0,"mean":9.125,"median":6.5,"min":3.0,"percentile_75":13.0,"percentile_95":18.0,"percentile_98":18.0,"percentile_99":18.0,"percentile_999":18.0,"standard_deviation":5.014495118187132}}}}}}}}

Re: Remove first part of string before creating a JSON source type

Splunk Employee

Try this for your SEDCMD. It anchors the regex to the beginning of the line and uses the flag 1 so that only the first match is replaced, rather than every match as with g:

SEDCMD-remove_header = s/^.*?\{/{/1
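The difference between the two flags matters for nested JSON. The Python sketch below compares them, with the caveat that Python's re module is only a stand-in for Splunk's regex engine (an assumption), that strip_all and strip_first are illustrative helper names, and that the sample line is a shortened version of the event from the question.

```python
import json
import re

# Shortened version of the sample event, keeping one nested object.
line = (
    '2018-01-10 15:52:03 [metrics-application-1-thread-1] INFO  METRIC:41 - '
    '{"v":"1.0","t":"MTR","src":{"c":"authn-app","d":"auth"}}'
)

def strip_all(text):
    """Emulates s/.*?\\{/{/g: every 'anything up to {' run is replaced."""
    return re.sub(r'.*?\{', '{', text)

def strip_first(text):
    """Emulates s/^.*?\\{/{/1: only the leading header is replaced.

    With the ^ anchor (and no MULTILINE flag) the pattern can only match
    once anyway; count=1 is kept to mirror the SEDCMD flag.
    """
    return re.sub(r'^.*?\{', '{', text, count=1)

damaged = strip_all(line)    # text between the braces is eaten as well
cleaned = strip_first(line)  # only the log header is removed

print(damaged)
print(cleaned)

# The anchored, first-match-only result still parses as JSON.
event = json.loads(cleaned)
```

In this emulation the global version also swallows the keys between the outer and inner braces, which is why the anchored, first-match form is the safer choice for nested JSON.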

Re: Remove first part of string before creating a JSON source type

Motivator

Hi,

Thanks for your help.

I have applied this, but I am still getting the full line in Splunk. I'm not sure why; as far as I can tell it should work.

[AMBERRAW:METRIC]
SEDCMD-remove_header = s/^.*?{/{/1
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%3N
TIME_PREFIX = \"ts\":\"
INDEXED_EXTRACTIONS = JSON


Re: Remove first part of string before creating a JSON source type

Motivator

Hi,

I can confirm this works if you use the config below and take the file in directly, without using a transform:

[AMBERRAW:METRICDIRECT]
SEDCMD-removeheader = s/^.*?{/{/1
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%3N
TIME_PREFIX = \"ts\":\"
INDEXED_EXTRACTIONS = JSON

However, in my case the sourcetype is assigned by a transform, and there it does not work, so I will post a separate question on this. (The config below does not work, even though the SEDCMD is exactly the same, so either it is a bug or I am missing something.)

transforms.conf
[AMBERRAWjsonMETRIC]
DEST_KEY = MetaData:Sourcetype
REGEX = {"v":"1.0\"
FORMAT = sourcetype::AMBERRAWMETRIC

props.conf
[AMBERRAW:METRIC]
SEDCMD-remove_header = s/^.*?{/{/1
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%3N
TIME_PREFIX = \"ts\":\"
INDEXED_EXTRACTIONS = JSON
