Getting Data In

Using a transform, I can't use SEDCMD

robertlynch2020
Motivator

Hi

I am taking in data and assigning a new sourcetype, so I need to use a transform for this.
The issue is that when I use the transform, I can't seem to use SEDCMD to trim some of the lines I am taking in.

props.conf
[AMBER_RAW:METRIC]
SEDCMD-remove_header = s/^.*?{/{/1
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%3N
TIME_PREFIX = \"ts\":\"
INDEXED_EXTRACTIONS = JSON

transforms.conf
[AMBER_RAW_json_METRIC]
DEST_KEY = MetaData:Sourcetype
REGEX = {"v":"1.0\"
FORMAT = sourcetype::AMBER_RAW:METRIC

If I don't use the transform, SEDCMD-remove_header works, but I need the transform because of another issue I am having.

Any ideas on how to get around this?
Here is an example of the data. It works if I take it in directly, but I have to use a transform in this case because I have multiple sourcetypes in one file.

2018-01-10 15:50:03 [metrics-application-1-thread-1] INFO  METRIC:41 - {"v":"1.0","t":"MTR","ts":"2018-01-10T15:50:03.704Z","h":"mx7654vm","pid":12483,"src":{"c":"authn-app","d":"auth"},"mtr":{"counters":{"process":{"cpu":{"time_cumulated_s":35},"memory":{"gc":{"ps_marksweep":{"total_duration_ms":814},"ps_scavenge":{"total_duration_ms":539}}}}},"gauges":{"com.murex.serviceframework.rest.datalayer.DataSourceMetrics.datasources.authn-authn-app-1":{"availableConnectionCount":1,"borrowedConnectionCount":0,"currPoolSize":1,"maxPoolSize":50,"poolName":"authn-authn-app-1"},"process":{"cpu":{"percentage":0.0014801778579070887},"files":{"open_files":37},"memory":{"jvm":{"heap":{"committed_kb":195072,"used_kb":111654},"nonheap":{"committed_kb":91456,"used_kb":89829}},"rss_kb":32880864,"vsz_kb":2295108}}},"histograms":{},"meters":{},"timers":{"process":{"memory":{"gc":{"ps_marksweep":{"events":{"count":1,"rate_1m":0.010541994097562058,"rate_5m":0.0030413993186727347,"rate_15m":0.001077675326868502,"rate_mean":0.023586525047214212},"duration_ms":{"max":620.0,"mean":620.0,"median":620.0,"min":620.0,"percentile_75":620.0,"percentile_95":620.0,"percentile_98":620.0,"percentile_99":620.0,"percentile_999":620.0,"standard_deviation":0.0}},"ps_scavenge":{"events":{"count":32,"rate_1m":1.3370365746688775,"rate_5m":1.8460208181687348,"rate_15m":1.9473463934310977,"rate_mean":0.7547234936660224},"duration_ms":{"max":18.0,"mean":9.125,"median":6.5,"min":3.0,"percentile_75":13.0,"percentile_95":18.0,"percentile_98":18.0,"percentile_99":18.0,"percentile_999":18.0,"standard_deviation":5.014495118187132}}}}}}}}

Thanks, in advance
Rob

1 Solution

mayurr98
Super Champion

Hello

The problem is that both rules are index-time rules.
The transform is applied first and renames the sourcetype from the original to the new one (AMBER_RAW:METRIC).
But the event is not parsed a second time (at index time) under the new sourcetype's rules, so the SEDCMD never runs.

NOTE: at search time, the new sourcetype's rules may still apply (for example, a field extraction),
so you will need to apply the sed expression at search time, like this:

index=<your_index> sourcetype=AMBER_RAW:METRIC |  rex mode=sed "s/^.*?{/{/1"

As per the doc below:
http://wiki.splunk.com/Community:HowIndexingWorks
Both SEDCMD and transforms.conf are processed during the 'Typing' queue stage. Because the data arrives with the original sourcetype, the index-time configuration for the original sourcetype takes effect, and the index-time configuration for the new sourcetype (AMBER_RAW:METRIC) never does.

Let me know if this helps!


jconger
Splunk Employee

@mayurr98 is correct about the order of operations in the indexing pipeline. To get around this, you could use CLONE_SOURCETYPE in transforms.conf. According to the documentation, "The cloned event will be further processed by index-time transforms and SEDCMD expressions according to its new sourcetype."


robertlynch2020
Motivator

Hi

Thanks for this. The first answer did work for me, but this is good to know for the future.

Cheers
Robert



robertlynch2020
Motivator

Hi

Thanks. Do you think I can move it from one transform to another so I can take out this data?

Is there anything that can be done to help in this case?

I need to push this data into a data model, which is why I need this.

Thanks
Robert


mayurr98
Super Champion

That will conflict again. What you can do is apply both the SEDCMD and the transform to the original sourcetype and see what happens. This should work because you are applying the configuration to the original sourcetype.
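A minimal sketch of that layout, assuming the transforms.conf stanza from the question stays exactly as posted. `<original_sourcetype>` is a placeholder for whatever sourcetype the input assigns (it is not given in this thread), and the class name `set_metric_sourcetype` is made up for illustration:

```ini
# props.conf
# <original_sourcetype> is a placeholder: use the sourcetype set on the input.
[<original_sourcetype>]
# Same expression as in the question: drop everything before the first "{".
SEDCMD-remove_header = s/^.*?{/{/1
# Hypothetical class name; points at the transform stanza from the question,
# which rewrites the sourcetype to AMBER_RAW:METRIC for matching events.
TRANSFORMS-set_metric_sourcetype = AMBER_RAW_json_METRIC
```

Keep in mind that the SEDCMD is now attached to the original sourcetype, so it will trim every event in that file that contains a "{", not only the ones that end up as AMBER_RAW:METRIC. Test it on sample data before relying on it.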


robertlynch2020
Motivator

Hi
Thanks for this, this worked 🙂
