Getting Data In

After cloning a sourcetype using TRANSFORMS-CLONE, I can't get the timestamp to be read on the new sourcetype

robertlynch2020
Motivator

Hi

I am cloning a sourcetype twice (using TRANSFORMS-CLONE = CLONE_SOURCETYPE_JAVA,CLONE_SOURCETYPE_JAVA1).
Then I define the clones in transforms.conf:

[CLONE_SOURCETYPE_JAVA1]
CLONE_SOURCETYPE = sun_jvm
REGEX = .

[CLONE_SOURCETYPE_JAVA]
CLONE_SOURCETYPE = GC11
REGEX = .

sun_jvm works but GC11 does not (it takes in all lines as one event). I have narrowed it down to the timestamp, which I think is causing the issue.

It looks like a small difference in the timestamp, the surrounding brackets [ ], is preventing Splunk from picking up GC11 correctly.

Working one (sun_jvm):
2020-02-17T20:06:26.345+0100: 0.567: GC 9216K->4524K(32256K), 0.0132560 secs:
2020-02-17T20:06:26.345+0100: 0.567: GC 9216K->4524K(32256K), 0.0132560 secs:
2020-02-17T20:06:26.345+0100: 0.567: GC 9216K->4524K(32256K), 0.0132560 secs:

[sun_jvm]
MAX_TIMESTAMP_LOOKAHEAD = 30
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%3N
TIME_PREFIX = ^
SHOULD_LINEMERGE = false
category = Custom
disabled = false
pulldown_type = true

Non-working one (GC11):
[2020-01-31T21:15:58.195+0100] GC(8) Pause Full (System.gc()) 82M->11M(1024M) 11.992ms
[2020-01-31T22:15:58.204+0100] GC(9) Pause Full (System.gc()) 81M->11M(1024M) 9.231ms
[2020-01-31T23:15:58.215+0100] GC(10) Pause Full (System.gc()) 81M->11M(1024M) 10.501ms

[GC11]
DATETIME_CONFIG = 
LINE_BREAKER = ([\r\n]+)
NO_BINARY_CHECK = true
category = Custom
disabled = false
pulldown_type = true
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%3N+0100]
TIME_PREFIX = ^\[
MAX_TIMESTAMP_LOOKAHEAD = 100

If I send the data directly into GC11 it works, but if I send it as a clone, Splunk picks up the data as one big event and does not break it into multiple lines.


Other information that might help: I take the data in and clone it with the stanza below. With sun_jvm I am able to break it into multiple lines, but not with GC11. Any help would be great, thanks 🙂

[G1]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%3N%z
TIME_PREFIX = ^
MAX_TIMESTAMP_LOOKAHEAD = 28
DATETIME_CONFIG = 
NO_BINARY_CHECK = true
category = Custom
pulldown_type = 1
disabled = false

woodcock
Esteemed Legend

Your TIME_FORMAT is wrong, for one thing; it should be this:

LINE_BREAKER = ([\r\n]+)
NO_BINARY_CHECK = true
category = Custom
disabled = false
TIME_PREFIX = ^\[
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%3N%z
MAX_TIMESTAMP_LOOKAHEAD = 28

But that's not your real problem. If it truly isn't working, then EITHER you don't have newlines where you think you do OR you have deployed it wrong. If you are doing a sourcetype override/overwrite, you must use the ORIGINAL value, NOT the new value. You must deploy your settings to the first full instance(s) of Splunk that handle the events (usually either the HF tier if you use one, or else your Indexer tier) UNLESS you are using HEC's JSON endpoint (it gets pre-cooked) or INDEXED_EXTRACTIONS (configs go on the UF in that case), then restart all Splunk instances there. When (re)evaluating, you must send in new events (old events will stay broken), then test using _index_earliest=-5m to be absolutely certain that you are only examining the newly indexed events.
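
As a quick sanity check on the corrected TIME_FORMAT and MAX_TIMESTAMP_LOOKAHEAD above, here is a small Python sketch (not Splunk itself) that parses one of the GC11 sample lines from the question. It assumes Python's %f is a reasonable stand-in for Splunk's %3N. It only illustrates that the timestamp carries a UTC offset that %z can read, and that the timestamp text after the opening bracket is 28 characters long.

```python
from datetime import datetime, timedelta

# Sample GC11 line copied from the question.
line = "[2020-01-31T21:15:58.195+0100] GC(8) Pause Full (System.gc()) 82M->11M(1024M) 11.992ms"

# TIME_PREFIX = ^\[ skips the opening bracket, so the timestamp starts at index 1
# and runs up to (not including) the closing bracket.
stamp = line[1:line.index("]")]

# Length of the timestamp text that the lookahead window has to cover.
stamp_length = len(stamp)

# Assumption: Python's %f (fractional seconds) stands in for Splunk's %3N.
parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%f%z")
offset = parsed.utcoffset()

print(stamp_length)  # 28
print(parsed.isoformat())
print(offset)
```

A literal "+0100]" in TIME_FORMAT would instead treat the offset as fixed text, which is why %z is the safer choice if the offset ever changes (for example across daylight saving time).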


jadengoho
Builder

Hi @woodcock 

I would like to ask a similar question: what do you mean by "If you are doing a sourcetype override/overwrite, you must use the ORIGINAL value, NOT the new value"?

I have run into a similar issue and have not found a solution yet.

If I apply my time recognition parameters (e.g. TIME_PREFIX, TIME_FORMAT) to the ORIGINAL sourcetype, they work.

But if I apply them to the CLONED one, they do not.

How can I apply them properly to the CLONED sourcetype, given that the ORIGINAL sourcetype has other logs to distribute?


robertlynch2020
Motivator

Thanks for the reply (as always :)).

So I updated the time format, but still no luck.

I am sending the data in from a forwarder to one Splunk install (there are no other installs in play here).

[monitor:///net/hp737srv/hp737srv1/apps/TEST/JAVA_11_TEST_FILES/ALL_3_JAVA_FILES.../*]
disabled = true
host = JAVA_11_TEST56
index = mlc_live
whitelist = .*.gc.*.log$|gc_.*\.log$|GC_.*\.log$
sourcetype = G1
crcSalt = <SOURCE>

If I take one file and send it directly to GC11, it works. I just checked, and it does have CRLF at the end of each line.

[GC11]
LINE_BREAKER = ([\r\n]+)
NO_BINARY_CHECK = true
category = Custom
disabled = false
TIME_PREFIX = ^\[
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%3N%z
MAX_TIMESTAMP_LOOKAHEAD = 100

Not working (when I use TRANSFORMS-CLONE, but OK when sent directly into the GC11 sourcetype):
[2020-01-31T15:15:54.526+0100] Using G1
[2020-01-31T15:15:56.029+0100] GC(0) Pause Young (Normal) (G1 Evacuation Pause) 62M->4M(1024M) 8.384ms
[2020-01-31T15:15:58.104+0100] GC(1) Pause Young (Concurrent Start) (Metadata GC Threshold) 283M->12M(1024M) 17.161ms
[2020-01-31T15:15:58.104+0100] GC(2) Concurrent Cycle
[2020-01-31T15:15:58.109+0100] GC(2) Pause Remark 12M->12M(1024M) 1.751ms
[2020-01-31T15:15:58.109+0100] GC(2) Pause Cleanup 12M->12M(1024M) 0.156ms

As a test, I took the data and removed the [] to see if it would work, and it did.
Working, both direct from the forwarder and using TRANSFORMS-CLONE:
2020-01-31T15:15:54.526+0100 Using G1
2020-01-31T15:15:56.029+0100 GC(0) Pause Young (Normal) (G1 Evacuation Pause) 62M->4M(1024M) 8.384ms
2020-01-31T15:15:58.104+0100 GC(1) Pause Young (Concurrent Start) (Metadata GC Threshold) 283M->12M(1024M) 17.161ms
2020-01-31T15:15:58.104+0100 GC(2) Concurrent Cycle
2020-01-31T15:15:58.109+0100 GC(2) Pause Remark 12M->12M(1024M) 1.751ms
2020-01-31T15:15:58.109+0100 GC(2) Pause Cleanup 12M->12M(1024M) 0.156ms

I am unsure what you meant by "you must use the ORIGINAL value, NOT the new value". Is the ORIGINAL the _raw data, rather than the data after the first sourcetype might have changed it?

The first sourcetype is below; I take the file in blocks as I need to. The idea is to take it in as G1, and then sun_jvm and GC11 both need single-line events.
As I have said, sun_jvm works just fine; the timestamp is the same but with no [].

[G1]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%3N%z
TIME_PREFIX = ^
MAX_TIMESTAMP_LOOKAHEAD = 28
DATETIME_CONFIG = 
NO_BINARY_CHECK = true
category = Custom
pulldown_type = 1
disabled = false
TRANSFORMS-CLONE = CLONE_SOURCETYPE_JAVA,CLONE_SOURCETYPE_JAVA1

As always, any help would be great, as I am starting to think there might be a bug in Splunk (I hope not!)

Cheers
Robbie
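
One thing that can be checked outside Splunk is whether the [G1] LINE_BREAKER matches the bracketed lines at all. The Python sketch below is only an approximation: it assumes Python's re module behaves like Splunk's PCRE engine for this simple pattern, and RELAXED_BREAKER (with an optional \[) is a hypothetical variant that does not appear anywhere in this thread. Whether cloned events keep the event boundaries chosen by the original sourcetype is an assumption here, not something confirmed in the thread.

```python
import re

# LINE_BREAKER copied from the [G1] stanza in the question.
G1_BREAKER = re.compile(r"([\r\n]+)\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}")

# Hypothetical variant (not from the thread) that tolerates an optional leading "[".
RELAXED_BREAKER = re.compile(r"([\r\n]+)\[?\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}")

# Three sample lines from the thread, joined with CRLF as described in the post.
plain = "\r\n".join([
    "2020-01-31T15:15:54.526+0100 Using G1",
    "2020-01-31T15:15:58.104+0100 GC(2) Concurrent Cycle",
    "2020-01-31T15:15:58.109+0100 GC(2) Pause Remark 12M->12M(1024M) 1.751ms",
])
bracketed = "\r\n".join([
    "[2020-01-31T15:15:54.526+0100] Using G1",
    "[2020-01-31T15:15:58.104+0100] GC(2) Concurrent Cycle",
    "[2020-01-31T15:15:58.109+0100] GC(2) Pause Remark 12M->12M(1024M) 1.751ms",
])


def count_breaks(pattern, text):
    """Count how many event boundaries the pattern finds in the text."""
    return len(pattern.findall(text))


print(count_breaks(G1_BREAKER, plain))           # 2
print(count_breaks(G1_BREAKER, bracketed))       # 0
print(count_breaks(RELAXED_BREAKER, plain))      # 2
print(count_breaks(RELAXED_BREAKER, bracketed))  # 2
```

The original pattern finds no boundaries in the bracketed sample because a "[" sits between the line ending and the date. If the clones do inherit G1's event boundaries, that would be consistent with GC11 arriving as one big event while the unbracketed test data works, so a variant along these lines may be worth testing on new events.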
