I need some help getting me config right in pros.conf.
When the data comes I can see the _time is not set to the value passed for TimeStamp. It is set to the time the event was ingested. For legacy reasons we will have a queue between the app and logging hosts so there will be latency on the messages so setting it to the passed time is critical... The one thing I have noticed is that the decimal places on TimeStamp vary form 5 to 7 digits and we are using %9N. I tried setting this to %3N hoping it would just ignore the characters after but no joy. Do we need to make sure we fix the number of decimals in the logging code?
I do not see any data issues when checking DataQuality.
Any help is very much appreciated!
I am using the following query to evaluate the drift. I know it is not being set as I have cause a delay by sitting on a breakpoint in the logging code.
index=telemetry_*_event_*
| fields _time, TimeStamp, index
| fields - _raw
| eval epoch_time_span=strptime('TimeStamp',"%Y-%m-%dT%H:%M:%S.%9N")
| eval diff=epoch_time_span - _time
| table _time, TimeStamp, epoch_time_span, diff, index
| sort diff
Sample results:
_time TimeStamp epoch_time_span diff
2020-05-08T16:04:10.324-0600 2020-05-08T16:04:04.5663643Z 1588953845 -5.758
2020-05-08T16:01:19.641-0600 2020-05-08T16:01:19.5349868Z 1588953680 -0.106
2020-05-08T15:54:05.559-0600 2020-05-08T15:54:05.4668267Z 1588953245 -0.092
2020-05-08T15:54:17.723-0600 2020-05-08T15:54:17.715911Z 1588953258 -0.007
2020-05-08T16:01:31.924-0600 2020-05-08T16:01:31.9176148Z 1588953692 -0.006
2020-05-08T16:01:34.754-0600 2020-05-08T16:01:34.7519748Z 1588953695 -0.002
Here is the _raw data from splunk:
{"TimeStamp":"2020-05-08T16:04:20.6492094Z","Level":"Debug","Properties":{"Action":"XXXX","Channel":"XXXX","CorrelationID":"7c003283-a81e-4b11-97ff-c926e53f4fa6","Host":"XXXX","ServiceID":{"Application":{"Name":"XXXX"},"Environment":"development","Tenant":"XXXX"},"ProcessID":"22908","ProcessName":"XXXX","ThreadID":"18872","ThreadName":"XXXX","User":{"Domain":"XXXX","ID":"4","Name":"XXXX","Location":"XXXX","Custom":{"OrganizationCode":"XXXX"}},"Performance":{"DataCallCount":1,"ElapsedTime":122.0,"ElapsedTimeSpan":"0:00:00.122","HasError":false,"Outbound":false,"Ticks":1223834}},"Version":"0.0.1.19100"}
Here is the config:
[telemetry_source_type]
#Internal
pulldown_type = true
#Meta data
category = Structured
description = JSON based source Type for Telemetry events
disabled = false
#We can change these
ANNOTATE_PUNCT = false
BREAK_ONLY_BEFORE_DATE = false
DATETIME_CONFIG = NONE
KV_MODE = json
LINE_BREAKER = ([\r\n]+)
MAX_TIMESTAMP_LOOKAHEAD = 32
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = false
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%9N
TIME_PREFIX = "TimeStamp":"
TRUNCATE = 4194304
TZ = GMT
Thanks all for the inputs. I finally figured it out and it was not related to pros.conf. I got into the logging code and found we were not explicitly setting "time" on the HTTP request to the /services/collector/event because we called the wrong version of the Send() method in the Slunk Logging DLL. Face in Palm.
The "time" in the HTTP call appears to override whatever you have set up for a source type. So no matter what I did it used the value passed. Since we did not use the right Send() method the time defaulted to the transmission time; not the internal logged time. So that is why my debugger caused it to look like it was drifting.
Thanks again all!
Thanks all for the inputs. I finally figured it out and it was not related to pros.conf. I got into the logging code and found we were not explicitly setting "time" on the HTTP request to the /services/collector/event because we called the wrong version of the Send() method in the Slunk Logging DLL. Face in Palm.
The "time" in the HTTP call appears to override whatever you have set up for a source type. So no matter what I did it used the value passed. Since we did not use the right Send() method the time defaulted to the transmission time; not the internal logged time. So that is why my debugger caused it to look like it was drifting.
Thanks again all!
Hi @stirlec,
Thanks for asking a question! If you were able to resolve this issue, please click “Accept” directly below the answer to resolve the post.
[telemetry_source_type]
SHOULD_LINEMERGE=false
NO_BINARY_CHECK=true
TRUNCATE=0
pulldown_type = true
category = Structured
description = JSON based source Type for Telemetry events
disabled = false
ANNOTATE_PUNCT = false
KV_MODE = json
INDEXED_EXTRACTION = none
The _time
was automatically determined by UTC.
Thank you for the response!
Unfortunately it did not work. I used the exact setting you recommended on my dev server and I still have differences in the _time and my passed TimeStamp.
Examples:
_time epoch_time TimeStamp epoch_time_span diff
2020-05-14T20:35:41.977-0600 1589488541.977000 2020-05-14T20:35:41.8870067Z 1589488541.887000 -0.09
2020-05-14T20:35:54.126-0600 1589488554.126000 2020-05-14T20:35:54.1202827Z 1589488554.120280 -0.006
TIME_FORMAT = %FT%T.%9QZ
Thanks for the feedback!
Still no joy. I am really at a loss. Maybe it is because we are using HEC? When I test drift I set a breakpoint in the application logging code to cause a delay between the app server timestamp and when it gets delivered. I cannot get them to match 100% no matter what I try. My most recent change was to fix the timestamp being generated to yyyy-MM-ddTH:mm:ss.fffK".
I even built a new source type for the HEC input using the data "Add Data" tool. It could see the correct timestamp for sure. I switched the HEC input to use the new source type... But when I ran more test load I still got drift. It generated this:
[test_json]
DATETIME_CONFIG =
INDEXED_EXTRACTIONS = json
KV_MODE = none
LINE_BREAKER = ([\r\n]+)
NO_BINARY_CHECK = true
TIMESTAMP_FIELDS = TimeStamp
TIME_FORMAT = %FT%T.%9QZ
TZ = GMT
category = Structured
description = JavaScript Object Notation format. For more information, visit http://json.org/
disabled = false
pulldown_type = true
5/18/20
6:28:53.665 PM
{"TimeStamp":"2020-05-18T18:28:53.577Z","Level":"Information","Properties":{"Action":"Navitaire.Ncl.Encryption.RSAInitializerFactory.Initialize","Host":"NAVL79606","ServiceID":{"Application":{"Name":"New Skies 4.x.0.0 1L-development Component Host"},"Environment":"development","Tenant":"1L"},"ProcessID":"26960","ProcessName":"C:\Projects\New Skies\4.x\src\bin\Debug\Navitaire.Ncl.ComponentHost.exe","ThreadID":"500","ThreadName":"Navitaire.Ncl.ComponentHost","Error":{"Code":"SecureKey","Level":"Information","Message":"No key file path specified. Using application runtime directory:\r\nC:\Projects\New Skies\4.x\src\bin\Debug","Type":"RSAInitializerFactory"}},"Version":"0.0.1.19100"}
NOTE: We are using Splunk.Logging.TraceListener via NUGET to log. Could it be the way we have that configured?
Which HEC endpoints are you using /services/collector/event
OR /services/collector/raw
?