All Apps and Add-ons

Why is the Splunk App for Stream not extracting the correct _time field in my distributed search environment?

Splunk Employee
Splunk Employee

Hello,

I have a universal forwarder and an indexer, and deployed the Splunk App for Stream app as documented.
My problem is about extracting _time.

From the Splunk_TA_stream/default/props.conf file of the indexer, I could find the following about the _time extraction.

 [source::stream:*]
 KV_MODE = json
 NO_BINARY_CHECK = 1
 SHOULD_LINEMERGE=false
 # stream events use "timestamp" for the start of the event, and
 # "endtime" for the end of the event; the latter is used for _time
 TIME_PREFIX=\"endtime\":\"

It is configured to extract _time value from the "endtime" tag of json, but as you can see from the attached image, the extracted _time has different values. Because I'm in GMT+9, the difference between endtime and _time should be 9 hours exactly, but the seconds or minutes values are different. It looks to use 'current time'. I tried to change several configurations including MAX_TIMESTAMP_LOOKAHEAD, but not successful.

The _time was extracted correctly in a single node environment. This problem only seems to happen in a distributed env.
Is there anything that I need to check?

alt text

0 Karma

Splunk Employee
Splunk Employee

( My comment must be too long to be attached as a comment to your comment. )

Because I don't have a separate search head, I installed Stream app on an indexer.
I have 1 UF and 1 IDX on my mac book. This problem was found from a customer site (an universal forwarder and an indexer instances are installed separately on 2 Linux 64bit machines for each.) and I reproduced the same problem on my mac book.

Regarding the locale config, my mac book has followings...
Because I used same machine for single instance environment (indexer only), the locale was same.
alt text

And the raw stream:tcp events from universal forwarder look like followings....
alt text

I also tested with "MAX_TIMESTAMP_LOOKAHEAD=0" but not helpful.

With the query you said, I got followings. timestartpos and timeendpos fields were not accessible.

_time endtime timestamp raw_timestamp timestartpos timeendpos
2015-11-09 09:56:36 2015-11-09T00:53:20.875968Z 2015-11-09T00:52:34.964881Z

2015-11-09 09:56:36 2015-11-09T00:52:35.397372Z 2015-11-09T00:52:34.968693Z

2015-11-09 09:56:36 2015-11-09T00:53:21.533349Z 2015-11-09T00:52:38.447201Z

2015-11-09 09:56:36 2015-11-09T00:53:21.533539Z 2015-11-09T00:52:36.362271Z

2015-11-09 09:56:36 2015-11-09T00:53:21.533048Z 2015-11-09T00:52:35.574232Z

2015-11-09 09:56:25 2015-11-09T00:53:13.186080Z 2015-11-09T00:52:41.646930Z

2015-11-09 09:56:15 2015-11-09T00:52:56.839069Z 2015-11-09T00:52:56.472293Z

2015-11-09 09:56:04 2015-11-09T00:52:45.785751Z 2015-11-09T00:52:44.707164Z

2015-11-09 09:56:04 2015-11-09T00:52:48.802784Z 2015-11-09T00:52:37.999926Z

2015-11-09 09:56:04 2015-11-09T00:52:48.800284Z 2015-11-09T00:52:37.999408Z

0 Karma

Splunk Employee
Splunk Employee

Seems like timestamp detection doesn't work on your instance. Can you try adding the following to the props.conf in the same [source::stream:*] stanza:

TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%6N%z
0 Karma

Splunk Employee
Splunk Employee

The TIME_FORMAT seems working for events collected by local indexer nodes. But not working for the data which was collected from universal forwarder and all DATE_* fields are not parsed correctly.

0 Karma

Splunk Employee
Splunk Employee

Seems like there's something wrong with UF configuration.. I think it's probably worth opening a support case for this issue as I'm running out of ideas how to fix it without looking at the UF config, etc.

0 Karma

Splunk Employee
Splunk Employee

hi kwchang,

Do you have Splunk_TA_stream installed on all indexers? I believe if your stream events are being forwarded by UF (as opposed to heavy forwarder), the timestamp extraction is performed on the indexer, so the indexer must have stream's props.conf stanza in order to extract the timestamp correctly.

0 Karma

Splunk Employee
Splunk Employee

..forgot to mention that the Stream modular input on indexers should (obviously) be disabled

0 Karma

Splunk Employee
Splunk Employee

Thank you for your answer.

Of course, I have Splunk_TA_stream and splunk_app_stream on my indexer.
I tried to make some changes in the [source::stream:*] stanza in etc/apps/Splunk_TA_stream/default/props.conf in indexer, but it didn't affect to the result.

The stream modular input on indexer side was already disabled. Is this mandatory?

0 Karma

Splunk Employee
Splunk Employee

Hmm, that's interesting.

No, disabling the Stream modular input on indexer is not mandatory. Also, you don't need the splunk_app_stream on an indexer, unless it's also a search head.

What's the locale of the indexers in the new environment vs. the one with "working" timestamps? (I'm not sure splunkd takes into an account, but I guess it's worth checking).

How does the raw stream:tcp event look like? Does it have endtime field in the beginning of the event? Stream used to have a timestamp parsing issue caused by endtime field not being within the MAX_TIMESTAMP_LOOKAHEAD interval, but it was fixed in 6.3 or one of the 6.3.x maintenance releases.

What does the following query outputs on your instance?

sourcetype="stream:tcp" | eval raw_timestamp=substr(_raw, timestartpos+1, timeendpos-timestartpos) | table _time, endtime, timestamp, raw_timestamp, timestartpos, timeendpos

0 Karma

Splunk Employee
Splunk Employee

I don't have separate search head, so I installed Stream App on indexer.
I have Universal Forwarder and Indexer on a same box, my mac book. So the locale is the same.

kwchang@MacBook: ~ $ locale
LANG=
LC_COLLATE="C"
LC_CTYPE="C"
LC_MESSAGES="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_ALL=
kwchang@MacBook: ~ $ sudo systemsetup -gettimezone
Password:
Time Zone: Asia/Seoul

And the raw events of stream:tcp from universal forwarder look like followings.

{"endtime":"2015-11-09T00:40:08.061276Z","timestamp":"2015-11-09T00:39:55.449199Z","src_ip":"172.20.10.2","src_mac":"1A:F6:43:34:EF:AA","src_port":49977,"connection":"54.251.115.138:443","client_rtt":140,"client_rtt_packets":1,"client_rtt_sum":140,"ack_packets_in":3,"bytes_in":276,"data_packets_in":0,"duplicate_packets_in":0,"missing_packets_in":0,"packets_in":4,"app":"unknown","server_rtt":0,"server_rtt_packets":0,"server_rtt_sum":0,"dest_ip":"54.251.115.138","dest_mac":"3A:F6:43:43:54:64","dest_port":443,"ack_packets_out":2,"bytes_out":140,"data_packets_out":0,"duplicate_packets_out":0,"missing_packets_out":0,"packets_out":2,"tcp_status":0,"time_taken":12612217}

{"endtime":"2015-11-09T00:40:08.051195Z","timestamp":"2015-11-09T00:39:57.078105Z","src_ip":"172.20.10.2","src_mac":"1A:F6:43:34:EF:AA","src_port":50008,"connection":"184.73.197.150:443","client_rtt":140,"client_rtt_packets":1,"client_rtt_sum":140,"ack_packets_in":3,"bytes_in":276,"data_packets_in":0,"duplicate_packets_in":0,"missing_packets_in":0,"packets_in":4,"app":"unknown","server_rtt":0,"server_rtt_packets":0,"server_rtt_sum":0,"dest_ip":"184.73.197.150","dest_mac":"3A:F6:43:43:54:64","dest_port":443,"ack_packets_out":2,"bytes_out":140,"data_packets_out":0,"duplicate_packets_out":0,"missing_packets_out":0,"packets_out":2,"tcp_status":0,"time_taken":10973230}

Sometimes, there seems to have issues in event breaking. Following data was captured as a single event.
{"timestamp":"2015-11-09T09:12:114Z","endtime":"2015-11-09T09:14:514Z","src_ip":"95.149.230.218","dest_ip":"55.2.98.115","src_port":"36998","dest_port":"10043","src_mac":"00:50:56:92:4E:26","dest_mac":"00:1B:17:00:01:30","packets_in":13,"packets_out":13,"ack_packets_in":6,"ack_packets_out":7,"missing_packets_in":0,"missing_packets_out":0,"duplicate_packets_in":0,"duplicate_packets_out":0,"data_packets_in":6,"data_packets_out":6,"bytes_in":525,"bytes_out":4137,"time_taken":65596,"ssl_version":"3.0","ssl_session_id":"bd608869f0c629767ea7e3ebf7a63bdcffb0ef58b1b941e6b0c044acb6820a77","ssl_cert_md5":"AE435972651BFE75B5E8547BF7A35AC2","ssl_subject":"OU=Domain Control Validated,CN=*.dropbox.com","ssl_issuer":"C=US,O=\"GeoTrust, Inc.\",CN=RapidSSL CA","ssl_signature_algorithm":"sha1WithRSAEncryption","ssl_publickey_algorithm":"rsaEncryption","ssl_serialnumber":"25196489825037653343","ssl_validity_end":"Apr 10 21:21:59 2016 GMT","ssl_validity_start":"Apr 10 21:21:59 2014 GMT","client_rtt":62,"server_rtt":25593,"client_rtt_sum":186,"server_rtt_sum":25593,"client_rtt_packets":3,"server_rtt_packets":1,"connection":"54.200.187.224:443","tcp_status":0,"app":"ssl"} {"timestamp":"2015-11-09T09:12:374Z","endtime":"2015-11-09T09:14:344Z","src_ip":"82.147.19.83","dest_ip":"184.50.214.228","src_port":"48497","dest_port":"10043","src_mac":"00:50:56:92:4E:26","dest_mac":"00:1B:17:00:01:30","packets_in":13,"packets_out":13,"ack_packets_in":6,"ack_packets_out":7,"missing_packets_in":0,"missing_packets_out":0,"duplicate_packets_in":0,"duplicate_packets_out":0,"data_packets_in":6,"data_packets_out":6,"bytes_in":525,"bytes_out":4137,"time_taken":65596,"ssl_version":"3.0","ssl_session_id":"bd608869f0c629767ea7e3ebf7a63bdcffb0ef58b1b941e6b0c044acb6820a77","ssl_cert_md5":"AE0E54D5E06C387EE6725908B44368E8","ssl_subject":"CN=sc-db-centos2.sv.splunk.com,O=SplunkUser","ssl_issuer":"C=US,O=DigiCert Inc,OU=www.digicert.com,CN=DigiCert High Assurance EV Root CA","ssl_signature_algorithm":"sha1WithRSAEncryption","ssl_publickey_algorithm":"rsaEncryption","ssl_serialnumber":"11712524697123285136","ssl_validity_end":"May 10 21:21:59 2014 GMT","ssl_validity_start":"Apr 10 21:21:59 2014 GMT","client_rtt":62,"server_rtt":25593,"client_rtt_sum":186,"server_rtt_sum":25593,"client_rtt_packets":3,"server_rtt_packets":1,"connection":"54.200.187.224:443","tcp_status":0,"app":"ssl"} {"timestamp":"2015-11-09T09:11:174Z","endtime":"2015-11-09T09:12:214Z","src_ip":"70.146.92.251","dest_ip":"183.20.127.198","src_port":"18737","dest_port":"8443","src_mac":"00:50:56:92:4E:26","dest_mac":"00:1B:17:00:01:30","packets_in":13,"packets_out":13,"ack_packets_in":6,"ack_packets_out":7,"missing_packets_in":0,"missing_packets_out":0,"duplicate_packets_in":0,"duplicate_packets_out":0,"data_packets_in":6,"data_packets_out":6,"bytes_in":525,"bytes_out":4137,"time_taken":65596,"ssl_version":"3.0","ssl_session_id":"bd608869f0c629767ea7e3ebf7a63bdcffb0ef58b1b941e6b0c044acb6820a77","ssl_cert_md5":"7FEB15E4180D15E0196F298554DAEC6F","ssl_subject":"C=US,postalCode=20814,ST=Maryland,L=Bethesda,STREET=Suite 205,STREET=8120 Woodmont Ave,O=The SANS Institute,OU=Network Operations Center (NOC),OU=Unified Communications,CN=isc.sans.org","ssl_issuer":"C=US,ST=Arizona,L=Scottsdale,O=\"GoDaddy.com, Inc.\",CN=Go Daddy Root Certificate Authority - G2","ssl_signature_algorithm":"sha1WithRSAEncryption","ssl_publickey_algorithm":"rsaEncryption","ssl_serialnumber":"40564819207326872660","ssl_validity_end":"Jun 10 21:21:59 2014 GMT","ssl_validity_start":"Apr 10 21:21:59 2014 GMT","client_rtt":62,"server_rtt":25593,"client_rtt_sum":186,"server_rtt_sum":25593,"client_rtt_packets":3,"server_rtt_packets":1,"connection":"54.200.187.224:443","tcp_status":0,"app":"ssl"}

I also changed MAX_TIMESTAMP_LOOKAHEAD=0 but not helpful.

With the search you said, the timeendpos and timestartpos fields were not captured and I got followings.

sourcetype="stream:tcp" | eval raw_timestamp=substr(_raw, timestartpos+1, timeendpos-timestartpos) | table _time, endtime, timestamp, raw_timestamp, timestartpos, timeendpos

_time endtime timestamp
2015-11-09 09:55:43 2015-11-09T00:52:30.039647Z 2015-11-09T00:40:14.815575Z
2015-11-09 09:55:43 2015-11-09T00:52:28.041338Z 2015-11-09T00:44:07.828987Z
2015-11-09 09:55:43 2015-11-09T00:52:28.317332Z 2015-11-09T00:44:25.948623Z
2015-11-09 09:55:32 2015-11-09T00:51:37.838308Z 2015-11-09T00:45:41.403840Z

Thank you.

0 Karma