Deployment Architecture

[SmartStore] on Google Cloud Storage

Path Finder

Disclaimer: I realize Google Cloud Storage isn't officially supported, but it is S3-compliant, so it's being tested.

I'm running into the following issue, whose root cause I'm not able to determine, when running S2 on GCP Cloud Storage buckets.

03-13-2019 09:39:07.194 -0500 ERROR S3Client - command=multipart-upload command=begin transactionId=0x2afb6da35000 rTxnId=0x2afb6da42000 status=completed success=N uri=https://storage.googleapis.com/<CLOUDBUCKET>/<INDEXNAME>/db/bb/e7/94~298A356E-F9F7-4713-B865-B9C7F7926ECB/guidSplunk-97A1AC40-F222-4E18-A4BF-86EFF89E8EA9/1552404180-1552329547-2210723404812320681.tsidx statusCode=400 statusDescription="Bad Request" payload="<?xml version='1.0' encoding='UTF-8'?><Error><Code>InvalidArgument</Code><Message>Invalid argument.</Message><Details>POST object expects Content-Type multipart/form-data</Details></Error>"

Looking at the indexes.conf spec, there is a config available that can be set for specific headers.

remote.s3.header.POST.Content-Type = "multipart/form-data"

That did not help, though. In fact, it caused an immediate crash of the indexers when I applied the bundle.

So my question is: what are some other facets I can look at to understand the underlying problem here? It seems that when it comes time to roll the bucket to warm, the triggered upload to remote storage fails.

1 Solution

Path Finder

Solved!

So, as noted in previous comments, the issue here comes down to the difference between how AWS and Google handle multipart uploads. The details of this can be found in the links referenced earlier.

The solution (or workaround depending on how one wants to view it) is to set the following:

# in indexes.conf
[volume:remote_store]
# ... remote config values ...
remote.s3.multipart_download.part_size = 0
remote.s3.multipart_upload.part_size = 2147483648  #2GB, or some value less than 5GB, the GCS limit

This ensures that the S3Client will not attempt a multipart upload for objects smaller than the stated part size. With maxDataSize set to auto, buckets default to roughly 750MB, so none of the large objects, like tsidx files, will be uploaded as multipart.
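For intuition, the part-size threshold works like this (an illustrative sketch, not Splunk's actual S3Client code; the function name and sizes are my own):

```python
def upload_strategy(object_size_bytes, multipart_part_size_bytes):
    """Mimic the documented behavior: objects smaller than the configured
    part size go up as a single PUT; only larger objects would trigger an
    S3 multipart upload (which GCS's S3 interop layer rejects)."""
    if object_size_bytes < multipart_part_size_bytes:
        return "single-put"
    return "multipart"

# With maxDataSize=auto, buckets are ~750MB, well under the 2GB part size,
# so every object is uploaded with a single PUT that GCS accepts.
print(upload_strategy(750 * 1024 * 1024, 2147483648))  # single-put
```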



Path Finder

I'm going to take my digging as the answer here.

In 7.2.3 and 7.2.4.2, I'm finding that this may be due to a difference in how Google implements the S3 API. So currently, GCS is not a viable option for S2.

For reference, this includes testing various configurations like:
- remote.s3.header.POST.Content-Type // to resolve the multipart-upload error
- remote.s3.use_delimiter // to see if the guidSplunk delimiter was the issue
- use_batch_remote_rep_changes // to see if it was related to a race condition with the CM making calls to the peers


Ultra Champion

Just looking at that error again, I’m not sure it’s a missing header it’s complaining about so much as the actual content of the post data.

I wonder if Splunk is not sending “multi/form” yet Google is expecting it.

Not sure how S2 posts, but maybe you need to send a header to tell Google to expect something other than multi/form?

Just a guess.


Path Finder

I was thinking something similar.

Reading through the Google documentation here:
https://cloud.google.com/storage/docs/json_api/v1/how-tos/multipart-upload
It talks about "multipart/related", not "multipart/form-data"

Digging further, I'm finding that this may be due to a difference in how Google implements the S3 API.

https://www.zenko.io/blog/four-differences-google-amazon-s3-api/
and referenced here:
https://github.com/kahing/goofys/issues/259#issuecomment-355713879
which notes: "This is because GCS's S3 implementation does not support S3 multipart uploads and instead reinvented another API (ugh). Coincidentally I've started working on a fix for this so stay tuned."
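For comparison, Google's own JSON API expects a single multipart/related request whose first part is JSON metadata and whose second part is the media, which is structurally different from S3's initiate/upload-part/complete sequence. A minimal sketch of building such a body, following the GCS documentation linked above (the helper and boundary value are my own):

```python
import json


def build_multipart_related(object_name, media, boundary="splunk_boundary"):
    """Build a multipart/related request body per the GCS JSON API:
    part 1 is the JSON object metadata, part 2 is the media content."""
    metadata = json.dumps({"name": object_name})
    body = (
        f"--{boundary}\r\n"
        "Content-Type: application/json; charset=UTF-8\r\n\r\n"
        f"{metadata}\r\n"
        f"--{boundary}\r\n"
        "Content-Type: application/octet-stream\r\n\r\n"
        f"{media}\r\n"
        f"--{boundary}--"
    )
    content_type = f"multipart/related; boundary={boundary}"
    return content_type, body
```

An S3 client like Splunk's S3Client never constructs a body in this shape, which is consistent with GCS answering the multipart-upload POST with 400 InvalidArgument.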


Ultra Champion

Having looked at those links, I agree.

Sadly, I think you have your answer about GCS.
(for now)


Ultra Champion

What Splunk version are you running?

It may not be related to your issue, but you should be on 7.2.4.2.
Earlier versions did not support DMA on S2, and there is a hotfix in the latest .2 release specifically for S2.


Path Finder

No dice. On version 7.2.4.2, the multipart-upload POST error above persists.

Also, the use of the remote.s3.header.x.x config still triggers an immediate crash. The crash log shows it's the cachemanager that's choking on it.


Path Finder

We're currently running 7.2.3.

I know there are some dashboards added to the MC in 7.2.4, but we're not quite there yet.

Also, the indexes that we're testing do not use DMA. That was part of our test prep.


Ultra Champion

Splunk Enterprise 7.2.4.2
This release addresses an issue that might impact data durability under certain rare cluster conditions. The issue is triggered when there is a confluence of data replication errors from index clustering as well as an upload to the Splunk object store medium (SmartStore) via secondary or tertiary replication nodes.

While the incidence of the condition is rare and the impact is negligible, we recommend that customers that are currently using SmartStore in a clustered production environment upgrade to version 7.2.4.2 and set max_replication_errors in server.conf to 20.
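For reference, the release note's recommendation maps to a server.conf change along these lines (stanza placement per the server.conf spec; verify against your version's documentation):

```
# in server.conf on the cluster peers
[clustering]
max_replication_errors = 20
```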


Path Finder

This could be it, since the error is related to the multipart-upload command. It would've been nice if the release notes had been a bit more specific about what they were resolving.

I'm going to try the upgrade path in a test environment and see if that resolves the error. I'll report back my findings.


Ultra Champion

Posted for info ^ Sadly, it does not appear to refer to your issue.
