Can I tell Hunk to combine 3 different folders in an S3 bucket into a single Virtual Index?

Explorer

Hi

I am using Hunk on Amazon EMR. My source data is in 3 different folders in an S3 bucket. Can I tell Hunk to combine these 3 sources into a single Virtual Index? If not, what's the best way around this?

Thanks

Nick


Re: Can I tell Hunk to combine 3 different folders in an S3 bucket into a single Virtual Index?

Splunk Employee

Yes, a single virtual index can point to multiple input folders in an S3 bucket. You can do this in the UI by adding additional settings for each additional folder. For example:

vix.input.2.path = s3n://mybucket/folder2/...
vix.input.3.path = s3n://mybucket/folder3/...
...

vix.input.[unique-number].path = [path-to-s3-folder]/...

The "..." at the end just signifies that we want to look recursively into that folder.
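
For illustration, a complete virtual index stanza covering all three folders might look like the sketch below. The stanza name, provider name, bucket, and folder names are placeholders and not values from the original question; only the vix.provider and vix.input.[unique-number].path settings are taken from this thread.

```ini
# Sketch only: stanza, provider, bucket, and folder names are hypothetical.
[my_vix]
vix.provider = my_provider
# One numbered input per folder; the trailing "..." makes each one recursive.
vix.input.1.path = s3n://mybucket/folder1/...
vix.input.2.path = s3n://mybucket/folder2/...
vix.input.3.path = s3n://mybucket/folder3/...
```

A search against index=my_vix would then read from all three folders.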


Re: Can I tell Hunk to combine 3 different folders in an S3 bucket into a single Virtual Index?

Explorer

I have tried this and it is not working. It's only returning results from the west-2 bucket.

[elb]
vix.input.1.path = s3n://bucket-us-west2-elb-logs/...
vix.fs.s3n.endpoint = s3-us-west-2.amazonaws.com
vix.fs.s3n.2.endpoint = s3-external-1.amazonaws.com
vix.input.2.path = s3n://bucket-us-east1-elb-logs/...


Re: Can I tell Hunk to combine 3 different folders in an S3 bucket into a single Virtual Index?

Splunk Employee

Since you're trying to access buckets in different regions, can you try the following?

At the provider level, set the generic S3 endpoint, then point the virtual index at that provider:
[provider:abc]
...
vix.fs.s3n.endpoint = s3.amazonaws.com

[elb]
vix.provider = abc
vix.input.1.path = s3n://bucket-us-west2-elb-logs/...
vix.input.2.path = s3n://bucket-us-east1-elb-logs/...
...

Note that accessing buckets across regions has performance and cost implications.


Re: Can I tell Hunk to combine 3 different folders in an S3 bucket into a single Virtual Index?

Explorer

Thanks for the response.
I tried that, and it still only returns data from one region.
My use case is pulling ELB access logs from each region and visualizing them. I have also tried creating a separate virtual index and pointing the provider to elb. It's still not working.


Re: Can I tell Hunk to combine 3 different folders in an S3 bucket into a single Virtual Index?

Explorer

Also, the S3 endpoints for AWS differ by region:

US Standard (us-east-1): s3.amazonaws.com (N. Virginia or Pacific Northwest) or s3-external-1.amazonaws.com (N. Virginia only)

US West (Oregon) (us-west-2): s3-us-west-2.amazonaws.com


Re: Can I tell Hunk to combine 3 different folders in an S3 bucket into a single Virtual Index?

Splunk Employee

What happens if the vix points only to the east bucket? Also, are you able to access the east bucket using the Hadoop CLI, for example with hadoop fs -ls s3n://bucket-us-east1-elb-logs/ ?


Re: Can I tell Hunk to combine 3 different folders in an S3 bucket into a single Virtual Index?

Explorer

Yes, if the vix points only to the east bucket, it works. Only one bucket works at a time.


Re: Can I tell Hunk to combine 3 different folders in an S3 bucket into a single Virtual Index?

Splunk Employee

A couple of questions:
Are you getting any errors in the UI or search.log?
Also, what search are you trying to run?
What version of Hunk are you using? Are you using EMR + Hunk hourly?

(I'm a bit puzzled, as I tried this with vix.input.1.path and vix.input.2.path pointing to buckets in different regions and the searches worked OK.)


Re: Can I tell Hunk to combine 3 different folders in an S3 bucket into a single Virtual Index?

Splunk Employee

Have you tried:
Provider 1 -> VIX 1
Provider 2 -> VIX 2
Provider 3 -> VIX 3
Then in the search you use index=VIX*
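
A rough sketch of that layout, assuming one provider per region, each with its own S3 endpoint. The provider and index names are made up for illustration, the endpoints and bucket names are the ones quoted earlier in this thread, and the remaining provider settings are left out. Whether this resolves the cross-region problem was not confirmed in the thread.

```ini
# Sketch only: provider and index names are hypothetical.
[provider:west]
# ... other provider settings omitted ...
vix.fs.s3n.endpoint = s3-us-west-2.amazonaws.com

[provider:east]
# ... other provider settings omitted ...
vix.fs.s3n.endpoint = s3-external-1.amazonaws.com

# One virtual index per provider, each reading a single regional bucket.
[vix_elb_west]
vix.provider = west
vix.input.1.path = s3n://bucket-us-west2-elb-logs/...

[vix_elb_east]
vix.provider = east
vix.input.1.path = s3n://bucket-us-east1-elb-logs/...
```

With names like these, a search over index=vix_elb_* would cover both virtual indexes, and a third region would follow the same pattern.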
