Can I tell Hunk to combine 3 different folders in an S3 bucket into a single Virtual Index?

NickCorbettAt
Explorer

Hi

I am using Hunk on Amazon EMR. My source data is in 3 different folders in an S3 bucket. Can I tell Hunk to combine these 3 sources into a single Virtual Index? If not, what's the best way around this?

Thanks

Nick

1 Solution

elin
Splunk Employee

Yes, a single virtual index can point to multiple input folders in an S3 bucket. You can do this in the UI by adding one extra setting per additional folder. For example:

vix.input.2.path = s3n://mybucket/folder2/...
vix.input.3.path = s3n://mybucket/folder3/...
...

vix.input.[unique-number].path = [path-to-s3-folder]/...

The "..." at the end of each path tells Hunk to look recursively into that folder.
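
Put together, a complete virtual index stanza in indexes.conf might look like the sketch below. The stanza name my_vix, the provider name my-emr-provider, and folder1 are placeholders assumed for illustration; only the vix.input.N.path pattern comes from the answer above.

```ini
# indexes.conf: sketch only; stanza and provider names are hypothetical
[my_vix]
# Assumed: a provider stanza named my-emr-provider is already defined
vix.provider = my-emr-provider
# One numbered input per S3 folder; the trailing /... means "recurse"
vix.input.1.path = s3n://mybucket/folder1/...
vix.input.2.path = s3n://mybucket/folder2/...
vix.input.3.path = s3n://mybucket/folder3/...
```

Each input needs its own unique number; searching index=my_vix would then be expected to cover all three folders.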

rdagan_splunk
Splunk Employee

Have you tried:
Provider 1 -> VIX 1
Provider 2 -> VIX 2
Provider 3 -> VIX 3
Then in the search, use index=VIX*
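
A rough sketch of that layout in indexes.conf, assuming hypothetical stanza names (provider1, vix1, and so on) and placeholder folder paths. The provider stanzas still need their usual Hadoop settings, which are left out here:

```ini
# indexes.conf: sketch only; all names below are hypothetical
[provider:provider1]
# Hadoop/EMR provider settings omitted

[vix1]
vix.provider = provider1
vix.input.1.path = s3n://mybucket/folder1/...

# Repeat the same pattern for provider2/vix2 and provider3/vix3,
# pointing each virtual index at its own folder.
```

With the indexes named this way, a wildcard search such as index=vix* should pick up all three virtual indexes at once.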

tpanicker
Explorer

Search works fine. I am using the Hunk App for AWS ELB and it's not working.

tpanicker
Explorer

I have tried this and it is not working. It's only returning results from the west-2 bucket.

[elb]
vix.input.1.path = s3n://bucket-us-west2-elb-logs/...
vix.fs.s3n.endpoint = s3-us-west-2.amazonaws.com
vix.fs.s3n.2.endpoint = s3-external-1.amazonaws.com
vix.input.2.path = s3n://bucket-us-east1-elb-logs/...

Ledion_Bitincka
Splunk Employee

Since you're trying to access buckets from different regions, can you try the following:

At the provider level, set:
[provider:abc]
...
vix.fs.s3n.endpoint = s3.amazonaws.com

[elb]
vix.provider = abc
vix.input.1.path = s3n://bucket-us-west2-elb-logs/...
vix.input.2.path = s3n://bucket-us-east1-elb-logs/...
...

Note that accessing buckets across regions has performance and cost implications.

tpanicker
Explorer

Thanks for the response.
I tried that, but it still only returns data from one region.
My use case is pulling ELB access logs from each region and visualizing them. I have also tried creating a separate virtual index and pointing the provider to elb. It's still not working.

Ledion_Bitincka
Splunk Employee

What happens if the vix points only to the east bucket? Also, are you able to access the east bucket using the Hadoop CLI, e.g. hadoop fs -ls s3n://bucket-us-east1-elb-logs/ ?

tpanicker
Explorer

Yes, if the vix points only to east, it works. Only one bucket works at a time.

Ledion_Bitincka
Splunk Employee

A couple of questions:
Are you getting any errors in the UI or search.log?
Also, what search are you trying to run?
What version of Hunk are you using? Are you using EMR + Hunk hourly?

(I'm a bit puzzled as I tried this with vix.input.1/2.path pointing to buckets in different regions and searches worked ok)

tpanicker
Explorer

Also, the S3 endpoints for AWS differ by region:

US Standard (us-east-1): s3.amazonaws.com (N. Virginia or Pacific Northwest) or s3-external-1.amazonaws.com (N. Virginia only)
US West (Oregon) (us-west-2): s3-us-west-2.amazonaws.com
