Deployment Architecture

Can I tell Hunk to combine 3 different folders in an S3 bucket into a single Virtual Index?

NickCorbettAt
Explorer

Hi

I am using Hunk on Amazon EMR. My source data is in 3 different folders in an S3 bucket. Can I tell Hunk to combine these 3 sources into a single Virtual Index? If not, what's the best way around this issue?

Thanks

Nick

1 Solution

elin
Splunk Employee

Yes, a single virtual index can point to multiple input folders in an S3 bucket. You can do this in the UI by adding one setting for each additional folder. For example:

vix.input.2.path = s3n://mybucket/folder2/...
vix.input.3.path = s3n://mybucket/folder3/...
...

vix.input.[unique-number].path = [path-to-s3-folder]/...

The "..." at the end of each path tells Hunk to look recursively into that folder.
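
As a rough sketch of how the full stanza might look in indexes.conf, assuming a virtual index named my_vix, a provider named my_provider, and a first folder called folder1 (all three names are placeholders, not taken from the question):

```
# Sketch only: my_vix, my_provider and folder1 are hypothetical names.
[my_vix]
vix.provider = my_provider
# One numbered input per folder; the trailing "..." makes each path recursive.
vix.input.1.path = s3n://mybucket/folder1/...
vix.input.2.path = s3n://mybucket/folder2/...
vix.input.3.path = s3n://mybucket/folder3/...
```

The only requirement stated above is that each vix.input.[unique-number] key uses a different number.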

rdagan_splunk
Splunk Employee

Have you tried:
Provider 1 -> VIX 1
Provider 2 -> VIX 2
Provider 3 -> VIX 3
Then, in the search, use index=VIX*
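
A sketch of that layout in indexes.conf, assuming hypothetical names provider1 to provider3 and vix1 to vix3 and placeholder folder paths; the provider stanzas themselves are left out because their settings depend on the cluster:

```
# Sketch only: all stanza names, provider names and paths are hypothetical.
[vix1]
vix.provider = provider1
vix.input.1.path = s3n://mybucket/folder1/...

[vix2]
vix.provider = provider2
vix.input.1.path = s3n://mybucket/folder2/...

[vix3]
vix.provider = provider3
vix.input.1.path = s3n://mybucket/folder3/...
```

With a shared name prefix like this, a search on index=vix* should match all three virtual indexes through the wildcard.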

tpanicker
Explorer

Search works fine. I am using Hunk APP for AWS ELB, and it's not working.

tpanicker
Explorer

I have tried this and it is not working. It only returns results from the us-west-2 bucket:

[elb]
vix.input.1.path = s3n://bucket-us-west2-elb-logs/...
vix.fs.s3n.endpoint = s3-us-west-2.amazonaws.com
vix.fs.s3n.2.endpoint = s3-external-1.amazonaws.com
vix.input.2.path = s3n://bucket-us-east1-elb-logs/...

Ledion_Bitincka
Splunk Employee

Since you're trying to access buckets from different regions, can you try the following?

At the provider level, set:
[provider:abc]
...
vix.fs.s3n.endpoint = s3.amazonaws.com

[elb]
vix.provider = abc
vix.input.1.path = s3n://bucket-us-west2-elb-logs/...
vix.input.2.path = s3n://bucket-us-east1-elb-logs/...
...

Note that accessing buckets across regions has performance and cost implications.

tpanicker
Explorer

Thanks for the response.
I tried that, but it still only returns data from one region.
My use case is pulling ELB access logs from each region and visualizing them. I have also tried creating a separate virtual index and pointing the provider to elb. It's still not working.

Ledion_Bitincka
Splunk Employee

What happens if the vix points only to the east bucket? Also, are you able to access the east bucket using the Hadoop CLI: hadoop fs -ls s3n://bucket-us-east1-elb-logs/ ?

tpanicker
Explorer

Yes, if the vix points only to the east bucket, it works. Only one bucket works at a time.

Ledion_Bitincka
Splunk Employee

A couple of questions:
Are you getting any errors in the UI or search.log?
Also, what search are you trying to run?
What version of Hunk are you using? Are you using EMR + Hunk hourly?

(I'm a bit puzzled as I tried this with vix.input.1/2.path pointing to buckets in different regions and searches worked ok)

tpanicker
Explorer

Also, the S3 endpoints for AWS differ by region:

US Standard (us-east-1): s3.amazonaws.com (N. Virginia or Pacific Northwest) or s3-external-1.amazonaws.com (N. Virginia only)

US West (Oregon) (us-west-2): s3-us-west-2.amazonaws.com
