Deployment Architecture

opinion needed for forwarder management

s0rbeto
Explorer

we just deployed Splunk into our enterprise environment. We have 3000 clients, all with a UF installed plus the standard built-in apps "Splunk_TA_windows" and "Splunk_TA_Linux".
Now we are pushing logs/data from tier 1 (mission-critical) applications, about 4 million log events every day, and we have a 1 TB/day license.
Our current challenge is to differentiate our data based on application and other information.
Currently our data is indexed with "index", "host", and "sourcetype", but we realized we need to be more specific, which leads to two approaches.

1) add more fields, but then we need more license:
160 bytes * 40 billion events = 6.4 TB

2) utilize clientName, but then we need to push a script to each machine to edit deploymentclient.conf and set the "clientName" field. Then we need to build a lookup table ... At this point we don't know if clientName is even searchable in the "Search & Reporting" app.
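For reference, the clientName approach amounts to a one-line change per machine; a minimal sketch (the naming scheme and value below are hypothetical):

```
# $SPLUNK_HOME/etc/system/local/deploymentclient.conf on each forwarder
[deployment-client]
# hypothetical scheme: <application>_<environment>_<location>
clientName = billing_prod_dc1
```

Note that clientName is used by the deployment server to target server classes; it is not attached to indexed events, so it would still need a lookup to be usable at search time.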

What do you guys think? Please feel free to drop any comments or suggestions.

Thank you guys!

1 Solution

MuS
SplunkTrust
SplunkTrust

Hi s0rbeto,

if you only add those fields as additional index-time fields (not recommended, btw: http://docs.splunk.com/Documentation/Splunk/6.3.0/Data/Configureindex-timefieldextraction ) or as search-time fields, and not into the source log files themselves, it will not need more license, because the license is based on the amount of _raw data passed into Splunk.
So, if you want to add some field extractions based on the existing log source, take a look at the docs http://docs.splunk.com/Documentation/Splunk/6.3.0/Knowledge/Createandmaintainsearch-timefieldextract... to learn more about search-time field extraction.
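As an illustration, a search-time extraction can be a one-line props.conf stanza on the search head; the sourcetype, field names, and log pattern below are hypothetical:

```
# $SPLUNK_HOME/etc/system/local/props.conf (search head)
[my_app:log]
# extract app_env and app_id from _raw at search time;
# nothing is re-indexed, so license usage is unchanged
EXTRACT-app_fields = env=(?<app_env>\w+)\s+appid=(?<app_id>\d+)
```

The same pattern can be tested ad hoc with the rex command, e.g. `index=main sourcetype=my_app:log | rex "env=(?<app_env>\w+)"`.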

After reading and learning about field extractions I don't think you need the second option....

Hope this helps ...

cheers, MuS


woodcock
Esteemed Legend

OK, I am doing exactly this for a client using a nightly extract from their CMDB. This DB already contains fields like status, environment, etc. We just schedule a dbquery and save the results to a lookup file, and whenever we need to, we use that lookup to augment our dataset.
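A sketch of that nightly job, assuming Splunk DB Connect (whose search command is dbxquery in recent versions); the connection name, SQL, and lookup filename are hypothetical, and outputlookup is the command that writes the result into a lookup file:

```
| dbxquery connection=cmdb query="SELECT hostname, status, environment FROM assets"
| outputlookup cmdb_assets.csv
```

Saved as a scheduled search, this refreshes the lookup nightly; searches can then enrich events with something like `... | lookup cmdb_assets.csv hostname AS host OUTPUT environment status`.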


woodcock
Esteemed Legend

Why would you need more fields to create differentiation? Can you not use tags and eventtypes for this, combined with your site-specific knowledge of what each host "is"? Do you not have a CMDB that you can query to create a lookup to help you differentiate hosts? What kind of information are you thinking that you need to add to each event?
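Tags and eventtypes are plain search-time knowledge objects, so they also cost no license; a minimal sketch (stanza names and search are hypothetical):

```
# eventtypes.conf
[prod_web_errors]
search = index=web host=prodweb* log_level=ERROR

# tags.conf
[eventtype=prod_web_errors]
production = enabled
critical = enabled
```

Events matching the eventtype can then be found with `tag=production` or `eventtype=prod_web_errors` in any search.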


s0rbeto
Explorer

We have tags and eventtypes, but at some point we need to extract specific data in an easier way; that is why we need additional fields.
Yes, we do have a CMDB. Are you talking about the second option? I didn't know a CMDB could integrate with Splunk.
We are adding additional fields like vlanid, ip address, datacenter_location, etc.


woodcock
Esteemed Legend

You would do well to back up and FIRST explain your problem as clearly and completely as you can (without confusing the issue by discussing any kind of a solution). The problem is that you are too deep into your preferred solution for anybody else to understand what the real problem is. What is the real problem? What is it that is "no longer easy"? What is it that you really need (and don't say "more fields")? You need some way to do exactly what?


s0rbeto
Explorer

We have 3000 clients and all forwarders are pushing data to Splunk; currently events are indexed with "index", "sourcetype", "host" and nothing more. What we would like is to look up that data by additional fields, e.g. "environment: Production/nonprod", "vlanid", "location", "applications".
The challenge is the license, and we aren't sure if adding more fields like those mentioned above would increase the amount of data pushed into Splunk.
We are open to all solutions; we haven't implemented anything yet.
What do you think?
Thanks!
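Since fields like environment, vlanid, and location are attributes of the host rather than of each event, one license-free pattern is a host-keyed lookup applied automatically at search time; the lookup name, CSV columns, and sourcetype below are hypothetical:

```
# transforms.conf (search head)
[host_enrichment]
# CSV columns: host, environment, vlanid, location, application
filename = host_enrichment.csv

# props.conf (search head) -- repeat per sourcetype, or use a host:: stanza
[my_app:log]
LOOKUP-host_enrichment = host_enrichment host OUTPUT environment vlanid location application
```

The extra fields appear on every matching event at search time while _raw stays untouched, so license usage does not change.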



s0rbeto
Explorer

Thank you MuS,
yeah, we were thinking that too, but we would need to bring in more CPU power to do the indexing.
Thanks for the info!
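For completeness, this is roughly what an index-time field looks like (the approach the extra indexer CPU/disk would pay for, and which the docs linked above discourage); stanza, regex, and field names are hypothetical:

```
# transforms.conf (indexer) -- index-time extraction, generally discouraged
[extract_appid]
REGEX = appid=(\d+)
FORMAT = app_id::$1
WRITE_META = true

# props.conf (indexer)
[my_app:log]
TRANSFORMS-appid = extract_appid

# fields.conf (search head)
[app_id]
INDEXED = true
```

Unlike search-time extractions, this work happens once per event during parsing, which is where the extra indexing CPU cost comes from.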
