Is it possible to narrow searches by grouping devices in "nesting" groups?

oliverj
Communicator

Hello.

I am investigating Splunk and am trying to accomplish a task I was hoping would be simple:
I have a "group"; let's call it Location1.
Inside that group, I want to create a subgroup for "Systems owned by Comm".
Inside that subgroup, I want further subgroups for "Switches", "Unix", "Routers", etc.

So, it seems that I can create 3 tags:
Location1
Location1_Comm
Location1_Comm_Unix

And I can add all 3 tags to my RedHat Comm machine.

This way, when a security person wants to check for failed logins on a specific unix group (in this case, everyone in Comm at Location1), the security person can search: "tag=Location1_Comm_Unix" AND "eventtype=failed_login"
But, if the security person wanted to step up a level and search all devices at Location1, he could just search for: "tag=Location1" AND "eventtype=failed_login"
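
For reference, a minimal sketch of how those three tags might be declared in tags.conf. The host name redhat-comm-01 and the app path are hypothetical; substitute the value of the host field as Splunk actually indexes it.

```
# $SPLUNK_HOME/etc/apps/myapp/local/tags.conf (path is an assumption)
# redhat-comm-01 is a hypothetical host name
[host=redhat-comm-01]
Location1 = enabled
Location1_Comm = enabled
Location1_Comm_Unix = enabled
```

Every new host needs all three lines, which is the upkeep that nesting would avoid.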

This will accomplish what I want, but a previous tool I used allowed for "Nesting" groups.
Location1_Comm_Unix is a member of
Location1_Comm is a member of
Location1

That way, if I add a RedHat box to my collection, I give it the tag "Location1_Comm_Unix" and it automatically shows up in "tag=Location1" and "tag=Location1_Comm" as well.

Is this possible in Splunk?

1 Solution

woodcock
Esteemed Legend

I don't think you can do this with tags but you definitely can with eventtypes like this:

$SPLUNK_HOME/etc/apps/myapp/eventtypes.conf:

[Location1_Comm_Unix]
search = index=MyIndex host=L1CU1 OR host=L1CU2 OR host=L1CU3 OR host=*Loc1*Comm_Unix*
[Location1_Comm_Windows]
search = index=MyIndex host=L1CW1 OR host=L1CW2 OR host=L1CW3 OR host=*Loc1*Comm_Windows*
[Location2_Comm_Unix]
search = index=MyIndex host=L2CU1 OR host=L2CU2 OR host=L2CU3 OR host=*Loc2*Comm_Unix*
[Location2_Comm_Windows]
search = index=MyIndex host=L2CW1 OR host=L2CW2 OR host=L2CW3 OR host=*Loc2*Comm_Windows*

Then you can do searches like this:

eventtype=*Comm*
eventtype=Location1*
eventtype=*Unix


oliverj
Communicator

How can a single host have multiple hostnames?
I have a Cisco router that "Company B" named "Batman". I cannot remove that name, because it is relevant to "Company B" security searches.

Your example above seems to imply that I can say that Cisco router can be host=Batman AND host=Location1_CompanyB_Routers


woodcock
Esteemed Legend

You are thinking too tag-like. My example shows that ALL events that have either host=Batman OR host=Location1_CompanyB_Routers will get the eventtype value. Any event that matches the search criteria gets the eventtype value, and an event can have more than one eventtype value (eventtype is a multivalue field). That is why it works for your goal. It has the additional benefit that you can use wildcards in eventtype definitions, which you cannot do with tags, so the initial setup is easier and less ongoing maintenance is needed.
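
One way to check this, sketched under the assumption that the index and host names from the examples above exist in your environment, is to list which eventtypes the Batman events picked up:

```
index=MyIndex host=Batman
| stats count by eventtype
```

An event that matches several eventtype definitions is counted once under each of them.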


oliverj
Communicator

But how do you apply the host=Location1_CompanyB_Routers definition?
How does Splunk know that the Batman router is also one of the Location1_CompanyB_Routers?

I thought host= was auto-generated (though it can be overridden) from the hostname data extracted from the log?


woodcock
Esteemed Legend

Just try it. In the search bar, create a search that captures all events that should have the same eventtype. Use as many wildcards as you can (e.g. host=*CompanyB*); that way you will have less upkeep as the values in your dataset expand. In your example, like this:

[Location1_CompanyB_Routers]
search = index=MyIndex host=*CompanyB* OR host=Batman

Then you do a search like this:

eventtype=Location1_CompanyB_Routers

And this will show the same events.


oliverj
Communicator

I get the event type.
But, if I search for host=CompanyB*, it will not find anything, as that host does not exist.
How do I create that host= reference?


woodcock
Esteemed Legend

After you create all the eventtypes, you search like this:

eventtype=*CompanyB*

But first, the eventtype definitions come from YOU! You need a canonical list of the things that constitute "CompanyB", and YOU have to create a search that captures those events and save it as an eventtype. I don't know why this is so hard, except that you are thinking too much in terms of tags. Read the docs and I am sure it will be clear to you:

http://docs.splunk.com/Documentation/Splunk/6.2.4/admin/Eventtypesconf


woodcock
Esteemed Legend

Going back to a tags-based mindset: take every tag you would have created for CompanyB and put them all together in one eventtype definition like this:

[CompanyB]
search = host=a OR host=b OR host=c OR sourcetype=from_xyz OR ....

Then keep it up-to-date as things change and search like this:

eventtype="CompanyB"

But you would not do it exactly this way, because you are creating hierarchies. You would only create eventtypes for the deepest (three-level) hierarchy, such as:

[CompanyB_Comm_Unix]

And you would search based on the piece of the hierarchy that matters:
For "Location", like this:

eventtype="SomeLocation_*"

For "group", like this:

eventtype="*_Somegroup_*"

For "OS", like this:

eventtype="*_someOS"

oliverj
Communicator

This concept I am following along quite well.
I think it will be my best approach for now.


acharlieh
Influencer

I'm curious why you'd create individual tags with nested levels like that, as opposed to separate tags for the different axes of data (e.g. "Location_1", "Owner_Comm", "Type_Unix"). Location, Owner, and Type don't seem like hierarchical concepts to me, so I'm struggling to see why you'd want to coerce them into a hierarchy.

With separate tags, you can still have the interesting searches that you have, as well as any number of intersecting searches (even ones that you might not have built in your hierarchy):
* All Failed logins in location 1: tag=Location_1 eventtype=failed_login
* All Failed logins on all Unix devices: tag=Type_Unix eventtype=failed_login
* All Failed logins for the comm group: tag=Owner_Comm eventtype=failed_login
* All Failed logins for the comm unix group in location 1: tag=Owner_Comm tag=Type_Unix tag=Location_1 eventtype=failed_login

Furthermore, separate tags let you add other axes of data easily in the future (think: now you have to label which systems are in scope for PCI).
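
A minimal tags.conf sketch of this layout, assuming the host name L1CU1 from the eventtype example earlier in the thread; Scope_PCI is a hypothetical tag name for the PCI axis:

```
# tags.conf: one tag per axis instead of one tag per hierarchy level
[host=L1CU1]
Location_1 = enabled
Owner_Comm = enabled
Type_Unix = enabled
# hypothetical extra axis, added later without touching the tags above
Scope_PCI = enabled
```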


oliverj
Communicator

Sample scenario:
An audit is being performed on "Company A" at "Location 3". (We support audit logs for Companies A, B, C, and D at multiple locations.)

The auditor is specifically looking for user "x" from "Company A" logging into a Unix cluster.

We know it happened at "Location 3".
We know it was equipment managed by "Company A".
We know the devices in question were unix-based.

This helps our auditor narrow down the search quickly. (From there he can expand back out to just "Company A at Location 3".)

Your last example:
All Failed logins for the comm unix group in location 1: tag=Owner_Comm tag=Type_Unix tag=Location_1 eventtype=failed_login
also seems to be an excellent way of doing this.
The user-friendliness of the searches is hugely important, but I can generate pre-canned ones for them to work from.
Nesting would be easier, but separating the tags might actually be simpler to manage in the long run. Thank you for the input.


acharlieh
Influencer

Glad to help! Is there often a need to audit across companies, or are most searches constrained to specific companies? Is there a need to adjust log retention by type of device at a particular company? Do you have Splunk users who should only be allowed to search Company A but not Company B?

Something else to consider is having separate indexes for some of these properties (maybe per company, per company and device type, or per company and location). Indexes are the level at which you can adjust retention (by size and/or age), and also the level at which you grant roles access to data. So if user 1 should be allowed to search Company A but not Company B, you make user 1 a member of only the role that has access to Company A's indexes. (Users can have multiple roles, and roles can inherit permissions from other roles.) If most of your searches focus on a specific company, per-company indexes let you cut out all the other companies' data without even opening their index files (indexes are physically directories on disk). As your volume scales, this could speed up your queries, again depending on the access pattern.
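
A rough sketch of what that could look like: one index per company plus a role limited to it. The index name company_a, the role name, and the retention values are all assumptions for illustration:

```
# indexes.conf (index name and limits are hypothetical)
[company_a]
homePath   = $SPLUNK_DB/company_a/db
coldPath   = $SPLUNK_DB/company_a/colddb
thawedPath = $SPLUNK_DB/company_a/thaweddb
# retention by age: roll to frozen after about one year
frozenTimePeriodInSecs = 31536000
# retention by size: cap the whole index at about 100 GB
maxTotalDataSizeMB = 102400

# authorize.conf (role name is hypothetical)
[role_company_a_auditor]
search = enabled
srchIndexesAllowed = company_a
srchIndexesDefault = company_a
```

Note that a role which imports another role also inherits that role's allowed indexes, so a restricted role like this should not import one with broader index access.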


oliverj
Communicator

Our auditor runs several reports, and many of them have two parts: one broken down by function (networking), and then a secondary report per company.
For example: how hard is the overall network being hit during this time period, and which group is responsible for the majority of the traffic rejects?

And thank you for the comment on roles. It looks like if I properly tag the different groups, I can assign a role "CompanyZ" that can only access devices with a "CompanyZ" tag.
I will also look at indexes. So far I have been putting everything into one, but this might be a good thing to separate.
