Splunk Search

Help needed with foreach query for LDAP user information

D3mby
Explorer

Hey All,

Been banging my head for a few days with this one and will appreciate any feedback on the topic.
The scenario is the following:
- LDAP data is polled via ldapsearch and written to a csv lookup table - ldap_users
- The use case is to return all subordinates under a specific person in the company

Here's an example of the end goal:

Consider the following 5 lines in the csv lookup

cn               displayName              mail                                                  dn                                                                manager
dm123     Dean, Martin              dean.martin@test.eu        CN=dm123,OU=T,OU=E,OU=S                    -
ep123      Elvis, Presley              elvis.presley@test.eu        CN=ep123,OU=T,OU=E,OU=S                  dm123
mj123      Michael, Jordan        michael.jordan@test.eu   CN=mj123,OU=T,OU=E,OU=S                  ep123
bc123      Bill, Clinton                  bill.clinton@test.eu            CN=bc123,OU=T,OU=E,OU=S                  mj123
ba123      Buzz, Aldrin                buzz.aldrin@test.eu           CN=ba123,OU=T,OU=E,OU=S                  mj123

 

In this case, dm123 is the CEO of the company and is the direct manager of ep123. ep123 manages 1 person mj123, who then manages two people - bc123 and ba123.

What is needed - if I run a query for the user dm123 to receive everybody under them in the company structure, so in this case, all other 4 users.

So far I'm swinging at using foreach to iterate over the ldap tree, but I can't figure out how to do multiple iterations.
Here's my search so far:


| inputlookup ldap_users WHERE ( ( manager="CN=dm123,OU=T,OU=E,OU=S" ) )
| search cn!="x*" cn!="y*" cn!="z*"
| table cn, displayName, mail, dn
| rex field=dn "CN\=(?P<dn>[^\,]+)\,"

| foreach dn
[| lookup ldap_users manager as dn OUTPUTNEW mail as mail_employee]
| mvexpand mail_employee
| search mail_employee!="-"

 

In manager= in the above, you would enter the person from who the iteration should begin. In this case, the CEO. The output returned is the cn, displayName, mail, dn for all their direct subordinates - ep123.
I then take the dn for them (ep123) and do a foreach, looking for any users who have a manager who's equal to the dn - this would be mj123.  This provides me with the following output:

cn                displayName                       mail                                                  dn                                                mail_employee
ep123      Elvis, Presley       elvis.presley@test.eu    CN=ep123,OU=T,OU=E,OU=S   michael.jordan@test.eu 

 

This is good, but now I have two roadblocks:

- How to use mj123 (any piece of information-mail/dn/cn, doesn't matter which we chose for the example) to iterate once more and return bc123 and ba123.

 

Any help will be extremely appreciated!

 

 

Labels (1)
0 Karma

scelikok
SplunkTrust
SplunkTrust

Hi @D3mby,

Based on @yuanliu solution, I followed another way. Below works with sample data. Could you please give it a try? You should adapt lookup lines and temp m_* variables according your count of management levels. 

| inputlookup ldap_users.csv 
| eventstats values(cn) as underlings by manager 
| rename manager as m_3 
| lookup ldap_users.csv cn as m_3 OUTPUT manager as m_2 
| lookup ldap_users.csv cn as m_2 OUTPUT manager as m_1 
| lookup ldap_users.csv cn as m_1 OUTPUT manager as m_0 
| foreach m_* 
    [ eval mng='<<FIELD>>' 
    | eval managers=mvappend(managers,mng) ] 
| fields - m_* - mng - underlings 
| stats values(*) as * by managers 
| where managers!="-"
If this reply helps you an upvote and "Accept as Solution" is appreciated.
0 Karma

yuanliu
SplunkTrust
SplunkTrust

A few months ago, there was another question about org chart, but the user quickly deleted the question.  Although the data format is different, you can use the same technique.

First, let's construct the org chart.  Assume this Dean Martin empire has no more than six levels of management.

 

| inputlookup ldap_users
| eval flatdata = cn . ":" . if(manager == "-", "", manager)
| eventstats values(flatdata) AS flatdata
| foreach 1 2 3 4 5 6 ``` assuming no more than 6 levels ```
    [| eval manager = mvappend(mvindex(split(mvindex(flatdata, mvfind(flatdata, "^" . mvindex(manager, 0) . ":")), ":"), 1), manager)]

 

What this gives you is a manager field that contains all upperlings all the way up to Dean Martin.

cndisplayNamedn
flatdata
mailmanager
dm123Dean, MartinCN=dm123,OU=T,OU=E,OU=S
ba123:mj123
bc123:mj123
dm123:
ep123:dm123
mj123:ep123
dean.martin@test.eu-
ep123Elvis, PresleyCN=ep123,OU=T,OU=E,OU=S
ba123:mj123
bc123:mj123
dm123:
ep123:dm123
mj123:ep123
elvis.presley@test.eu
 
dm123
mj123Michael, JordanCN=mj123,OU=T,OU=E,OU=S
ba123:mj123
bc123:mj123
dm123:
ep123:dm123
mj123:ep123
michael.jordan@test.eu
 
dm123
ep123
bc123Bill, ClintonCN=bc123,OU=T,OU=E,OU=S
ba123:mj123
bc123:mj123
dm123:
ep123:dm123
mj123:ep123
bill.clinton@test.eu
 
dm123
ep123
mj123
ba123Buzz, AldrinCN=ba123,OU=T,OU=E,OU=S
ba123:mj123
bc123:mj123
dm123:
ep123:dm123
mj123:ep123
buzz.aldrin@test.eu
 
dm123
ep123
mj123

With this table, you can easily search up Elvis and all his underlings, e.g., using search command.

 

| search cn = ep123 OR manager = ep123 ``` gimme Elvis and his underlings ```

 

cndisplayNamedn
flatdata
mail
manager
ep123Elvis, PresleyCN=ep123,OU=T,OU=E,OU=S
ba123:mj123
bc123:mj123
dm123:
ep123:dm123
mj123:ep123
elvis.presley@test.eu
 
dm123
mj123Michael, JordanCN=mj123,OU=T,OU=E,OU=S
ba123:mj123
bc123:mj123
dm123:
ep123:dm123
mj123:ep123
michael.jordan@test.eu
 
dm123
ep123
bc123Bill, ClintonCN=bc123,OU=T,OU=E,OU=S
ba123:mj123
bc123:mj123
dm123:
ep123:dm123
mj123:ep123
bill.clinton@test.eu
 
dm123
ep123
mj123
ba123Buzz, AldrinCN=ba123,OU=T,OU=E,OU=S
ba123:mj123
bc123:mj123
dm123:
ep123:dm123
mj123:ep123
buzz.aldrin@test.eu
 
dm123
ep123
mj123

If your goal is to lookup Elvis specifically, you can construct any display you like from the above.

If your goal is to lookup underlings of all managers, you first sum them up using stats:

 

| stats values(cn) as underlings by manager
| where manager != "-"

 

manager
underlings
dm123
ba123
bc123
ep123
mj123
ep123
ba123
bc123
mj123
mj123
ba123
bc123

Then apply your original lookup

 

| lookup ldap_users cn as underlings

 

Put all this together,

 

| inputlookup ldap_users
| eval flatdata = if(manager == "-", null(), cn . ":" . manager)
| eventstats values(flatdata) AS flatdata
| foreach 1 2 3 4 5 6
    [| eval manager = mvappend(mvindex(split(mvindex(flatdata, mvfind(flatdata, "^" . mvindex(manager, 0) . ":")), ":"), 1), manager)]
| stats values(cn) as underlings by manager
| where manager != "-"
| lookup ldap_users cn as underlings

 

 

The following is data emulation that you can play with and compare with your real data.

 

| makeresults
| fields - _time
| eval _raw = "
cn      displayName     mail                    dn                        manager                                 
dm123   Dean, Martin    dean.martin@test.eu     CN=dm123,OU=T,OU=E,OU=S   -
ep123   Elvis, Presley  elvis.presley@test.eu   CN=ep123,OU=T,OU=E,OU=S   dm123
mj123   Michael, Jordan michael.jordan@test.eu  CN=mj123,OU=T,OU=E,OU=S   ep123
bc123   Bill, Clinton   bill.clinton@test.eu    CN=bc123,OU=T,OU=E,OU=S   mj123
ba123   Buzz, Aldrin    buzz.aldrin@test.eu     CN=ba123,OU=T,OU=E,OU=S   mj123"
| multikv forceheader=1
| fields - _raw linecount
``` data emulation above for
| inputlookup ldap_users ```

 

D3mby
Explorer

Hey!
Thanks so much for this, it's really close to what I'm after 😁

The example works great with small datasets, but when you create the "flatdata" field, you basically output every combination of user+manager from all available events that you are fetching from the dataset - in my case, the csv.

D3mby_0-1680010786116.png


As you can see, I'm limiting the search to retrieve 10 results, meaning that each event's flatdata field will have 10 values(mv) in it, as it aggregates all user+manager combination from the other events as well.

This is also true when we use the dummy dataset:

D3mby_1-1680010892049.png



Now the csv that holds the user information is somewhere around 96K rows. 
The way I'm interpreting this, it will require each of these 96k rows to hold 96k entries about a user + manager combination in the flatdata field - no need to say, the search doesn't complete at all when I attempt this.


0 Karma

yuanliu
SplunkTrust
SplunkTrust

Yes, I have to admit that I have never run inputlookup on 96K lines.  Saving a 96K-value field with each row is probably not the best use of RAM.  However, since all Splunk gets is a file and not a database, the data will have to be held somewhere.  So, I don't see an easy way out.

Some thoughts.

  • Run an external query to LDAP itself and produce an underling array on-demand as @ITWhisperer suggested.  It will carry performance cost, of course.
  • Examine the use case more closely and make some choices.  See if it allows you to limit input.  For example, if there is some department code in LDAP and the org chart follows these codes.  Or if there is a employee level code and the level is automatically enforced with some known algorithm.
  • Run a query and store underlings chart in a CSV or KV store periodically.
  • If anything, examine limits.conf to lift some memory and timeout limits.  I do not quite know which knobs will help, however.  96K^2 = 9.2 billion.  That is a lot.

Speaking of limit inputs, you can restrict output of inputlookup to cn and manager because all other fields are irrelevant to this task.  But compared with the N^2 factor, this will not help the big problem.

D3mby
Explorer

My thoughts exactly, but your suggestion has helped me plenty.

Will experiment a bit further and post any positive results that I get. 

Tags (1)
0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

Splunk essentially processes each event one at a time. Some commands work on the multiple events from the pipeline before producing an output event, but still each command processes the pipeline once.

The foreach command operates on each event one at a time for each field mentioned in the field list.

In order to do what you want, you would have to process the event pipeline multiple times. How many times you depend on your data. There is no way to tell ahead of time how many iterations would be required.

One way of solving this might be to write a custom command that essentially takes all the events and processes them as a set. Good luck with that. 😀

0 Karma

D3mby
Explorer

Thanks, thought there might be a quick way around this that I'm missing 

0 Karma
Get Updates on the Splunk Community!

3 Ways to Make OpenTelemetry Even Better

My role as an Observability Specialist at Splunk provides me with the opportunity to work with customers of ...

What's New in Splunk Cloud Platform 9.2.2406?

Hi Splunky people! We are excited to share the newest updates in Splunk Cloud Platform 9.2.2406 with many ...

Enterprise Security Content Update (ESCU) | New Releases

In August, the Splunk Threat Research Team had 3 releases of new security content via the Enterprise Security ...