assets and identities lookups not merging into ide...

rphillips_splk · ‎06-10-2016

Why are my lookup files not being merged into identities_expanded.csv?

rphillips_splk · ‎06-10-2016

There are several things that could be causing this problem.

If you push the bundle containing the updated identities.csv file from the search head cluster (SHC) deployer, the bundle may be too large and hit the HTTP server's maximum content length of 800MB on the SHC members. Evidence of this appears on the SHC member you push to, in $SPLUNK_HOME/var/log/splunk/splunkd_access.log, as entries with status code 413. If that is the case, you can increase max_content_length on the SHC members to work around it:

$SPLUNK_HOME/etc/system/local/server.conf
[httpServer]
max_content_length = <int>

* Measured in bytes.
* HTTP requests over this size will be rejected.
* Exists to avoid allocating an unreasonable amount of memory from web requests.
* Defaults to 838860800 (800MB).
* In environments where indexers have enormous amounts of RAM, this number can be reasonably increased to handle large quantities of bundle data.
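
As a rough pre-check for this first cause, the Python sketch below compares a payload size against the 800MB default and picks out access-log lines that look like 413 responses. The function names and the regular expression are my own illustration, not anything shipped with Splunk. The regex assumes the status code directly follows the quoted request string, as in common access-log layouts, so adjust it to what your splunkd_access.log actually contains. The bundle path is whatever file you choose to measure.

```python
import os
import re

# Default max_content_length quoted above: 800MB expressed in bytes.
DEFAULT_MAX_CONTENT_LENGTH = 800 * 1024 * 1024  # 838860800

# Assumed layout (not verified against every Splunk version):
#   ... "METHOD /path HTTP/1.1" 413 ...
STATUS_413 = re.compile(r'"\s+413\s')


def exceeds_limit(size_bytes, limit=DEFAULT_MAX_CONTENT_LENGTH):
    """Return True if a payload of size_bytes is over the limit."""
    return size_bytes > limit


def bundle_too_large(bundle_path, limit=DEFAULT_MAX_CONTENT_LENGTH):
    """Check the on-disk size of a bundle file (path supplied by the caller)."""
    return exceeds_limit(os.path.getsize(bundle_path), limit)


def lines_with_413(lines):
    """Return the log lines whose status code looks like 413."""
    return [line for line in lines if STATUS_413.search(line)]
```

Feeding `lines_with_413` the lines of splunkd_access.log from the member you pushed to should surface any rejected requests, provided the layout assumption holds.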

If you push the bundle from the deployer with the -preserve-lookups flag set to true, the push does not update lookups on the members; each member keeps its local lookup files instead of taking what the deployer is pushing. This feature is somewhat limited because it does not let you specify which lookups to preserve.
For example:
./splunk apply shcluster-bundle -target https://linux01.sv.splunk.com:8089 -preserve-lookups true -auth admin:pwd

The lookup files may not have their inputs.conf or transforms.conf stanzas configured properly, so the identity_manager.py script does not pick them up.

example:

rob1_identities.csv is the file I want merged into identities_expanded.csv.

On the SHC deployer I configure:

/opt/splunk/etc/shcluster/apps/SA-IdentityManagement/local/inputs.conf
[identity_manager://rob1_identities]
disabled = 0
category = rob1_identities
description = Demonstration identity list.
target = identity
url = lookup://rob1_identities

/opt/splunk/etc/shcluster/apps/SA-IdentityManagement/local/transforms.conf
[rob1_identities]
filename = rob1_identities.csv
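
To catch a mismatch between the two stanzas before pushing, a small Python sketch like the one below can cross-check them. It treats the .conf text as plain INI via the standard configparser module, which is an approximation that works for simple stanzas like the ones above but is not a full Splunk .conf parser. The function names are hypothetical.

```python
import configparser


def parse_conf(text):
    """Parse Splunk-style .conf text as INI (adequate for simple stanzas)."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(text)
    return parser


def missing_lookup_definitions(inputs_text, transforms_text):
    """Return lookup names referenced by identity_manager inputs that have
    no transforms.conf stanza with a filename setting."""
    inputs = parse_conf(inputs_text)
    transforms = parse_conf(transforms_text)
    missing = []
    for stanza in inputs.sections():
        if not stanza.startswith("identity_manager://"):
            continue
        url = inputs.get(stanza, "url", fallback="")
        if not url.startswith("lookup://"):
            continue
        name = url[len("lookup://"):]
        # has_option returns False when the stanza itself is absent.
        if not transforms.has_option(name, "filename"):
            missing.append(name)
    return missing
```

Passing the contents of the two local files shown above should return an empty list; a non-empty list names the lookups that identity_manager inputs point at without a matching definition.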

Make an update to the file rob1_identities.csv on the deployer.
Next, push the bundle from the deployer to the SHC.
On the deployer, run this command (pushing to any one member in the SHC, e.g. https://linux01.sv.splunk.com:8089):
./splunk apply shcluster-bundle -target https://linux01.sv.splunk.com:8089 -auth admin:pwd

Confirm that each member has the updated file:
$SPLUNK_HOME/etc/apps/SA-IdentityManagement/lookups/rob1_identities.csv
The next time identity_manager.py runs (every 5 minutes), it will merge the change into identities_expanded.csv.
A manual file update on one of the SHC members will not work unless that member is the captain. Even then, only identities_expanded.csv would be replicated to the other members, not rob1_identities.csv, so this approach is not advised.
Confirm that your file has the correct header format for identities.csv.
The most common reason for failure is incorrect formatting or invalid data in the assets.csv or identities.csv lookup files used as the source.
The header must be included in the file and be in this format for identities.csv:
identity,prefix,nick,first,last,suffix,email,phone,phone2,managedBy,priority,bunit,category,watchlist,startDate,endDate,work_city,work_country,work_lat,work_long
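
A quick way to verify the header before the file reaches the members is a short Python check such as the sketch below. The expected column list is copied from the header line above; the function name and the shape of the returned report are my own choices. Columns reported as "unexpected" are not necessarily errors, since extra fields can be present, but names containing spaces are worth flagging.

```python
import csv
import io

# Header copied from the identities.csv format given above.
EXPECTED_HEADER = (
    "identity,prefix,nick,first,last,suffix,email,phone,phone2,managedBy,"
    "priority,bunit,category,watchlist,startDate,endDate,work_city,"
    "work_country,work_lat,work_long"
).split(",")


def check_identities_header(csv_text):
    """Compare the first row of csv_text with the expected header."""
    reader = csv.reader(io.StringIO(csv_text))
    header = [name.strip() for name in next(reader, [])]
    return {
        "missing": [c for c in EXPECTED_HEADER if c not in header],
        "unexpected": [c for c in header if c not in EXPECTED_HEADER],
        "with_spaces": [c for c in header if " " in c],
        "exact_match": header == EXPECTED_HEADER,
    }
```

Read the lookup file as text and pass it in; an empty "missing" list and a true "exact_match" mean the header line matches the format above.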
There are several ways to update the identities.csv file on the SHC members and trigger the merge of the file(s) into identities_expanded.csv:

a.) option 1: run a search with outputlookup to update the file:

ie:
... | table identity, prefix, nick, first, last, suffix, email, phone, phone2, managedBy, priority, bunit, category, watchlist, startDate, endDate, work_city, work_country, work_lat, work_long | outputlookup identities.csv

b.) option 2: drop the updated identities.csv file onto each member using rsync

c.) option 3: push the updated identities.csv from the SHC deployer down to an SHC member:
./splunk apply shcluster-bundle -target https://rplinux09.sv.splunk.com:8089 -auth admin:pwd

This approach may be less favorable if you have frequent updates to the .csv, as the deployer may trigger the search heads to restart for configuration changes that require it.

Any CSV file must use Unix line endings.
Look in $SPLUNK_HOME/var/log/splunk/python_modular_input.log on the SHC captain to see what ERROR entries are reported for your lookup file. They will provide insight into what the problem is.
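
For the Unix line endings requirement mentioned above, the following Python sketch detects and converts Windows (CRLF) or bare CR line endings. The function names are hypothetical, and the conversion is a plain byte replacement, so it would also rewrite a carriage return embedded inside a quoted field. Treat it as a starting point and keep a backup of the original file.

```python
def has_non_unix_line_endings(data: bytes) -> bool:
    """Return True if the data contains any carriage return byte."""
    return b"\r" in data


def to_unix_line_endings(data: bytes) -> bytes:
    """Convert CRLF and bare CR line endings to LF."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def normalize_file(path: str) -> bool:
    """Rewrite the file at path with LF endings; return True if it changed."""
    with open(path, "rb") as handle:
        original = handle.read()
    if not has_non_unix_line_endings(original):
        return False
    with open(path, "wb") as handle:
        handle.write(to_unix_line_endings(original))
    return True
```

Running `normalize_file` on the lookup on the deployer before the push keeps the converted file consistent across members.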

nnmiller · ‎08-04-2016

A couple of FYIs:

If you run the modular input command manually (as suggested in "a" above) for these merges to debug what is going on with them, the results will not wind up in the correct SA-IdentityManagement context. In my tests they wound up in the Searching & Reporting or EnterpriseSecurity contexts. We ended up manually moving the results to SA-IdentityManagement so as not to have to run the search again.
Modular input throws an error if there are spaces in any extra field names you pull in.
Errors like #2 may be masked by other problems, such as an inputs.conf stanza in one context and the transforms.conf stanza in another (which may happen if you use the WebUI).

assets and identities lookups not merging into identities_expanded.csv in Splunk SA-IdentityManagement for Splunk Enterprise Security - Search Head Clustering
