Why are my lookup files not being merged into identities_expanded.csv?
There are several things that could be causing this problem.
$SPLUNK_HOME/etc/system/local/server.conf
max_content_length =
* Measured in bytes
* HTTP requests over this size will be rejected.
* Exists to avoid allocating an unreasonable amount of memory from web
requests
* Defaulted to 838860800 or 800MB
* In environments where indexers have enormous amounts of RAM, this
number can be reasonably increased to handle large quantities of
bundle data.
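To get a rough idea of whether a bundle is anywhere near this limit, you can total the file sizes under the app directory being pushed. This is only a sketch: the on-disk total is a rough proxy for the size of the packaged bundle, and the function names below are hypothetical helpers, not part of Splunk.

```python
import os

# Default limit quoted in the server.conf excerpt above (800 MB).
DEFAULT_MAX_CONTENT_LENGTH = 838860800


def directory_size_bytes(root):
    """Return the total size in bytes of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path):
                total += os.path.getsize(path)
    return total


def exceeds_limit(size_bytes, limit=DEFAULT_MAX_CONTENT_LENGTH):
    """True when size_bytes is strictly larger than the limit."""
    return size_bytes > limit
```

For example, exceeds_limit(directory_size_bytes("/opt/splunk/etc/shcluster/apps")) gives a quick hint on the deployer; pass your own max_content_length value as limit if you have changed it.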
If you push the bundle from the deployer with the preserve-lookups flag set to true, the push will not update lookups on the members; each member keeps its local lookup files instead of taking what the deployer is pushing. This feature is somewhat limited because it is not granular: you cannot specify which lookups to preserve.
For example:
./splunk apply shcluster-bundle -target https://linux01.sv.splunk.com:8089 -preserve-lookups true -auth admin:pwd
Lookup files whose inputs.conf or transforms.conf stanzas are not configured properly will not be picked up by the identity_manager.py script.
Example:
rob1_identities.csv is my file I want merged into identities_expanded.csv
on the SHC deployer I configure:
/opt/splunk/etc/shcluster/apps/SA-IdentityManagement/local/inputs.conf
[identity_manager://rob1_identities]
disabled = 0
category = rob1_identities
description = Demonstration identity list.
target = identity
url = lookup://rob1_identities
/opt/splunk/etc/shcluster/apps/SA-IdentityManagement/local/transforms.conf
[rob1_identities]
filename = rob1_identities.csv
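A quick way to catch a mismatch between the two files is to check that every lookup:// name used in an identity_manager input has a transforms.conf stanza with a filename. The sketch below assumes the .conf files are simple enough to be read by Python's configparser (no multi-line values or other Splunk-specific syntax); missing_lookup_stanzas is a hypothetical helper, not something shipped with Splunk.

```python
import configparser


def read_conf(text):
    """Parse .conf text as INI; only '=' separates keys from values."""
    parser = configparser.RawConfigParser(strict=False, delimiters=("=",))
    parser.read_string(text)
    return parser


def missing_lookup_stanzas(inputs_text, transforms_text):
    """Return lookup names referenced by identity_manager inputs that
    have no transforms.conf stanza with a filename setting."""
    inputs = read_conf(inputs_text)
    transforms = read_conf(transforms_text)
    missing = []
    for section in inputs.sections():
        if not section.startswith("identity_manager://"):
            continue
        url = inputs.get(section, "url", fallback="")
        if not url.startswith("lookup://"):
            continue
        name = url[len("lookup://"):]
        if not (transforms.has_section(name)
                and transforms.has_option(name, "filename")):
            missing.append(name)
    return missing
```

Read the two files from SA-IdentityManagement/local on the deployer and pass their contents in; an empty list means every referenced lookup has a matching stanza in that one file pair. It does not account for stanzas defined in other apps or in default/.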
Make an update to the file rob1_identities.csv on the deployer.
Next I push the bundle from the deployer to the SHC:
On the deployer, run this command (pushing to any one member in the SHC, e.g. https://linux01.sv.splunk.com:8089):
./splunk apply shcluster-bundle -target https://linux01.sv.splunk.com:8089 -auth admin:pwd
Confirm that each member has the updated file:
$SPLUNK_HOME/etc/apps/SA-IdentityManagement/lookups/rob1_identities.csv
Now the next time identity_manager.py runs (every 5 min) it will merge the change into identities_expanded.csv.
A manual file update on one of the SHC members will not work unless that member is the captain. Even then, only identities_expanded.csv would be replicated to the other members, not rob1_identities.csv, so this approach is not advised.
Confirm your file has the correct header format for identities.csv
The most common reason for failure is incorrect formatting or invalid data in the assets.csv or identities.csv lookup files used as the source.
The header must be included in the file and be in this format for identities.csv:
identity,prefix,nick,first,last,suffix,email,phone,phone2,managedBy,priority,bunit,category,watchlist,startDate,endDate,work_city,work_country,work_lat,work_long
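If you want to check a file's header before handing it to Splunk, a small script can compare it against the list above. This is a minimal sketch; header_problems is a hypothetical helper, and it only inspects the first row, not the data.

```python
import csv
import io

# Header fields for identities.csv, as listed above.
EXPECTED_HEADER = [
    "identity", "prefix", "nick", "first", "last", "suffix", "email",
    "phone", "phone2", "managedBy", "priority", "bunit", "category",
    "watchlist", "startDate", "endDate", "work_city", "work_country",
    "work_lat", "work_long",
]


def header_problems(csv_text):
    """Return (missing, unexpected, exact_match) for the first CSV row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = [field.strip() for field in next(reader, [])]
    missing = [f for f in EXPECTED_HEADER if f not in header]
    unexpected = [f for f in header if f not in EXPECTED_HEADER]
    return missing, unexpected, header == EXPECTED_HEADER
```

Call it with the file contents, e.g. header_problems(open("rob1_identities.csv", newline="").read()). Field names are compared case-sensitively, so "managedby" would be reported as unexpected and "managedBy" as missing.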
There are several ways to update the identities.csv file on the SHC members and trigger the merge of the file(s) into identities_expanded.csv
a.) option 1: run a search with outputlookup to update the file:
ie:
... | table identity, prefix, nick, first, last, suffix, email, phone, phone2, managedBy, priority, bunit, category, watchlist, startDate, endDate, work_city, work_country, work_lat, work_long | outputlookup identities.csv
b.) option 2: drop the updated identities.csv file onto each member using rsync
c.) option 3: push the updated identities.csv from the SHC deployer down to a SHC member
./splunk apply shcluster-bundle -target https://rplinux09.sv.splunk.com:8089 -auth admin:pwd
This approach may be less favorable if you have frequent updates to the .csv as the deployer may trigger the SHs to restart for those configuration changes that require it.
Any CSV file must use Unix line endings.
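Files edited on Windows or exported from a spreadsheet often carry CRLF line endings. The following sketch detects and converts them; both function names are hypothetical. Note that it treats every carriage return as a line ending, so a carriage return inside a quoted field would also be flagged and converted.

```python
def has_non_unix_line_endings(data: bytes) -> bool:
    """True if the data contains any carriage return (CRLF or bare CR)."""
    return b"\r" in data


def to_unix_line_endings(data: bytes) -> bytes:
    """Convert CRLF and bare CR line endings to LF."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
```

Read the file in binary mode (open(path, "rb")) so that Python does not translate the line endings before you check them, and write the converted bytes back in binary mode as well.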
Look in $SPLUNK_HOME/var/log/splunk/python_modular_input.log on the SHC captain and see what ERROR logs are reported for your lookup file. This will provide insight into what the problem is.
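If the log is large, a few lines of Python can narrow it down. This sketch simply keeps lines that contain both the text ERROR and your lookup name; it assumes nothing else about the log format, and error_lines is a hypothetical helper.

```python
def error_lines(log_lines, lookup_name):
    """Return lines that mention both 'ERROR' and the given lookup name."""
    return [
        line.rstrip("\n")
        for line in log_lines
        if "ERROR" in line and lookup_name in line
    ]
```

For example, open the log with open(path, errors="replace") and pass the file object together with "rob1_identities". If nothing comes back, try searching for ERROR alone, since the message may not name the lookup.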
A couple of FYIs:
Watch for the inputs.conf stanza being in one context and the transforms.conf stanza in another (which may happen if you use the WebUI).