If you go down this path, make your life easier with a scripted alert action on your deployment server and an app for the transforms. It automates deployment to the parsing nodes and gives you some flexibility about where the filtering is applied. The basic algorithm:

1. Search for high-volume data.
2. Trigger the update alert script.
3. Update the app in deployment-apps: append the offending pattern to the transform's REGEX, i.e. just keep adding |NEW_PATTERN to the end of the REGEX. If you only care about hosts, use [host::*] in props.conf and SOURCE_KEY = MetaData:Host in your transform (see the sketch below).
4. Restart Splunk on the deployment server. This is generally easier than trying to log in from scripts. You can also try doing it via the REST API: https://docs.splunk.com/Documentation/Splunk/8.0.5/RESTREF/RESTsystem#server.2Fcontrol.2Frestart
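A minimal sketch of what the deployed filtering app might contain; the app name, transform name, and host patterns are all hypothetical, and note that values read via MetaData:Host carry a host:: prefix:

    # deployment-apps/filter_noise/local/props.conf
    [host::*]
    TRANSFORMS-filter_noise = drop_noisy_hosts

    # deployment-apps/filter_noise/local/transforms.conf
    # The alert script appends |host::NEW_PATTERN to this REGEX over time
    [drop_noisy_hosts]
    SOURCE_KEY = MetaData:Host
    REGEX = host::(chatty-host-01|chatty-host-02)
    DEST_KEY = queue
    FORMAT = nullQueue

For the restart step, the REST endpoint linked above can be called with something like this (assumes admin credentials and the standard management port):

    curl -k -u admin:changeme -X POST https://deployment-server:8089/services/server/control/restart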
Information related to searches, such as lookup tables, is replicated to the indexers directly via knowledge bundle replication. Basically, every time the local contents of an app on the search head change, those changes are replicated to the indexers. This is why the only apps you need on your indexers are the ones that perform index-time operations. The system even keeps track of which search head each bundle comes from, so different search heads can have different versions of the same assets and the indexers use the right one for the right search.

Be careful of large lookups, however. Because the bundles must be resent every time a lookup changes, large bundles can create significant search latency and can delay when the contents of the lookup are actually used in searches. For frequently changing lookups, consider preventing the lookup from replicating by using local=true in your lookup search command. This keeps the lookup out of the bundle; the lookup then runs on the search head once events arrive there. The advantage is instant access to new lookup values and no search latency from frequent bundle updates. The downsides are that you can't leverage multiple indexers, so filtering on lookup values may be slow (tip: do the lookup as late in the search pipeline as possible), and you can't use the lookup in data models. If you have to use a frequently changing lookup on the indexers, consider truncating it frequently to keep it small. I do this for an ip-to-username lookup used in an accelerated data model: the source query only grabs the latest login per machine/user and only runs every hour, so the bundles stay small and replicate less often.

Back to the larger question of automatic master node pushes: you can probably do this, but you shouldn't. If you set up your deployment server and master node correctly, you can push packages to the master node's master-apps directory instead of the apps directory. And here is a presentation with the details of the command line to validate and push a cluster bundle. Combine that with cron and you have something like automated app pushes to the indexers. The biggest problems are: what happens if the bundle fails validation? What happens if there is a problem during deployment and the rolling restart? I strongly recommend deploying cluster bundles during a maintenance window and manually monitoring the process. A safe partial automation is to set up deployment to the master node as above, so all you have to do from the GUI is validate and deploy.

Another tip: consider creating intermediate heavy forwarders (IFs) to capture your indexing traffic before it hits your index cluster, i.e. point your forwarders at the IF layer as their "indexers". This effectively removes parsing from your index layer AND means almost every app you would have deployed to your indexers now goes on the intermediate forwarders, which will happily work with your deployment server. Search-time configuration still gets replicated to the index cluster via knowledge bundle replication.
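For the local=true pattern, a quick SPL sketch with a hypothetical ip_to_user lookup; doing the lookup after the stats keeps it as late in the pipeline as possible, as suggested above:

    index=web sourcetype=access_combined
    | stats count BY src_ip
    | lookup local=true ip_to_user ip AS src_ip OUTPUT user

And the cluster bundle validate/apply steps mentioned above, run from the master node CLI:

    splunk validate cluster-bundle
    splunk apply cluster-bundle
    splunk show cluster-bundle-status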
Does your SOAR REST API accept a simple POST, perhaps with an authtoken in the URL? If so, you can use the Webhook alert action to POST the results of a search to that URL. Otherwise you are looking at a custom alert action.
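A minimal sketch of the webhook wiring in savedsearches.conf; the search, schedule, and SOAR endpoint are hypothetical:

    [Forward high-severity alerts to SOAR]
    search = index=security sourcetype=ids_alert severity=high
    cron_schedule = */5 * * * *
    enableSched = 1
    action.webhook = 1
    action.webhook.param.url = https://soar.example.com/endpoint?authtoken=YOUR_TOKEN

Keep in mind the stock webhook action sends a fixed JSON payload (alert metadata plus result data); if your SOAR endpoint expects a different shape, you are back to a custom alert action.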
Datamodel acceleration doesn't work like report acceleration. No data is copied. Instead, a pointer to the data in the index is created, so all permissions apply to the index, not the datamodel.
If you want to show some fields and hide others, it is tricky. You can play games with search filters on the role, but that is pretty fragile. If this is important, you should clone the data into another sourcetype/index and redact the data you don't want certain users to see (see the sketch after the spec excerpt below). I think you can still use the same datamodel for both indexes, but you will get duplicate data.
CLONE_SOURCETYPE = <string>
* This name is wrong; a transform with this setting actually clones and
modifies events, and assigns the new events the specified sourcetype.
* If CLONE_SOURCETYPE is used as part of a transform, the transform will
create a modified duplicate event, for all events that the transform is
applied to via normal props.conf rules.
* Use this feature if you need to store both the original and a modified
form of the data in your system, or if you want to send the original and a
modified form to different outbound systems.
* A typical example would be to retain sensitive information according to
one policy and a version with the sensitive information removed
according to another policy. For example, some events may have data
that you must retain for 30 days (such as personally identifying
information) and only 30 days with restricted access, but you need that
event retained without the sensitive data for a longer time with wider access.
* Specifically, for each event handled by this transform, a near-exact copy
is made of the original event, and the transformation is applied to the
copy. The original event will continue along normal data processing unimpeded.
* The <string> used for CLONE_SOURCETYPE selects the sourcetype that will be
used for the duplicated events.
* The new sourcetype MUST differ from the original sourcetype. If the
original sourcetype is the same as the target of the CLONE_SOURCETYPE,
Splunk will make a best effort to log warnings to splunkd.log, but this
setting will be silently ignored at runtime for such cases, causing the
transform to be applied to the original event without cloning.
* The duplicated events will receive index-time transformations & sed
commands of all transforms which match its new host/source/sourcetype.
* This means that props matching on host or source will incorrectly be
applied a second time. (SPL-99120)
* Can only be used as part of an otherwise-valid index-time transform. For
example REGEX is required, there must be a valid target (DEST_KEY or
WRITE_META), etc as above.
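To tie this back to the redaction suggestion above, here is a sketch of one clone-and-redact setup. The sourcetype names, target index, and SSN pattern are hypothetical, and the DEST_KEY/FORMAT pair on the clone transform is just a no-op target to satisfy the "otherwise-valid index-time transform" requirement:

    # props.conf
    [myapp:raw]
    TRANSFORMS-clone = clone_for_redaction

    [myapp:redacted]
    # Per the spec above, the clone picks up the SEDCMD and transforms
    # that match its new sourcetype
    SEDCMD-scrub_ssn = s/\d{3}-\d{2}-\d{4}/XXX-XX-XXXX/g
    TRANSFORMS-route = route_redacted_copy

    # transforms.conf
    [clone_for_redaction]
    REGEX = .
    CLONE_SOURCETYPE = myapp:redacted
    DEST_KEY = queue
    FORMAT = indexQueue

    [route_redacted_copy]
    REGEX = .
    DEST_KEY = _MetaData:Index
    FORMAT = redacted_longterm

The original events stay in their short-retention, restricted index; the redacted clones land in redacted_longterm, where retention and role-based access can be wider.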
I just went through this.
The F5 app accelerates 50 models by default. Make sure you need all of those.
Basically, every 5 minutes it is trying to run 3 acceleration searches times 50 models, each over a three-month period. That is absurd. I would start by disabling all of them, look at what data you actually have and/or care about, and selectively re-enable models as needed.
Also consider going into advanced acceleration settings and reducing the concurrent acceleration searches to 1 if you need to enable a bunch of these.
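If you would rather manage this in configuration than through the UI, the equivalent knobs live in datamodels.conf in the app's local directory; the stanza names below are placeholders for the F5 app's model names:

    # datamodels.conf
    [F5_Model_To_Skip]
    acceleration = 0

    [F5_Model_To_Keep]
    acceleration = 1
    # Shorten the summary range and serialize the acceleration searches
    acceleration.earliest_time = -7d
    acceleration.max_concurrent = 1

acceleration.max_concurrent defaults to 3, which is where the "3 acceleration searches per model" behavior comes from.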