Getting Data In

The TCP output processor has paused the data flow

splunkcol
Contributor

The device sends its logs via syslog to the heavy forwarder, which receives them, stores them, and tries to forward them to the indexers, but the errors shown below appear.

 

Search Head 1        Search Head 2
      ↑                    ↑
   Index1               Index2
      ↑                    ↑
Heavy Forwarder      Heavy Forwarder

 

Error 1:
09-07-2020 01:19:40.949 -0500 WARN TcpOutputProc - The TCP output processor has paused the data flow. Forwarding to host_dest=(ip of indexer) inside output group default-autolb-group from host_src=(ip folder source) has been blocked for blocked_seconds=100. This can stall the data flow towards indexing and other network outputs. Review the receiving system's health in the Splunk Monitoring Console. It is probably not accepting data.
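When many of these warnings pile up, it can help to pull out just the destination and the blocked duration. The following is a minimal Python sketch, assuming the key=value layout seen in the warning above; the function name `parse_blocked_warning` and the IP addresses in the sample are hypothetical placeholders, not values from this thread.

```python
import re

# Matches key=value pairs where the value is either parenthesised or a bare token.
FIELD_RE = re.compile(r"(\w+)=(\([^)]*\)|\S+)")

def parse_blocked_warning(line):
    """Return host_dest, host_src and blocked_seconds from a TcpOutputProc warning, or None."""
    if "TcpOutputProc" not in line or "blocked_seconds=" not in line:
        return None
    # Strip the sentence punctuation that follows a value such as "100."
    fields = {k: v.rstrip(".,") for k, v in FIELD_RE.findall(line)}
    return {
        "host_dest": fields.get("host_dest"),
        "host_src": fields.get("host_src"),
        "blocked_seconds": int(fields["blocked_seconds"]),
    }

# Hypothetical sample: the IP addresses are placeholders, not values from the thread.
sample = (
    "09-07-2020 01:19:40.949 -0500 WARN TcpOutputProc - The TCP output processor "
    "has paused the data flow. Forwarding to host_dest=10.0.0.5 inside output group "
    "default-autolb-group from host_src=10.0.0.9 has been blocked for "
    "blocked_seconds=100. This can stall the data flow."
)
print(parse_blocked_warning(sample))
```

A steadily growing blocked_seconds for the same host_dest suggests that the receiver, rather than the forwarder, is the side to investigate.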

splunkcol_0-1599500637706.png

 

Error 2 
09-07-2020 10:50:05.169 -0500 WARN TailReader - Enqueuing a very large file=/xxx/xxx/xxxxxx/2020/09/07/user.log in the batch reader, with bytes_to_read=1846637230, reading of other large files could be delayed
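To put the bytes_to_read value in perspective, here is a small Python sketch that converts it to gibibytes; the helper name `bytes_to_gib` is only illustrative.

```python
def bytes_to_gib(n_bytes):
    """Convert a byte count to gibibytes (1 GiB = 2**30 bytes)."""
    return n_bytes / 2**30

bytes_to_read = 1846637230  # value reported in the TailReader warning
print(f"{bytes_to_gib(bytes_to_read):.2f} GiB")
```

This works out to roughly 1.72 GiB for a single user.log file.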


splunkcol_1-1599501139159.png

 

 

0 Karma

splunkcol
Contributor

@soutamo These are the jobs that I see in real time:

splunkcol_0-1600108651335.png

317 results with this query 😅

| rest splunk_server=local /servicesNS/-/-/saved/searches
| search is_scheduled=1 disabled=0
| fields dispatch.earliest_time dispatch.latest_time eai:acl.owner eai:acl.sharing search cron_schedule schedule_window title
| rename dispatch.earliest_time as earliest_time, dispatch.latest_time as latest_time, eai:acl.owner as owner, eai:acl.sharing as sharing
| table title cron_schedule earliest_time latest_time schedule_window owner sharing

Of these, 149 have cron_schedule 0 * * * * (the top of every hour):

title | cron_schedule
Audit - Script Errors | 0 * * * *
Endpoint - High Number Of Infected Hosts - Rule | 0 * * * *
Endpoint - Update Signature Reference - Lookup Gen | 0 * * * *
ESCU - Access LSASS Memory for Dump Creation - Rule | 0 * * * *
ESCU - Attempt To Add Certificate To Untrusted Store - Rule | 0 * * * *
ESCU - Attempt To Set Default PowerShell Execution Policy To Unrestricted or Bypass - Rule | 0 * * * *
ESCU - Attempt To Stop Security Service - Rule | 0 * * * *
ESCU - Attempted Credential Dump From Registry via Reg exe - Rule | 0 * * * *
ESCU - Batch File Write to System32 - Rule | 0 * * * *
ESCU - Child Processes of Spoolsv exe - Rule | 0 * * * *
ESCU - Clients Connecting to Multiple DNS Servers - Rule | 0 * * * *
ESCU - Common Ransomware Extensions - Rule | 0 * * * *
ESCU - Common Ransomware Notes - Rule | 0 * * * *
ESCU - Create local admin accounts using net exe - Rule | 0 * * * *
ESCU - Create or delete windows shares using net exe - Rule | 0 * * * *
ESCU - Create Remote Thread into LSASS - Rule | 0 * * * *
ESCU - Creation of Shadow Copy - Rule | 0 * * * *
ESCU - Creation of Shadow Copy with wmic and powershell - Rule | 0 * * * *
ESCU - Credential Dumping via Copy Command from Shadow Copy - Rule | 0 * * * *
ESCU - Credential Dumping via Symlink to Shadow Copy - Rule | 0 * * * *
ESCU - Deleting Shadow Copies - Rule | 0 * * * *
ESCU - Detect Activity Related to Pass the Hash Attacks - Rule | 0 * * * *
ESCU - Detect API activity from users without MFA - Rule | 0 * * * *
ESCU - Detect attackers scanning for vulnerable JBoss servers - Rule | 0 * * * *
ESCU - Detect Credential Dumping through LSASS access - Rule | 0 * * * *
ESCU - Detect DNS requests to Phishing Sites leveraging EvilGinx2 - Rule | 0 * * * *
ESCU - Detect Excessive Account Lockouts From Endpoint - Rule | 0 * * * *
ESCU - Detect Excessive User Account Lockouts - Rule | 0 * * * *
ESCU - Detect hosts connecting to dynamic domain providers - Rule | 0 * * * *
ESCU - Detect Long DNS TXT Record Response - Rule | 0 * * * *
ESCU - Detect malicious requests to exploit JBoss servers - Rule | 0 * * * *
ESCU - Detect Mimikatz Using Loaded Images - Rule | 0 * * * *
ESCU - Detect Mimikatz Via PowerShell And EventCode 4703 - Rule | 0 * * * *
ESCU - Detect mshta exe running scripts in command-line arguments - Rule | 0 * * * *
ESCU - Detect new API calls from user roles - Rule | 0 * * * *
ESCU - Detect New Local Admin account - Rule | 0 * * * *
ESCU - Detect New Login Attempts to Routers - Rule | 0 * * * *
ESCU - Detect New Open S3 buckets - Rule | 0 * * * *
ESCU - Detect Oulook exe writing a  zip file - Rule | 0 * * * *
ESCU - Detect Path Interception By Creation Of program exe - Rule | 0 * * * *
ESCU - Detect processes used for System Network Configuration Discovery - Rule | 0 * * * *
ESCU - Detect Prohibited Applications Spawning cmd exe - Rule | 0 * * * *
ESCU - Detect PsExec With accepteula Flag - Rule | 0 * * * *
ESCU - Detect S3 access from a new IP - Rule | 0 * * * *
ESCU - Detect Spike in Network ACL Activity - Rule | 0 * * * *
ESCU - Detect Spike in S3 Bucket deletion - Rule | 0 * * * *
ESCU - Detect Spike in Security Group Activity - Rule | 0 * * * *
ESCU - Detect Unauthorized Assets by MAC address - Rule | 0 * * * *
ESCU - Detect USB device insertion - Rule | 0 * * * *
ESCU - Detect Use of cmd exe to Launch Script Interpreters - Rule | 0 * * * *
ESCU - Detect web traffic to dynamic domain providers - Rule | 0 * * * *
ESCU - Detection of DNS Tunnels - Rule | 0 * * * *
ESCU - Detection of tools built by NirSoft - Rule | 0 * * * *
ESCU - Disabling Remote User Account Control - Rule | 0 * * * *
ESCU - DNS Query Length Outliers - MLTK - Rule | 0 * * * *
ESCU - DNS Query Length With High Standard Deviation - Rule | 0 * * * *
ESCU - DNS Query Requests Resolved by Unauthorized DNS Servers - Rule | 0 * * * *
ESCU - DNS record changed - Rule | 0 * * * *
ESCU - Dump LSASS via comsvcs DLL - Rule | 0 * * * *
ESCU - Email Attachments With Lots Of Spaces - Rule | 0 * * * *
ESCU - Email files written outside of the Outlook directory - Rule | 0 * * * *
ESCU - Email servers sending high volume traffic to hosts - Rule | 0 * * * *
ESCU - Excessive DNS Failures - Rule | 0 * * * *
ESCU - Execution of File with Multiple Extensions - Rule | 0 * * * *
ESCU - Execution of File With Spaces Before Extension - Rule | 0 * * * *
ESCU - Extended Period Without Successful Netbackup Backups - Rule | 0 * * * *
ESCU - File with Samsam Extension - Rule | 0 * * * *
ESCU - First Time Seen Child Process of Zoom - Rule | 0 * * * *
ESCU - First time seen command line argument - Rule | 0 * * * *
ESCU - First Time Seen Running Windows Service - Rule | 0 * * * *
ESCU - GCP GCR container uploaded - Rule | 0 * * * *
ESCU - GCP Kubernetes cluster scan detection - Rule | 0 * * * *
ESCU - Hiding Files And Directories With Attrib exe - Rule | 0 * * * *
ESCU - Hosts receiving high volume of network traffic from email server - Rule | 0 * * * *
ESCU - Identify New User Accounts - Rule | 0 * * * *
ESCU - Kerberoasting spn request with RC4 encryption - Rule | 0 * * * *
ESCU - Large Volume of DNS ANY Queries - Rule | 0 * * * *
ESCU - MacOS - Re-opened Applications - Rule | 0 * * * *
ESCU - Malicious PowerShell Process - Connect To Internet With Hidden Window - Rule | 0 * * * *
ESCU - Malicious PowerShell Process - Encoded Command - Rule | 0 * * * *
ESCU - Malicious PowerShell Process - Execution Policy Bypass - Rule | 0 * * * *
ESCU - Malicious PowerShell Process - Multiple Suspicious Command-Line Arguments - Rule | 0 * * * *
ESCU - Malicious PowerShell Process With Obfuscation Techniques - Rule | 0 * * * *
ESCU - Monitor DNS For Brand Abuse - Rule | 0 * * * *
ESCU - Monitor Email For Brand Abuse - Rule | 0 * * * *
ESCU - Monitor Registry Keys for Print Monitors - Rule | 0 * * * *
ESCU - Monitor Web Traffic For Brand Abuse - Rule | 0 * * * *
ESCU - No Windows Updates in a time frame - Rule | 0 * * * *
ESCU - Open Redirect in Splunk Web - Rule | 0 * * * *
ESCU - Osquery pack - ColdRoot detection - Rule | 0 * * * *
ESCU - Overwriting Accessibility Binaries - Rule | 0 * * * *
ESCU - Process Execution via WMI - Rule | 0 * * * *
ESCU - Processes created by netsh - Rule | 0 * * * *
ESCU - Processes launching netsh - Rule | 0 * * * *
ESCU - Processes Tapping Keyboard Events - Rule | 0 * * * *
ESCU - Prohibited Network Traffic Allowed - Rule | 0 * * * *
ESCU - Prohibited Software On Endpoint - Rule | 0 * * * *
ESCU - Protocol or Port Mismatch - Rule | 0 * * * *
ESCU - Protocols passing authentication in cleartext - Rule | 0 * * * *
ESCU - Reg exe Manipulating Windows Services Registry Keys - Rule | 0 * * * *
ESCU - Reg exe used to hide files directories via registry keys - Rule | 0 * * * *
ESCU - Registry Keys for Creating SHIM Databases - Rule | 0 * * * *
ESCU - Registry Keys Used For Persistence - Rule | 0 * * * *
ESCU - Registry Keys Used For Privilege Escalation - Rule | 0 * * * *
ESCU - Remote Desktop Network Bruteforce - Rule | 0 * * * *
ESCU - Remote Desktop Network Traffic - Rule | 0 * * * *
ESCU - Remote Desktop Process Running On System - Rule | 0 * * * *
ESCU - Remote Process Instantiation via WMI - Rule | 0 * * * *
ESCU - Remote Registry Key modifications - Rule | 0 * * * *
ESCU - Remote WMI Command Attempt - Rule | 0 * * * *
ESCU - RunDLL Loading DLL By Ordinal - Rule | 0 * * * *
ESCU - Samsam Test File Write - Rule | 0 * * * *
ESCU - Sc exe Manipulating Windows Services - Rule | 0 * * * *
ESCU - Scheduled Task Name Used by Dragonfly Threat Actors - Rule | 0 * * * *
ESCU - Scheduled tasks used in BadRabbit ransomware - Rule | 0 * * * *
ESCU - Schtasks scheduling job on remote system - Rule | 0 * * * *
ESCU - Schtasks used for forcing a reboot - Rule | 0 * * * *
ESCU - Script Execution via WMI - Rule | 0 * * * *
ESCU - Shim Database File Creation - Rule | 0 * * * *
ESCU - Shim Database Installation With Suspicious Parameters - Rule | 0 * * * *
ESCU - Short Lived Windows Accounts - Rule | 0 * * * *
ESCU - Single Letter Process On Endpoint - Rule | 0 * * * *
ESCU - SMB Traffic Spike - MLTK - Rule | 0 * * * *
ESCU - SMB Traffic Spike - Rule | 0 * * * *
ESCU - Spectre and Meltdown Vulnerable Systems - Rule | 0 * * * *
ESCU - Spike in File Writes - Rule | 0 * * * *
ESCU - SQL Injection with Long URLs - Rule | 0 * * * *
ESCU - Suspicious Changes to File Associations - Rule | 0 * * * *
ESCU - Suspicious Email - UBA Anomaly - Rule | 0 * * * *
ESCU - Suspicious Email Attachment Extensions - Rule | 0 * * * *
ESCU - Suspicious File Write - Rule | 0 * * * *
ESCU - Suspicious LNK file launching a process - Rule | 0 * * * *
ESCU - Suspicious wevtutil Usage - Rule | 0 * * * *
ESCU - System Processes Run From Unexpected Locations - Rule | 0 * * * *
ESCU - TOR Traffic - Rule | 0 * * * *
ESCU - Uncommon Processes On Endpoint - Rule | 0 * * * *
ESCU - Unload Sysmon Filter Driver - Rule | 0 * * * *
ESCU - Unusually Long Command Line - MLTK - Rule | 0 * * * *
ESCU - Unusually Long Command Line - Rule | 0 * * * *
ESCU - USN Journal Deletion - Rule | 0 * * * *
ESCU - Web Fraud - Password Sharing Across Accounts - Rule | 0 * * * *
ESCU - Web Servers Executing Suspicious Processes - Rule | 0 * * * *
ESCU - Windows Event Log Cleared - Rule | 0 * * * *
ESCU - Windows hosts file modification - Rule | 0 * * * *
ESCU - WMI Permanent Event Subscription - Rule | 0 * * * *
ESCU - WMI Permanent Event Subscription - Sysmon - Rule | 0 * * * *
ESCU - WMI Temporary Event Subscription - Rule | 0 * * * *
Threat - Refresh Governance - Administrative | 0 * * * *
Threat - Refresh Reviewstatuses - Administrative | 0 * * * *
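
Because every search in this list shares the schedule 0 * * * *, all 149 are due to start in the same minute. One possible way to reduce that burst is to spread the start minutes across the hour. The Python sketch below only illustrates the arithmetic; the function name `stagger_hourly` is hypothetical, nothing in this thread confirms that these ES searches can safely be rescheduled, and any change should be reviewed with someone who knows ES.

```python
def stagger_hourly(titles):
    """Map each title to an hourly cron expression with an evenly spread minute."""
    n = len(titles)
    return {
        title: f"{(i * 60) // n} * * * *"
        for i, title in enumerate(titles)
    }

# Three titles taken from the list above; the offsets are illustrative only.
titles = [
    "Audit - Script Errors",
    "Endpoint - High Number Of Infected Hosts - Rule",
    "ESCU - TOR Traffic - Rule",
]
for title, cron in stagger_hourly(titles).items():
    print(f"{cron:<12} {title}")
```

With 149 titles, the same formula would place at most three searches on any single minute instead of all 149 on minute 0.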
0 Karma

soutamo
SplunkTrust
You should definitely go through these with people who have a good understanding of ES. Unfortunately, I cannot help you with this (yet). But your main issue seems to be that too much is running on one virtual machine at the same time.
If you don't find anyone here, you should contact your local Splunk partner or Splunk PS directly to get help with this.
r. Ismo


splunkcol
Contributor

Thanks for your help.

0 Karma

soutamo
SplunkTrust

That error means that, for some reason, your indexer does not want to receive more logs for indexing until it has finished what it is currently doing. You can find out what that is with the Monitoring Console (MC). If you have set it up in distributed mode, you can easily check this from one place; otherwise you must check each node one by one. You can find the MC under Settings. Then select indexing performance (individual node) and you will see where the bottleneck is. If needed, also look at the resource usage items.
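
Outside the Monitoring Console, a rough way to see which queue is filling up is to look at the group=queue lines that splunkd writes to metrics.log. The following Python sketch assumes those lines carry name, max_size_kb, and current_size_kb as key=value pairs; the sample line and its numbers are made up for illustration, and the function name `queue_fill` is hypothetical.

```python
import re

# key=value pairs separated by commas and spaces
KV_RE = re.compile(r"(\w+)=([^,\s]+)")

def queue_fill(line):
    """Return (queue name, fill percentage, blocked flag) for a group=queue metrics line, else None."""
    fields = dict(KV_RE.findall(line))
    if fields.get("group") != "queue":
        return None
    max_kb = float(fields["max_size_kb"])
    cur_kb = float(fields["current_size_kb"])
    pct = 100.0 * cur_kb / max_kb if max_kb else 0.0
    return fields["name"], round(pct, 1), fields.get("blocked") == "true"

# Illustrative line in the key=value style of metrics.log; the numbers are made up.
sample = (
    "09-07-2020 01:19:40.949 -0500 INFO Metrics - group=queue, name=indexqueue, "
    "blocked=true, max_size_kb=500, current_size_kb=499, current_size=1200, "
    "largest_size=1300, smallest_size=0"
)
print(queue_fill(sample))
```

A queue that sits near 100% for long stretches is usually the place where the pipeline is stuck, and the queues upstream of it fill up in turn.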

r. Ismo

0 Karma

splunkcol
Contributor

Hello, thank you very much for your answer.

This is the information that appears. The last time logs arrived at the indexers from the heavy forwarder was September 3.

What can be concluded from these graphs?

splunkcol_0-1599506959575.png

 

splunkcol_1-1599507003936.png

splunkcol_2-1599507076412.png

splunkcol_3-1599507122499.png

 

0 Karma

soutamo
SplunkTrust

This last picture shows that your indexing queue is full, which then blocks all the other queues and also the input from the HF. Can you also post the status of your resource usage (Resource usage: Machine) for disk, especially IOPS?

0 Karma

splunkcol
Contributor

Sorry 

splunkcol_0-1599598082031.png

 

0 Karma

soutamo
SplunkTrust
I meant the ones on the MC side.
0 Karma

splunkcol
Contributor

Done

To clarify, this image is from the search head.

0 Karma

soutamo
SplunkTrust
The resource usage of the node/host is more interesting than that of the instance, especially the IOPS charts.
0 Karma

splunkcol
Contributor

 

splunkcol_0-1599598414645.png

splunkcol_1-1599598629056.png

splunkcol_2-1599598668509.png

splunkcol_3-1599598701116.png

splunkcol_4-1599598736328.png

splunkcol_5-1599598765690.png

splunkcol_6-1599598810915.png

 

0 Karma

soutamo
SplunkTrust
Thanks. Can you also add Average I/O Usage and Performance with Mount Point = your Splunk filesystems, and Overlay I/O Bandwidth Utilization?
This shows how heavily your I/O system is utilised. Service Time and Wait Time are also good to know.
0 Karma

splunkcol
Contributor

I have not been able to share the information because the Monitoring Console is not showing me any data now 😥

 

"Search results might be incomplete: the search process on the peer:indexxxxx ended prematurely. Check the peer log, such as $SPLUNK_HOME/var/log/splunk/splunkd.log and as well as the search.log for the particular search."

 

0 Karma

soutamo
SplunkTrust
This can also indicate a disk I/O performance issue. Basically, you don't have enough IOPS on the disks you are using. The absolute minimum is 800 IOPS for each node you have. I prefer much more than that.

So my proposal is that you get more disk IOPS. Whether that means a new disk, a new host, or something else depends on your system.
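
If the Monitoring Console charts are not available, IOPS on a Linux host can be estimated from two snapshots of /proc/diskstats, where the fourth and eighth columns are reads completed and writes completed. The Python sketch below uses synthetic snapshots and a hypothetical device name "sda"; the function names are illustrative only.

```python
def completed_ios(diskstats_text, device):
    """Sum reads and writes completed for one device from /proc/diskstats text."""
    for row in diskstats_text.splitlines():
        cols = row.split()
        if len(cols) >= 8 and cols[2] == device:
            return int(cols[3]) + int(cols[7])  # reads completed + writes completed
    raise ValueError(f"device {device!r} not found")

def iops(before, after, device, interval_s):
    """Average I/O operations per second between two snapshots."""
    return (completed_ios(after, device) - completed_ios(before, device)) / interval_s

# Synthetic snapshots taken 10 seconds apart for a hypothetical device "sda".
before = "8 0 sda 1000 0 8000 50 2000 0 16000 90 0 100 140"
after = "8 0 sda 3000 0 24000 150 8000 0 64000 400 0 500 550"
print(iops(before, after, "sda", 10))
```

On a real host, the two snapshots would come from reading /proc/diskstats twice with a known pause in between. Note that this measures the IOPS currently being used, not the maximum the storage can deliver.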
r. Ismo
0 Karma

splunkcol
Contributor

Because I cannot see IOPS in the Monitoring Console, I checked directly on Linux with the command iotop -o:

Splunk Enterprise Security = high use of IOPS
Indexer = high use of IOPS
Indexer2 = high use of IOPS
HF = low use
HF = low use

I see that the server is a virtual machine. Should I request that the IOPS be increased for the VM?

 

splunkcol_0-1600104113737.png

 

0 Karma

soutamo
SplunkTrust
You seem to have quite a lot of real-time searches running (if I understood that screenshot correctly). My feeling is that, with your current configuration, your node is totally overbooked (not only I/O-wise, but also with RT queries etc.). @richgalloway, can you point to ES setup guidance etc., as there is probably something there that can help with this situation?
R. Ismo
0 Karma

splunkcol
Contributor

I do not have control of the physical or local infrastructure; I only have access to the servers.

0 Karma