Splunk Search

tstats with where condition | StatsFileWriterLz4 file open failed file=C:\Splunk\var\run\splunk\srtemp\910252184_17768_a

iamtheclient20
Explorer

I have a SPL, when first running the result is appearing but once the query is finished the error have shown below:

| tstats `summariesonly`
count(All_Traffic.dest_ip) as destination_ip_count,
count(All_Traffic.src_ip) as source_ip_count,
count(All_Traffic.dest_port) as destination_port_count,
count(All_Traffic.src_port) as source_port_count
from datamodel=Network_Traffic.All_Traffic
by
All_Traffic.src_ip,
All_Traffic.src_port,
All_Traffic.dest_ip,
All_Traffic.protocol,
All_Traffic.src_zone,
All_Traffic.protocol_version,
All_Traffic.action,
_time
| lookup 3rd_party_network_connections_vendor_ip.csv index_ip as All_Traffic.src_ip OUTPUT value_ip
| where isnotnull(value_ip) AND All_Traffic.src_port !="53" AND (All_Traffic.action="blocked" OR All_Traffic.action="denied" OR All_Traffic.action="failed") AND source_ip_count > 40 AND destination_ip_count > 40



-----------------------

The error

StatsFileWriterLz4 file open failed file=C:\Splunk\var\run\splunk\srtemp\910252184_17768_at_1638875294.1\statstmp_merged_5.sb.lz4

 

-------------

May you validate if my SPL query is correct or not?

Thanks

 

 

Labels (2)
Tags (1)

sideview
SplunkTrust
SplunkTrust

We are seeing the same thing too.   We are seeing it with a long running job (~20mins) that is pulling a few million records.

Note that the file splunkd is trying to open is in /var/run/ but not in /var/run/dispatch/<sid>   so this is some internal secret cache that splunkd is using.

The relevant excerpt from the (frankly enormous) search.log is below.

tl;dr is that it says the file is being used by another process.
A couple seconds later it tries to remove the directory and that fails because the directory is not empty.
And then a few more seconds and suddenly it triggers splunkd shutdown,   in the search.log.  which.... makes no sense because customer did not report that splunk shut down.  

 

.....
nothing too interesting and then
.....
01-06-2022 12:33:09.439 INFO UserManager [7352 SearchResultExecutorThread] - Unwound user context: admin -> NULL
01-06-2022 12:33:09.439 INFO UserManager [7700 SearchResultExecutorThread] - Unwound user context: admin -> NULL
01-06-2022 12:33:09.457 INFO ResultsCollationProcessor [4780 phase_1] - Writing remote_event_providers.csv to disk
01-06-2022 12:33:09.459 INFO DispatchExecutor [4780 phase_1] - END OPEN: Processor=stats
01-06-2022 12:33:09.459 INFO DispatchExecutor [4780 phase_1] - BEGIN OPEN: Processor=timechart
01-06-2022 12:33:09.673 ERROR StatsFileIO [4780 phase_1] - open failed for file=C:\Program Files\Splunk\var\run\splunk\srtemp\2075148042_2268_at_1641492677.2\statstmp_merged_0.sb.lz4, error: The process cannot access the file because it is being used by another process.
01-06-2022 12:33:09.674 INFO ReducePhaseExecutor [4780 phase_1] - Not downloading remote search.log files. Reason: No remote event providers.
01-06-2022 12:33:09.674 INFO ReducePhaseExecutor [4780 phase_1] - Not downloading remote search_telemetry.json files. Reason: No remote event providers.
01-06-2022 12:33:09.674 INFO ReducePhaseExecutor [4780 phase_1] - Ending phase_1
01-06-2022 12:33:09.674 INFO UserManager [4780 phase_1] - Unwound user context: admin -> NULL
01-06-2022 12:33:09.674 ERROR SearchOrchestrator [8764 searchOrchestrator] - Phase_1 failed due to : StatsFileWriterLz4 file open failed file=C:\Program Files\Splunk\var\run\splunk\srtemp\2075148042_2268_at_1641492677.2\statstmp_merged_0.sb.lz4
01-06-2022 12:33:09.674 INFO ReducePhaseExecutor [5628 StatusEnforcerThread] - ReducePhaseExecutor=1 action=CANCEL
01-06-2022 12:33:09.674 INFO DispatchExecutor [5628 StatusEnforcerThread] - User applied action=CANCEL while status=0
01-06-2022 12:33:09.675 ERROR SearchStatusEnforcer [5628 StatusEnforcerThread] - sid:1641492672.2086 StatsFileWriterLz4 file open failed file=C:\Program Files\Splunk\var\run\splunk\srtemp\2075148042_2268_at_1641492677.2\statstmp_merged_0.sb.lz4
01-06-2022 12:33:09.675 INFO SearchStatusEnforcer [5628 StatusEnforcerThread] - State changed to FAILED due to: StatsFileWriterLz4 file open failed file=C:\Program Files\Splunk\var\run\splunk\srtemp\2075148042_2268_at_1641492677.2\statstmp_merged_0.sb.lz4
01-06-2022 12:33:09.703 INFO UserManager [5628 StatusEnforcerThread] - Unwound user context: admin -> NULL
01-06-2022 12:33:09.710 INFO DispatchStorageManager [8764 searchOrchestrator] - Remote storage disabled for search artifacts.
01-06-2022 12:33:09.711 INFO DispatchManager [8764 searchOrchestrator] - DispatchManager::dispatchHasFinished(id='1641492672.2086', username='admin')
01-06-2022 12:33:09.711 INFO UserManager [8764 searchOrchestrator] - Unwound user context: admin -> NULL
01-06-2022 12:33:09.711 INFO SearchOperator:kv [6548 RunDispatch] - Extractor stats: name=_autoKvStats, probes=80, total_time_ms=1200, max_time_ms=15
01-06-2022 12:33:10.248 INFO ISearchOperator [6548 RunDispatch] - 0x30cebfd120 PREAD_HISTOGRAM: usec_1_8=4644 usec_8_64=0 usec_64_512=0 usec_512_4096=61 usec_4096_32768=0 usec_32768_262144=0 usec_262144_INF=0
01-06-2022 12:33:11.579 WARN StatsProcessorV2 [6548 RunDispatch] - Failed to remove temporary directory: C:\Program Files\Splunk\var\run\splunk\srtemp\2075148042_2268_at_1641492677.2: 1 errors occurred. Description for first 1: [{operation:"failed to remove directory", error:"The directory is not empty.", file:"C:\Program Files\Splunk\var\run\splunk\srtemp\2075148042_2268_at_1641492677.2"}]
01-06-2022 12:33:16.835 INFO SearchStatusEnforcer [6548 RunDispatch] - SearchStatusEnforcer is already terminated
01-06-2022 12:33:16.837 INFO UserManager [6548 RunDispatch] - Unwound user context: admin -> NULL
01-06-2022 12:33:16.837 INFO LookupDataProvider [6548 RunDispatch] - Clearing out lookup shared provider map
01-06-2022 12:33:16.846 INFO ShutdownHandler [7912 Shutdown] - Shutting down splunkd
01-06-2022 12:33:16.846 INFO ShutdownHandler [7912 Shutdown] - shutting down level "ShutdownLevel_Begin"
01-06-2022 12:33:16.846 INFO ShutdownHandler [7912 Shutdown] - shutting down level "ShutdownLevel_NoahHealthReport"
01-06-2022 12:33:16.846 INFO ShutdownHandler [7912 Shutdown] - shutting down level "ShutdownLevel_FileIntegrityChecker"
01-06-2022 12:33:16.846 INFO ShutdownHandler [7912 Shutdown] - shutting down level "ShutdownLevel_JustBeforeKVStore"
01-06-2022 12:33:16.846 INFO ShutdownHandler [7912 Shutdown] - shutting down level "ShutdownLevel_KVStore"
01-06-2022 12:33:16.846 INFO ShutdownHandler [7912 Shutdown] - shutting down level "ShutdownLevel_DFM"
01-06-2022 12:33:16.846 INFO ShutdownHandler [7912 Shutdown] - shutting down level "ShutdownLevel_Thruput"
.....
And then it goes back to tens of thousands of messages about various calcfields not being extracted because they're not needed. 
.....

 

0 Karma

cboillot
Contributor

I'm seeing the same thing. I am on version 8.2.1. I have read where upgrading to 8.2.4 fixed it. Is this true? Have we figured out the cause?

Tags (1)
0 Karma

burwell
SplunkTrust
SplunkTrust

We are on 8.2.7 and just saw this error so not fixed in 8.2.4

It was a very long running job with a lot, lot of events.

0 Karma

iamtheclient20
Explorer

Hi, you mean in 8.2.4 if the search query took so long to finish it will return that kind of error like statsfilewriter?

0 Karma

burwell
SplunkTrust
SplunkTrust

My point was someone asked if fixed in 8.2.4 and it is not.

I ran the search as admin and it should not have failed. There are no other errors for this head at that time so I believe this is a bug.

0 Karma

iamtheclient20
Explorer

Ah I see, thanks. 

0 Karma

iamtheclient20
Explorer

Hi, you mean in 8.2.4 if the search query took so long to finish it will return that kind of error like statsfilewriter. 

0 Karma

richgalloway
SplunkTrust
SplunkTrust

Check the search log for messages related to the "file open failed" message.  Then correct the file system error (permissions, etc.).

---
If this reply helps you, Karma would be appreciated.
Get Updates on the Splunk Community!

Take Your Breath Away with Splunk Risk-Based Alerting (RBA)

WATCH NOW!The Splunk Guide to Risk-Based Alerting is here to empower your SOC like never before. Join Haylee ...

SignalFlow: What? Why? How?

What is SignalFlow? Splunk Observability Cloud’s analytics engine, SignalFlow, opens up a world of in-depth ...

Federated Search for Amazon S3 | Key Use Cases to Streamline Compliance Workflows

Modern business operations are supported by data compliance. As regulations evolve, organizations must ...