Is there a way to search for all Splunk error messages? I'm looking for a solution to "Error in 'rex' command: Invalid argument: '(' The search job has failed due to an error."

MarkSplunker
Explorer

Question 1: Is there a centralized place to search for all Splunk error messages? Searching answers.splunk.com I've not been able to find a reference to, or solution for,

"Error in 'rex' command: Invalid argument: '(' The search job has failed due to an error. You may be able view the job in the Job Inspector."

Question 2: Why does this rex query work fine in a search, but then fail when used in both a primary and a subsearch? I need to parse fields in both places. I built an initial query that worked fine alone, then created a subsearch and copied/pasted the rex into it. It now fails with

"Error in 'rex' command: Invalid argument: '(' The search job has failed due to an error. You may be able view the job in the Job Inspector." 

What do you think is going on, and how do I fix it? The purpose is to find Devices with Tasks that failed at one time, but where a later Task succeeded. Thanks so much.

Here is the code, although for some reason the * asterisks after each dot (.) in the regexes don't seem to come through in the preview window:

source="File1.csv" index="inventory-legacy"
| regex Notes="^Succ.*"
| transaction Description
| rex field=Description "^(?<TaskID>[^-]+).*"
| rex field=Description "^[^-]+-(?<DeviceName>.*)"
    [ search source="File1.csv" index="inventory-legacy"
    | regex Notes="^Fail.*"
    | transaction Description
    | rex field=Description "^(?<TaskID>[^-]+).*"
    | rex field=Description "^[^-]+-(?<DeviceName>.*)"
    | dedup DeviceName, TaskID
    | fields DeviceName ]
| sort -_time, +TaskID, +DeviceName
| table _time, TaskID, DeviceName, Description, Notes

richgalloway
SplunkTrust

To search for error messages, you'll need access to the _internal index. Replace "splunkd" with the name of any other Splunk log file you wish to view.

index=_internal source="*/splunkd.log" | ...
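
As a minimal sketch of narrowing that search to error-level entries only, assuming the default splunkd field extractions (log_level and component) are available in your environment:

```spl
index=_internal source="*/splunkd.log" log_level=ERROR
| stats count by component
| sort -count
```

This groups logged errors by the component that reported them. It shows only what your own instance has logged, so it is a starting point for investigation rather than a reference for what each message means.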

Why not combine your searches into one? I don't know if it will solve the problem, but simpler is usually better.

source="File1.csv" index="inventory-legacy" (Notes="Succ*" OR Notes="Fail*")
| transaction Description
| rex field=Description "^(?<TaskID>[^-]+).*"
| rex field=Description "^[^-]+-(?<DeviceName>.*)"
| dedup DeviceName, TaskID
| sort -_time, +TaskID, +DeviceName
| table _time, TaskID, DeviceName, Description, Notes
---
If this reply helps you, Karma would be appreciated.

MarkSplunker
Explorer

Hi, Rich. Thanks so much for responding. I guess I could have been clearer with my first question regarding "all Splunk error messages", but I am asking for a listing from Splunk of all error messages that their code generates, what causes each to trigger, and possibly how to fix the underlying cause of the problem. I am not asking how to view errors that have been logged in my system, but rather the meaning of any error message I encounter. Thanks again, especially if you have an answer to that.

As for Question 2 I will try the code you suggested.

richgalloway
SplunkTrust

Thanks for clarifying, Mark. I don't work for Splunk, but I'm pretty sure what you're asking for doesn't exist. I've been part of a lot of software projects, and few of them were documented to the extent you seek. It's not that it can't be done; it's a difficult job in a large product if it hasn't been done and maintained since the early days. Perhaps @ppablo can get an official answer for us.

MarkSplunker
Explorer

Rich, thanks again for your input. We'll see what happens.

ChrisG
Splunk Employee

I can confirm that we do not have a comprehensive error message reference as you describe it. Aside from the difficulty of creating and maintaining such a reference, given the extent of the code base, there are also multiple, varied conditions that can produce any given error message. It is not a simple one-to-one relationship.

With that said, we do have some work underway to improve the content of the error messages themselves, to assist in recognizing the cause and recovering from the error condition. These improvements will be a gradual process.

MarkSplunker
Explorer

Chris, thanks for your answer about error messages. I know some people are using Splunk to review source code, so I'm sure you are doing something similar internally as well. Perhaps that will help pull out the pieces of code that generate error messages so they can be amplified. We'll be interested to see the improvements you speak of.

Do you have any feedback on my more important question? Why is the very same rex query failing when used in a primary/subsearch context, but works fine when used alone in a single query statement?

MarkSplunker
Explorer

Rich, I tested the code you suggested, and it is similar to what I started with originally. It does pull all records, both successes and failures, but it's not quite what I want. The subsearch is meant to first identify Devices associated with a particular TaskID that attempted an action and failed. Once we have that pool of devices, the primary search looks to see which of those devices subsequently ran with a new TaskID that did succeed. This will greatly reduce the events returned and will answer the question I need answered: "Which TaskID (a set of tests run) succeeded after a previous TaskID (different tests) had failed?" Thanks again for your help.
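
One possible explanation for the rex error, offered as an assumption since nothing in this thread confirms it: in the query as posted, the bracketed subsearch directly follows the second rex command with no pipe or command in front of it. The results of a subsearch are substituted into the outer search string as a parenthesized expression along the lines of ( ( DeviceName="..." ) OR ... ), so rex would receive "(" as an extra argument. A minimal sketch that hands the subsearch to an explicit search command instead:

```spl
source="File1.csv" index="inventory-legacy"
| regex Notes="^Succ.*"
| transaction Description
| rex field=Description "^(?<TaskID>[^-]+).*"
| rex field=Description "^[^-]+-(?<DeviceName>.*)"
| search
    [ search source="File1.csv" index="inventory-legacy"
    | regex Notes="^Fail.*"
    | transaction Description
    | rex field=Description "^[^-]+-(?<DeviceName>.*)"
    | dedup DeviceName
    | fields DeviceName ]
| sort -_time, +TaskID, +DeviceName
| table _time, TaskID, DeviceName, Description, Notes
```

This keeps only the successful transactions whose DeviceName also appears among the failures. It does not check that the success happened later than the failure; that would need an additional comparison on _time. Subsearches are also subject to result-count and runtime limits, so a large pool of failed devices may be truncated.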

richgalloway
SplunkTrust

I think this should be a separate posting.
