Why is the wrong value being extracted when using this regular expression?

Communicator

Hi,

I am using a regular expression to extract the word that follows the <result> tag in the raw output. For endpoint 1 the captured value is "s" (incorrect), and for endpoint 2 it is "OK" (correct).

I am using Splunk Enterprise 6.5.1 (build f74036626f0c). The regex was generated with RegexBuddy using the PCRE2 10.21 flavor, the closest available to Splunk; there the correct value is highlighted in both cases.

My inputs, props, transforms, and raw output are below. I would appreciate some help, as I cannot work out where the "s" is being captured from.

inputs.conf

[rest://test]
source = test
auth_type = none
endpoint = http://localhost:8130/test/v1/statuscheck
http_method = GET
index = main
index_error_response_codes = 0
polling_interval = 60
request_timeout = 50
response_type = xml
sequential_mode = 0
sourcetype = url
streaming_request = 0

[rest://test2]
source = test2
auth_type = none
endpoint = http://localhost:8131/test/v1/statuscheck
http_method = GET
index = main
index_error_response_codes = 0
polling_interval = 60
request_timeout = 50
response_type = xml
sequential_mode = 0
sourcetype = url
streaming_request = 0

props.conf

[url]
category = Custom
pulldown_type = 1
disabled = false
TRANSFORMS-url = url_transformation

transforms.conf

[url_transformation]
REGEX = ^.+<result>(?<url_status>\w+).+
FORMAT = url_status::$1
WRITE_META = true

Endpoint 1 raw output :

curl http://localhost:8130/test/v1/statuscheck
<status>
        <result>OK</result>
        <resources/>

Endpoint 2 raw output :

curl http://localhost:8131/test/v1/statuscheck
<status>
<result>OK</result>
<resources><resource name="..." status="OK" /><resource name="..." status="OK" /></resources>
</status>
1 Solution

Communicator

Updated forwarder to 6.5.3 and I no longer have this issue.

Esteemed Legend

Is this how you are testing?

1: Remove ALL the existing (bad) data from the indexers by running a search that pulls it in and piping it to the delete command on the search head.
2: Update your configurations on the Heavy Forwarder and restart splunk there.
3: Run a search and be sure that the data is "still gone".
4: Forward the data in.
5: Run the search again and see new data.

This is important because configuration changes only affect NEWLY FORWARDED events.

Communicator

Whenever I clean events from the indexer, I stop the indexer and clean the data. Steps below:

On the indexer:
1. splunk stop
2. splunk clean eventdata (Yes to clean all indexes)
3. splunk start

On the search head:
1. After the indexer comes back online, search index=* OR index=_* -> 0 events returned

On the forwarder:
1. splunk restart

After a few minutes I run my search again and the events are in. The extracted url_status field for endpoint 1 still shows the value "s", but if I apply the regex directly in search, it works.

Possible bug?

Esteemed Legend

Unlikely. More likely it is a rogue configuration that is bypassing/undoing the fixed configuration that you are deploying. Look everywhere on your indexers and your forwarder for configurations which can create the field called url_status. On *NIX you can do this with this command:

find $SPLUNK_HOME -name "*.conf" -exec grep -l url_status {} \;

You can also use btool.
For example, if you have a broken configuration in $SPLUNK_HOME/etc/apps/MyApp/local but you are deploying your fixed configuration into $SPLUNK_HOME/etc/apps/MyApp/default, then the old/broken/local configuration will be overriding your fixed one. Something like this has to be happening.
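
The find one-liner above is *NIX-only. A rough cross-platform equivalent can be sketched in Python. This is only an illustration, not a Splunk tool: the function name find_conf_files and the throwaway directory layout in the demo are made up for the example.

```python
import os
import tempfile


def find_conf_files(root, needle):
    """Return sorted paths of *.conf files under root whose text contains needle."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".conf"):
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="replace") as handle:
                    if needle in handle.read():
                        hits.append(path)
            except OSError:
                continue  # unreadable file: skip it and keep walking
    return sorted(hits)


# Demo on a temporary tree (hypothetical layout): a local file that mentions
# the field name and a default file that does not.
with tempfile.TemporaryDirectory() as root:
    local_dir = os.path.join(root, "etc", "apps", "MyApp", "local")
    default_dir = os.path.join(root, "etc", "apps", "MyApp", "default")
    os.makedirs(local_dir)
    os.makedirs(default_dir)
    with open(os.path.join(local_dir, "transforms.conf"), "w", encoding="utf-8") as f:
        f.write("[url_transformation]\nFORMAT = url_status::$1\n")
    with open(os.path.join(default_dir, "props.conf"), "w", encoding="utf-8") as f:
        f.write("[url]\nTRANSFORMS-url = url_transformation\n")
    demo_hits = [os.path.relpath(p, root) for p in find_conf_files(root, "url_status")]

print(demo_hits)
```

Against a real installation you would pass the Splunk home directory as root instead of the temporary tree, for example find_conf_files(os.environ["SPLUNK_HOME"], "url_status").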

Communicator

Followed your recommendation, and nothing 🙂

Forwarder :

test@Endpoint :~> cd /appserver/monitoring/splunk/
test@Endpoint :/appserver/monitoring/splunk> find ./ -name "*.conf" -exec grep -l url_status {} \;             
./etc/apps/url_check/local/transforms.conf

test@Endpoint :/appserver/monitoring/splunk> cat ./etc/apps/url_check/local/transforms.conf
[url_transformation]
REGEX = <result>(?<url_status>\w+).+
FORMAT = url_status::$1
WRITE_META = true

test@Endpoint :/appserver/monitoring/splunk> 

Indexer :

test@indexer:/appserver/monitoring/splunk> find ./ -name "*.conf" -exec grep -l url_status {} \;
test@indexer:/appserver/monitoring/splunk> 

Search Head :

test@search:/appserver/monitoring/splunk> find ./ -name "*.conf" -exec grep -l url_status {} \;
test@search:/appserver/monitoring/splunk> 

Is the transformation process logged anywhere?

Esteemed Legend

Get rid of this and try again:

 WRITE_META = true

SplunkTrust

How about this

 REGEX = \<result\>(?<url_status>[^\<]+)\<\/result\>
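
One way to sanity-check this pattern outside Splunk is to run it against the two payloads pasted in the question. The sketch below does that in Python, assuming the payloads are exactly as shown there. Note that Python's re module spells named groups (?P<name>...) rather than (?<name>...), and it is not the PCRE engine Splunk uses, so this only shows what the pattern itself matches. The helper name extract_status is made up for the example.

```python
import re

# Same idea as the suggested REGEX, written in Python's named-group syntax.
PATTERN = re.compile(r"<result>(?P<url_status>[^<]+)</result>")

# Payloads copied from the question (endpoint 1 has leading whitespace).
ENDPOINT_1 = "<status>\n        <result>OK</result>\n        <resources/>"
ENDPOINT_2 = (
    "<status>\n"
    "<result>OK</result>\n"
    '<resources><resource name="..." status="OK" />'
    '<resource name="..." status="OK" /></resources>\n'
    "</status>"
)


def extract_status(raw):
    """Return the text between <result> and </result>, or None if absent."""
    match = PATTERN.search(raw)
    return match.group("url_status") if match else None


for label, raw in (("endpoint 1", ENDPOINT_1), ("endpoint 2", ENDPOINT_2)):
    print(label, repr(extract_status(raw)))
```

Both payloads yield 'OK' here.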

Communicator

It works in search, but after putting it in transforms.conf and clearing my test environment data, I get the same result: the value is "s" for endpoint 1.

SplunkTrust

So, you're keeping the configuration on indexer/heavy forwarder and restarting Splunk on that host?

Communicator

The configuration is kept on the heavy forwarder; I clean the data from the indexer and restart the indexer to get fresh data.

Communicator

I forgot to mention that I am using heavy forwarders, and the inputs, props, and transforms files sit on the forwarder, not on the indexer.

Builder

Try this regex in your transforms.conf:

REGEX = <result>(?<url_status>\w+).+

Super Champion

On regex101.com, .+<result>(?<url_status>\w+).+ seemed to work for both; the caret at the beginning was throwing something off.
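
How a leading ^.+ behaves on a multi-line payload depends on the matching flags, which may be why different testers disagree. The sketch below compares the anchored and unanchored patterns in Python under a few flag settings, assuming the payloads are exactly as pasted in the question. Python's re is not Splunk's PCRE engine and uses (?P<name>...) for named groups, so this shows the patterns' sensitivity to flags, not what Splunk 6.5.1 did internally. The helper name capture is made up for the example.

```python
import re

ANCHORED = r"^.+<result>(?P<url_status>\w+).+"
UNANCHORED = r"<result>(?P<url_status>\w+).+"

# Payloads copied from the question (endpoint 1 has leading whitespace).
ENDPOINT_1 = "<status>\n        <result>OK</result>\n        <resources/>"
ENDPOINT_2 = (
    "<status>\n"
    "<result>OK</result>\n"
    '<resources><resource name="..." status="OK" />'
    '<resource name="..." status="OK" /></resources>\n'
    "</status>"
)


def capture(pattern, raw, flags=0):
    """Return the url_status group for the first match, or None."""
    match = re.search(pattern, raw, flags)
    return match.group("url_status") if match else None


# (endpoint 1 capture, endpoint 2 capture) for the anchored pattern per flag set.
anchored_results = {
    name: (capture(ANCHORED, ENDPOINT_1, flags), capture(ANCHORED, ENDPOINT_2, flags))
    for name, flags in (("default", 0), ("MULTILINE", re.MULTILINE), ("DOTALL", re.DOTALL))
}
unanchored_result = (capture(UNANCHORED, ENDPOINT_1), capture(UNANCHORED, ENDPOINT_2))

for name, pair in anchored_results.items():
    print("anchored", name, pair)
print("unanchored default", unanchored_result)
```

In this sketch the anchored pattern matches neither payload with default flags, only endpoint 1 with MULTILINE, and both with DOTALL; the unanchored pattern captures "OK" for both. None of the combinations captures "s".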

Communicator

On regex101.com I get the same result as with my regex: the correct value is selected. But when it is applied in transforms.conf, the field value is still "s". I tested your regex after clearing my test environment data and got the same result: the value is "s" for endpoint 1.
