Splunk Search

How to edit my regular expression to extract a field and trim out strings with more than X characters (except space) from the value?

pradjswl
Explorer

I have the following events.

event 1)

[08-09-2016_08:00:40.567_PDT] [ERROR] - [ePdv0XVRu2] [xxx@yyy.com] [] [auth] [ResourceAuthenticationFilter] - TATS_SS_TOKEN_ID TOKEN IN SESSION GVtghrUaE%2FIU5H8Lpa%2FcfAhIZvdT7Q1Q%2F4UL3zgnngrOrL97eUYn5e0j8sXk5eN6%2FSQEsVAz066qk%2F1KanQjxreAL%2F4qAbPs5C6K9ZVKWAPENBF%2BC3k0nSDcXFTYw4Ep%2BvAt9HwFbCN9eg1Xj8qG9KfLa0Is%2B9YeGmEiYAH4MQoBmH6Zx6Y%2FStxOMNwsvySruKmdlnMpXeFLrPWbd6iVrCmCvOzIZZaNtyq9trGUAxHaTGbQxTkE8clMWcvUhenkhWxijr2%2F%2FnASvxU9rIrfgkV%2Bnirw2kLKZWf%2BW1e5nNpZ6OE9aZsaSXTYSaIno4RHG8qzwNMtvdykNJLIFCGFAj6Fdt7k8A3%2BSTYY5aircTcONh0u8GOPNuVWCFFc3WUQ DID NOT MATCH WITH COOKIE GVtghrUaE%2FIU5H8Lpa%2FcfAhIZvdT7Q1Q%2F4UL3zgnngrOrL97eUYn5e0j8sXk5eN6%2FSQEsVAz066qk%2F1KanQjxreAL%2F4qAbPs5C6K9ZVKWAPENBF%2BC3k0nSDcXFTYw4Ep%2BvAt9HwFbCN9eg1Xj8qG9KfLa0Is%2B9YeGmEiYAH4MQoBmH6Zx6Y%2FStxOMNwsvySruKmdlnMpXeFLrPWbd6iVrCmCvOzIZZaNtyq9trGUAxHaTGbQxTkE8clMWcvUhenkhWxijr2%2F%2FnASvxU9rIrfgkV%2Bnirw2kLKZWf%2BW1e5nNrTdaX1vVAhzrXBszldYtE5cEm9yffwuivWl6DpoobEqpZnTtfrVa3CEJ7uHqPv%2B1aj9K%2BaJz%2B%2Bc376kG5%2FJcNn PRESEN

event 2)

[08-09-2016_08:00:41.451_PDT] [ERROR] - [ePdv0XVRu2] [xxx@yyy.com [] [unauth] [ResourceReqValidationFilter] - Not Authorized TO Access this URI https:zzz.com

I am using this regular expression to extract the error description:

(?:\].*?){7}\s-\s(?P<a_xf_ErrorDescription>.*)

The field a_xf_ErrorDescription returns a very large value for the 1st event because, as you can see, it contains cookie-related information. In reality, a readable English word will not run to more than 10-15 characters (excluding spaces).

With the current regular expression:

a_xf_ErrorDescription=TATS_SS_TOKEN_ID TOKEN IN SESSION GVtghrUaE%2FIU5H8Lpa%2FcfAhIZvdT7Q1Q%2F4UL3zgnngrOrL97eUYn5e0j8sXk5eN6%2FSQEsVAz066qk%2F1KanQjxreAL%2F4qAbPs5C6K9ZVKWAPENBF%2BC3k0nSDcXFTYw4Ep%2BvAt9HwFbCN9eg1Xj8qG9KfLa0Is%2B9YeGmEiYAH4MQoBmH6Zx6Y%2FStxOMNwsvySruKmdlnMpXeFLrPWbd6iVrCmCvOzIZZaNtyq9trGUAxHaTGbQxTkE8clMWcvUhenkhWxijr2%2F%2FnASvxU9rIrfgkV%2Bnirw2kLKZWf%2BW1e5nNpZ6OE9aZsaSXTYSaIno4RHG8qzwNMtvdykNJLIFCGFAj6Fdt7k8A3%2BSTYY5aircTcONh0u8GOPNuVWCFFc3WUQ DID NOT MATCH WITH COOKIE GVtghrUaE%2FIU5H8Lpa%2FcfAhIZvdT7Q1Q%2F4UL3zgnngrOrL97eUYn5e0j8sXk5eN6%2FSQEsVAz066qk%2F1KanQjxreAL%2F4qAbPs5C6K9ZVKWAPENBF%2BC3k0nSDcXFTYw4Ep%2BvAt9HwFbCN9eg1Xj8qG9KfLa0Is%2B9YeGmEiYAH4MQoBmH6Zx6Y%2FStxOMNwsvySruKmdlnMpXeFLrPWbd6iVrCmCvOzIZZaNtyq9trGUAxHaTGbQxTkE8clMWcvUhenkhWxijr2%2F%2FnASvxU9rIrfgkV%2Bnirw2kLKZWf%2BW1e5nNrTdaX1vVAhzrXBszldYtE5cEm9yffwuivWl6DpoobEqpZnTtfrVa3CEJ7uHqPv%2B1aj9K%2BaJz%2B%2Bc376kG5%2FJcNn PRESEN

Question 1) Is there a way for the field extraction to STOP at, and IGNORE, a word that has more than 15 (or 20) characters, so that the extracted field for event 1 would just have the value:

a_xf_ErrorDescription=TATS_SS_TOKEN_ID TOKEN IN SESSION

Question 2) Is there a way for the field extraction to CONTINUE past, and IGNORE, any word that has more than 15 (or 20) characters, so that the extracted field for event 1 would have the value:

a_xf_ErrorDescription=TATS_SS_TOKEN_ID TOKEN IN SESSION DID NOT MATCH WITH COOKIE PRESEN

I want to trim the extracted field to a meaningful name so that it is easier to create a timechart grouped by common errors.

Thanks for your feedback.

1 Solution

sundareshr
Legend

Try this run-anywhere example:

| makeresults | eval a_xf_ErrorDescription="TATS_SS_TOKEN_ID TOKEN IN SESSION GVtghrUaE%2FIU5H8Lpa%2FcfAhIZvdT7Q1Q%2F4UL3zgnngrOrL97eUYn5e0j8sXk5eN6%2FSQEsVAz066qk%2F1KanQjxreAL%2F4qAbPs5C6K9ZVKWAPENBF%2BC3k0nSDcXFTYw4Ep%2BvAt9HwFbCN9eg1Xj8qG9KfLa0Is%2B9YeGmEiYAH4MQoBmH6Zx6Y%2FStxOMNwsvySruKmdlnMpXeFLrPWbd6iVrCmCvOzIZZaNtyq9trGUAxHaTGbQxTkE8clMWcvUhenkhWxijr2%2F%2FnASvxU9rIrfgkV%2Bnirw2kLKZWf%2BW1e5nNpZ6OE9aZsaSXTYSaIno4RHG8qzwNMtvdykNJLIFCGFAj6Fdt7k8A3%2BSTYY5aircTcONh0u8GOPNuVWCFFc3WUQ DID NOT MATCH WITH COOKIE GVtghrUaE%2FIU5H8Lpa%2FcfAhIZvdT7Q1Q%2F4UL3zgnngrOrL97eUYn5e0j8sXk5eN6%2FSQEsVAz066qk%2F1KanQjxreAL%2F4qAbPs5C6K9ZVKWAPENBF%2BC3k0nSDcXFTYw4Ep%2BvAt9HwFbCN9eg1Xj8qG9KfLa0Is%2B9YeGmEiYAH4MQoBmH6Zx6Y%2FStxOMNwsvySruKmdlnMpXeFLrPWbd6iVrCmCvOzIZZaNtyq9trGUAxHaTGbQxTkE8clMWcvUhenkhWxijr2%2F%2FnASvxU9rIrfgkV%2Bnirw2kLKZWf%2BW1e5nNrTdaX1vVAhzrXBszldYtE5cEm9yffwuivWl6DpoobEqpZnTtfrVa3CEJ7uHqPv%2B1aj9K%2BaJz%2B%2Bc376kG5%2FJcNn PRESEN" | rex field=a_xf_ErrorDescription max_match=0 "\s(?<words>\w{1,10})\s?" | table words | nomv words
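
For the first variant in the question (stop at the first over-long token), one possible approach is to anchor the capture at the start of the field and allow only whitespace-delimited tokens of up to 20 characters. This is a sketch rather than a tested extraction: the field name short_desc and the limit of 20 are assumptions chosen for illustration, and the sample string uses only the first part of the cookie value to keep the example short.

```spl
| makeresults
| eval a_xf_ErrorDescription="TATS_SS_TOKEN_ID TOKEN IN SESSION GVtghrUaE%2FIU5H8Lpa%2FcfAhIZvdT7Q1Q DID NOT MATCH WITH COOKIE GVtghrUaE%2FIU5H8Lpa%2FcfAhIZvdT7Q1Q PRESEN"
| rex field=a_xf_ErrorDescription "^(?<short_desc>(?:\S{1,20}(?:\s+|$))+)"
| eval short_desc=trim(short_desc)
| table short_desc
```

The repeated group accepts a token only if it is followed by whitespace or the end of the field, so matching should stop as soon as a token longer than 20 characters is reached. If the description itself starts with a long token, nothing is captured and short_desc stays empty.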


pradjswl
Explorer

Thank you @sundareshr. If I pipe my original query into makeresults, I get the following error:
Error in 'makeresults' command: This command must be the first command of a search.

Where do I specify the sourcetype and the other parts of the search criteria?


somesoni2
Revered Legend

Remove everything before | rex field=a... and replace it with your original query

sundareshr
Legend

Like this

your base search | rex field=a_xf_ErrorDescription max_match=0 "\s(?<words>\w{1,10})\s?" | table words | nomv words
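
Since the stated goal is a timechart over a single trimmed string, a variant that may be worth trying is to delete the long tokens in place with rex in sed mode instead of collecting the short words. In this sketch the threshold of 21 or more non-space characters (that is, "more than 20") is an assumption taken from the question, and "your base search" remains a placeholder for the real search.

```spl
your base search
| rex field=a_xf_ErrorDescription mode=sed "s/\S{21,}\s*//g"
| eval a_xf_ErrorDescription=trim(a_xf_ErrorDescription)
| timechart count by a_xf_ErrorDescription
```

The sed expression removes each over-long token together with the whitespace that follows it, and trim cleans up a trailing space when the long token was the last one. Shorter tokens such as TATS_SS_TOKEN_ID (16 characters) are left untouched.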

pradjswl
Explorer

What does \w{1,10} do? Does it ignore any word of minimum 1 to maximum 10 characters?


sundareshr
Legend

It captures a run of between 1 and 10 word characters (letters, digits, or underscore). I assume the longest word will be 10 characters and the cookie will be longer than that. You can increase or reduce the 10. Keep the 1.
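
One detail to check with this pattern: \w{1,10} on its own can also match just the first few characters of a longer token, and \w does not match characters such as %. A possible refinement, sketched here under the assumption that a "word" means any whitespace-delimited token of at most 20 characters, uses lookarounds so that only whole tokens are captured. The field name trimmed is hypothetical, and the sample string uses only the first part of the cookie value.

```spl
| makeresults
| eval a_xf_ErrorDescription="TATS_SS_TOKEN_ID TOKEN IN SESSION GVtghrUaE%2FIU5H8Lpa%2FcfAhIZvdT7Q1Q DID NOT MATCH WITH COOKIE GVtghrUaE%2FIU5H8Lpa%2FcfAhIZvdT7Q1Q PRESEN"
| rex field=a_xf_ErrorDescription max_match=0 "(?<!\S)(?<words>\S{1,20})(?!\S)"
| eval trimmed=mvjoin(words, " ")
| table trimmed
```

The lookbehind (?<!\S) and lookahead (?!\S) require whitespace or a field boundary on both sides of the match without consuming it, so adjacent short words are not skipped and a fragment of a long token should never qualify. mvjoin then turns the multivalue words field back into one space-separated string.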

@pradjswl if this works, please accept the answer to close it out.

pradjswl
Explorer

@sundareshr - Done. Sorry, I didn't know about accepting answers. I'm new to the site 🙂


pradjswl
Explorer

great ty @somesoni2 & @sundareshr
