Splunk Search

Value in location field gets truncated when search is run

amahesh3
New Member

Hi,

In my Splunk logs, I have a field called location which stores values like:
SINGAPORE (ABC)
WASHINGTON DC (ABC)
HONG KONG (ABC)
NEW YORK (ABC)
HO CHI MINH CITY VIETNAM (ABC)

But when I run a search with | stats count by location, the table that is displayed is:
SINGAPORE (ABC) 500
WASHINGTON 300
HONG 700
NEW 600
HO 300

As you can see every value except "SINGAPORE (ABC)" is automatically getting truncated as "HONG" or "NEW".
This also has an impact on my dashboard visualization bar chart.

But when I right-click on "NEW" and view events, the logs that are displayed have the whole value "NEW YORK".

I request your help in correcting this issue.

Thanks.


diogofgm
SplunkTrust

A full example of your event would be handy. Depending on your full event data, you can be a bit more precise with the regex. You can anchor on whatever precedes the location name, and since you have parentheses, you can also use them as a boundary for your capture group.

Example:
event text whatever pre location SINGAPORE (ABC) event text
event text other info pre location HO CHI MINH CITY VIETNAM (ABC) event text

Regex:
location\s+(?<location>[\w\s]+\([\w\s]+\))

Explanation:
Both names would be properly extracted, since I bounded my capture group between "location" and a set of "( )" with any words and spaces inside. Any run of word characters a-zA-Z0-9_ ( \w ) or whitespace characters ( \s ) will be captured.
Live test here:
https://regex101.com/r/5lMFCJ/1
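
For anyone who wants to try the pattern outside Splunk, here is a minimal Python sketch that runs it over the two sample lines above. It only illustrates how the capture group behaves: the sample lines are the made-up ones from this post, not real events, and the helper name extract_location is hypothetical. Python's re module spells named groups (?P<location>...), so the group syntax differs slightly from the regex above.

```python
import re

# Same pattern as above, with Python's (?P<name>...) spelling for the named group.
LOCATION_RE = re.compile(r"location\s+(?P<location>[\w\s]+\([\w\s]+\))")

# The two made-up sample lines from this post (not real Splunk events).
EVENTS = [
    "event text whatever pre location SINGAPORE (ABC) event text",
    "event text other info pre location HO CHI MINH CITY VIETNAM (ABC) event text",
]


def extract_location(event):
    """Return the captured location, or None if the pattern does not match."""
    match = LOCATION_RE.search(event)
    return match.group("location") if match else None


for event in EVENTS:
    print(extract_location(event))
```

Both lines should yield the full name including the "(ABC)" part, because the capture only ends at the closing parenthesis.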

Hope this helps!

------------
Hope I was able to help you. If so, some karma would be appreciated.

jnudell_2
Builder

Hello @amahesh3 ,

Your field extraction is not created properly, because it does not appear to take into account locations with spaces in the name. You need to provide an example of some events with locations that have spaces in the name, along with your current extraction configuration; then someone can assist with the proper replacement for the field extraction.

Hope this helps.

amahesh3
New Member

Hi,
Can you please advise on how I can check the field extraction configuration ?

I tried searching around and came across this
(?i)^(?:[^ ]* ){2}(?:[+-]\d+ )?(?P[^ ]*)\s+(?P[^ ]+) - (?P.+)

Please let me know if this is correct and also explain to me how it is accommodating the space in "SINGAPORE (ABC)" and not the space in other location names


keith_d
Explorer

First things first... The regular expression you pasted won't look right to anyone looking at it here because it got eaten by the site's comment formatting engine. To paste anything with unusual characters like stars or greater than or less than symbols in their original, unaltered form, you'll need to surround them with code tags like this:



(?i)^(?:[^ ] ){2}(?:[+-]\d+ )?(?P[^ ])\s+(?P[^ ]+) - (?P.+)


And then every character will appear exactly as it actually is at your end for other viewers, like this:

(?i)^(?:[^ ] *){2}(?:[+-]\d+ )?(?P[^ ])\s+(?P[^ ]+) - (?P.+)
(Neither of my examples here probably matches your real regex, because your version didn't survive the site's formatting engine and I can't reliably guess what the correct regex actually looks like.)

Now, on to your issue.

This is pure speculation, but the regular expression above contains a {2}, which means "match the previous token exactly two times". Look at the examples below:

New York (ABC)
1   2    3
Washington DC (ABC)
1          2  3
Singapore (ABC)
1         2
Hong Kong (ABC)
1    2    3
HO CHI MINH CITY VIETNAM (ABC)
1  2   3    4    5       6

My guess is that your actual regex, which matches "Singapore (ABC) ", would not match "New York (ABC) " or any of your other examples, because those others are a run of non-space characters followed by a space character three or more times, instead of exactly two times.
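
To make the counting argument concrete, here is a small Python sketch. The two patterns are hypothetical stand-ins, not the real extraction (which did not survive the site's formatting): each treats a "token" as a run of non-space characters followed by one space, and the only difference between them is {2} versus {2,}.

```python
import re

# Hypothetical stand-ins for the real extraction: one token is a run of
# non-space characters followed by a single space.
EXACTLY_TWO = re.compile(r"(?:[^ ]+ ){2}")
TWO_OR_MORE = re.compile(r"(?:[^ ]+ ){2,}")

# Location values with a trailing space, as in the explanation above.
SAMPLES = [
    "Singapore (ABC) ",
    "New York (ABC) ",
    "HO CHI MINH CITY VIETNAM (ABC) ",
]


def whole_string_matches(pattern, text):
    """True only if the pattern consumes the entire string."""
    return pattern.fullmatch(text) is not None


for text in SAMPLES:
    print(
        repr(text),
        whole_string_matches(EXACTLY_TWO, text),
        whole_string_matches(TWO_OR_MORE, text),
    )
```

Under these assumptions only "Singapore (ABC) " satisfies the {2} version, while all three satisfy the {2,} version.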

That could be the problem if you let Splunk create the regex for the field extractions and the sample events you selected didn't happen to include any locations with more spaces in their names. Splunk may have done this without you realizing it, because it generally tries to be as specific as possible, based on your sample events, when it creates the extraction regexes for you.

This may or may not solve the issue for you (I can't know without seeing the actual raw events in their actual format and without knowing your actual unaltered regex), but you could try changing the {2} in the regex you found to {2,} instead. Adding the comma without another number after it means "match the previous token 2 or more times" instead of exactly two times, as it currently does without the comma. In regular expressions, {n,m} specifies a range for how many times the previous token should match. So, for example, if you wanted to match at least 3 but not more than 7 times, you would use {3,7}. The possible forms basically mean:


{5} - this is the same as "exactly", or "exactly 5 times", or =5
{5,} - this is the same as "equal to or greater than", or "5 or more times", or >=5
{,5} - this is the same as "less than or equal to", or "5 or fewer times", or <=5 (some regex engines do not accept this form; {0,5} is the portable spelling)
{3,5} - this is the same as "from..to", or "3 to 5 times", or ">=3 and <=5"
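
The four forms can be checked quickly with a short Python sketch. This uses Python's re module, which does accept the {,5} form; whether the engine behind your Splunk version accepts it is something to verify separately. The helper name accepted_lengths is made up for this illustration.

```python
import re


def accepted_lengths(quantifier, max_len=8):
    """Return each n in 0..max_len for which 'a' * n fully matches a<quantifier>."""
    pattern = re.compile("a" + quantifier)
    return [n for n in range(max_len + 1) if pattern.fullmatch("a" * n)]


for quantifier in ("{5}", "{5,}", "{,5}", "{3,5}"):
    print(quantifier, accepted_lengths(quantifier))
```

Each printed list shows which repeat counts the quantifier allows, up to the arbitrary cap of 8 used here.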


amahesh3
New Member

If what you are saying is true, then I should be getting location like
WASHINGTON DC
HO CHI
HONG KONG
NEW YORK

I should be getting 2 words of each location right ?


keith_d
Explorer

That's correct, but one of those "words" is your "(ABC)", so you will only get at most one name for each location based on what I can see and make out of your regex.

Edit: Actually, I just realized that what you're saying is correct, so in that case, I'm not sure what's going on. We'll need some sample raw events to compare with (if there's anything private/sensitive in them, just alter those items but keep the same formatting, i.e., upper case letters stay upper case, lower case letters stay lower case, numbers stay numbers, punctuation stays punctuation - and preferably the same punctuation so the regexes remain clear and answers can be more accurate.)


jnudell_2
Builder

You still have not provided an example of a full event. When you do, I can provide a solution for your issue. If it contains sensitive information, just change the values but keep the formatting.


kmorris_splunk
Splunk Employee

Looks like the extraction is not accounting for spaces. Is this an automatic extraction or is it something you created?


amahesh3
New Member

Hi, I have not created any extraction; it is happening automatically.
Also, the issue is not happening with SINGAPORE (ABC), which also has a space in it.


richgalloway
SplunkTrust

What are the props.conf settings for that sourcetype?

---
If this reply helps you, Karma would be appreciated.