Splunk Search

How to extract area codes from phone number field?

jhillenburg
Path Finder

Hi. I have a series of systems (contact center, fax, Cisco CUCM, etc.) where phone numbers are returned in the data. The phone numbers can be represented by:

  1. 5-digit extension (e.g., x22101)
  2. 10-digit NANP number (e.g., 3125551212)
  3. 11-digit NANP number (e.g., 13125551212)
  4. E.164 number (e.g., +13125551212)

For the moment, I am not concerned about #1 or #4, but I would like to do three things: match only callers that have 10- or 11-digit phone numbers, strip the country code "1" when it shows up, and then keep only the area code for area code mapping.

Has anyone done this?

masonmorales
Influencer

If you are only after the area code for those two scenarios, try this:

rex "1?(?<area_code>\d{3})"

jhillenburg
Path Finder

This is great, except that it matches 5-digit extensions beginning with "1".
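The over-matching described here can be reproduced outside Splunk. The following is a minimal Python sketch, not Splunk itself: it assumes Python's `re` module behaves like the regex engine behind `rex` for this simple pattern, and it spells the named group as `(?P<name>...)` because Python does not accept `(?<name>...)`. The sample values and the `area_code` helper are hypothetical.

```python
import re

# Same pattern as the rex above, rewritten with Python's (?P<name>...) syntax.
# It is unanchored, so it can match anywhere inside the value.
AREA_CODE = re.compile(r"1?(?P<area_code>\d{3})")

def area_code(value):
    """Return the captured area_code, or None when nothing matches."""
    m = AREA_CODE.search(value)
    return m.group("area_code") if m else None

samples = ["x12101", "x22101", "3125551212", "13125551212"]
results = {s: area_code(s) for s in samples}
for s in samples:
    print(s, "->", results[s])
```

Because the pattern only asks for three digits, any extension supplies enough digits to match, and a leading "1" is consumed as if it were a country code.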

masonmorales
Influencer

What if you changed it to:

 rex "1?(?<area_code>\d{3})\d{6}"
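A quick way to see what the extra `\d{6}` changes is to run the pattern over a few sample values. This is a hedged Python sketch under the same assumptions as any off-Splunk test: Python's `re` stands in for the `rex` engine, the group is spelled `(?P<name>...)`, and the sample values and helper name are made up for illustration.

```python
import re

# The rex above with Python's (?P<name>...) spelling; still unanchored.
AREA_CODE = re.compile(r"1?(?P<area_code>\d{3})\d{6}")

def area_code(value):
    """Return the captured area_code, or None when nothing matches."""
    m = AREA_CODE.search(value)
    return m.group("area_code") if m else None

samples = ["x22101", "x12101", "3125551212", "13125551212", "312555121"]
results = {s: area_code(s) for s in samples}
for s in samples:
    print(s, "->", results[s])
```

The 5-digit extensions no longer match because at least nine consecutive digits are required. The last sample shows that a 9-digit string still gets through; requiring seven trailing digits, or anchoring the pattern, would be one way to tighten it.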

jayannah
Builder

You know the country code is always 1 (if present), so either of these should work:

1?(?<ph_num>\d{10})  ==> Returns the 10-digit number whether or not it is prefixed with the country code 1.
\d*(?<ph_num>\d{10})  ==> Returns the last 10 digits of the phone number, whether the country code is 1, 22, 100, or absent.

Accepted solution
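The difference between the two patterns shows up when the prefix is something other than 1. Below is a small Python sketch of that comparison, assuming Python's `re` is a fair stand-in for Splunk's engine here; the constant names, the `ph_num` helper, and the 12-digit sample are hypothetical, and the groups use Python's `(?P<name>...)` spelling.

```python
import re

# First pattern: optionally skip a leading 1, then take ten digits.
STRIP_ONE = re.compile(r"1?(?P<ph_num>\d{10})")
# Second pattern: greedy \d* swallows any prefix, leaving the last ten digits.
LAST_TEN = re.compile(r"\d*(?P<ph_num>\d{10})")

def ph_num(pattern, value):
    """Return the captured ph_num, or None when nothing matches."""
    m = pattern.search(value)
    return m.group("ph_num") if m else None

eleven = ph_num(STRIP_ONE, "13125551212")      # NANP number with country code 1
other_a = ph_num(STRIP_ONE, "223125551212")    # hypothetical two-digit prefix
other_b = ph_num(LAST_TEN, "223125551212")
area = other_b[:3]                             # area code = first three digits
print(eleven, other_a, other_b, area)
```

With a prefix of 22, the first pattern captures the first ten digits it sees, while the second still returns the trailing ten. Taking the first three characters of `ph_num` then gives the area code.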

jhillenburg
Path Finder

Thanks. This seems to be helpful in the same manner as the response from KindaWorking, above.

KindaWorking
Path Finder

If I understand what you are after, this is the regex you can use for the field extractor:

1?(?<NANP>\d{10}\d?)

jhillenburg
Path Finder

What I wanted to do was to pull out the area code. I think another responder below has gotten me a lot closer. However, what you have done has solved a related problem: I have one system where the logs readily provide the telephone number, but another where I need to perform a field extraction, and your regexp might be helpful there. Thanks.

jrmurray
Explorer

Try a calculated field.

Settings -> Fields -> Calculated Fields

if(len(phone_number)==10 or len(phone_number)==11,replace(phone_number, "^1?(\d{3})\d{7}$","\1"),null())

You will have to associate it with your target source, sourcetype, or host, as well as the target app. You might have to adjust the regular expression and the matched field length to include the + sign.
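For anyone who wants to check the logic before creating the calculated field, here is a rough Python equivalent. It is only a sketch: `re.sub` stands in for Splunk's `replace()`, the function name is hypothetical, and returning `None` when the length test fails is an assumption about what the field should hold in that case.

```python
import re

def area_code(phone_number):
    """Mimic the eval: only 10- or 11-character values are rewritten."""
    if len(phone_number) in (10, 11):
        # Like replace(): a value that does not match is returned unchanged.
        return re.sub(r"^1?(\d{3})\d{7}$", r"\1", phone_number)
    return None  # assumption: no value when the length test fails

for value in ["3125551212", "13125551212", "x22101", "+13125551212"]:
    print(value, "->", area_code(value))
```

One thing the sketch makes visible: an 11-character value that does not start with 1 passes the length test but fails the regex, so it comes back unchanged instead of as an area code.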

--
J.R. Murray
MetaNet IVS

jhillenburg
Path Finder

I'm trying to determine what this gets me that the other responses don't. Thanks for the response.

jrmurray
Explorer

This basically says if the phone number is 10 or 11 digits, perform the regular expression match and preserve only the digits you are looking for. In your example, a field extractor would probably work just as well, but this gives you a little more flexibility I think. On second glance, it looks like a character was missing from my initial response, so try this instead:

if(len(phone_number)>=10 and len(phone_number)<=12,replace(phone_number, "^\+?1?(\d{3})\d{7}$","\1"),null())

This also handles the leading + sign. The field extractor version of this regular expression would be something like:

\+?1?(?P<area_code>\d{3})\d{7}
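The `(?P<area_code>...)` spelling is also valid in Python, so the pattern can be tried as-is against the four formats from the question. This is a minimal sketch under stated assumptions: `re.fullmatch` is used to mirror the `^` and `$` anchors of the eval version, and the `AREA_NAMES` table is a hypothetical stand-in for the area code mapping, which in Splunk would more likely be a lookup.

```python
import re

# Field-extractor pattern from above, unchanged.
AREA_CODE = re.compile(r"\+?1?(?P<area_code>\d{3})\d{7}")

# Hypothetical mapping table, for illustration only.
AREA_NAMES = {"312": "Chicago, IL"}

def area_code(value):
    """Return the area code only if the whole value fits the pattern."""
    m = AREA_CODE.fullmatch(value)
    return m.group("area_code") if m else None

samples = ["x22101", "3125551212", "13125551212", "+13125551212"]
results = {s: area_code(s) for s in samples}
for s in samples:
    code = results[s]
    print(s, "->", code, AREA_NAMES.get(code, "unmapped"))
```

Matching the whole value keeps the extension out while accepting the 10-digit, 11-digit, and E.164 forms.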