Splunk Search

Field Extraction recommendations for Avaya call log

New Member

I'm trying to do a field extraction for an Avaya call log. In this particular log format, every character, including spaces, is significant. Take this entry:

#000#000#000031419      102900107    8*902    17145550104     71234            88888 0 0244 71247 0 0000101 #015

The characters (17145550104 71234) occupy positions 33-54 and indicate that a call came in from 1-714-555-0104 and went to the exchange 7-1234. That is 11 characters for the originating number and 10 characters for the destination number; in this case the first five characters of the destination number are spaces.

A similar entry:

#000#000#000031419      132500329     #060              713312155555331         50042424     25331    0    0000316    #015 

In this case the field 713312155555331 contains the originating number, 7-1331, prepended by 6 spaces, followed immediately by the destination number 215-555-5331, and is again in positions 33-54.

Any suggestions/tips on how to do this sort of field extraction would be greatly appreciated.

Thanks,
Mike

0 Karma

Ultra Champion

Hi @dahlberg

I started with this. If you know the field names, maybe you could post them and I'll update my answer; for now I have just called them a, b, c, etc.
https://regex101.com/r/a9Tm6c/1

^(?P<a>[^\s]+)\s+(?P<b>[^\s]+)\s+(?P<c>[^\s]+)\s+(?P<d>[^\s]+)\s+(?P<e>[^\s]+)\s+(?P<f>[^\s]+)?\s+(?P<g>[^\s]+)?\s+(?P<h>[^\s]+)?\s+(?P<i>[^\s]+)\s+(?P<j>[^\s]+)\s+(?P<k>[^\s]+)\s+(?P<l>[^\s]+)

Obviously, not all fields may be present in all logs, so you may need to fiddle with the optional flag (?).

0 Karma

New Member

Yea, you see my problem! This regex correctly returns both the origination number and the destination in the first event. However, with the second event, the regex incorrectly separates the fields because there is no space between the two numbers.
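
The difference between the two events can be reproduced outside Splunk. This is a minimal Python sketch, assuming the number block is laid out exactly as described in the question (11 characters of originating number followed by 10 of destination). The two strings are rebuilt from that description, not copied from a raw log.

```python
import re

# 21-character number blocks rebuilt from the description in the question:
# 11 characters of originating number, then 10 characters of destination.
inbound = "17145550104" + "     71234"     # outside number -> exchange
outbound = "      71331" + "2155555331"    # exchange -> outside number

# Whitespace-delimited matching, in the spirit of the [^\s]+ groups above.
tokens_inbound = re.findall(r"[^\s]+", inbound)
tokens_outbound = re.findall(r"[^\s]+", outbound)

print(tokens_inbound)    # ['17145550104', '71234']
print(tokens_outbound)   # ['713312155555331']
```

With no space between the two numbers in the second block, any whitespace-based pattern sees a single 15-digit token, so the split has to come from character positions rather than delimiters.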

Mike

0 Karma

Revered Legend

I think the character indexes you provided are off, considering your data. Do you know exactly at what position the call data starts, and the exact length of those fields (including spaces)? If you have that, your field extraction would look like this:

^.{N1}(?<originating>.{N2})(?<destination>.{N3})

Where N1 is the number of characters before the value of originating appears in your raw data, N2 is the length of the originating number (e.g. 11), and N3 is the total length of the destination number.
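
To check candidate values of N1, N2 and N3 before putting them into a Splunk extraction, the pattern above can be tried in Python. This is only a sketch: N1 = 32 is an assumption taken from the "positions 33-54" figure in the question, the events are synthetic (filler prefix plus the number blocks described earlier), and `fixed_width_pattern` and `extract` are hypothetical helper names. Note that Python spells named groups `(?P<name>...)`, whereas the pattern above uses `(?<name>...)`.

```python
import re

def fixed_width_pattern(n1: int, n2: int, n3: int) -> "re.Pattern[str]":
    """Build ^.{N1}(?P<originating>.{N2})(?P<destination>.{N3})."""
    return re.compile(
        r"^.{%d}(?P<originating>.{%d})(?P<destination>.{%d})" % (n1, n2, n3)
    )

# Assumed offsets: 32 characters of prefix, then 11 + 10 characters of numbers.
N1, N2, N3 = 32, 11, 10

# Synthetic events: filler prefix, the number block, and some trailing text.
prefix = "x" * N1
inbound = prefix + "17145550104" + "     71234" + " rest"
outbound = prefix + "      71331" + "2155555331" + " rest"

def extract(event: str):
    """Return (originating, destination) with padding stripped, or None."""
    match = fixed_width_pattern(N1, N2, N3).match(event)
    if match is None:
        return None  # event shorter than N1 + N2 + N3 characters
    return (match.group("originating").strip(),
            match.group("destination").strip())

print(extract(inbound))    # ('17145550104', '71234')
print(extract(outbound))   # ('71331', '2155555331')
```

Because the groups are defined by length rather than by delimiters, both event shapes split the same way; the only thing that has to be right is the three counts.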

0 Karma

New Member

Thanks for taking a look at this.

The date starts at position 13 and is 6 chars long, followed by:
spaces - 6 chars
time - 4 chars
duration - 4 chars
condition code - 1 char
code-dial - 4 chars
code used - 4 chars
dialed number - 15 chars
calling number - 10 chars

Therefore at position 23 from the start of the event, the originating number is returned. If it is an exchange then it is prepended with spaces. At position 38 the calling number starts and if it is an exchange it is prepended with spaces. As a result, if a call originates from an exchange and goes to an outside number, there will be one field. If a call comes in from an outside number and goes to an exchange, it will look like two fields, since the second number is prepended by spaces.
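
A layout like the one listed above can be turned into cumulative offsets and checked against sample events. The Python sketch below takes the listed widths at face value and starts at position 13; those numbers do not obviously agree with the other positions quoted in this thread, so `START` and `LAYOUT` are placeholders to adjust against real data. The sample event is synthetic, and the field names and `parse_fixed_width` are hypothetical.

```python
# Field widths taken at face value from the list above (assumed, unverified).
START = 12  # 0-based index of "position 13"
LAYOUT = [
    ("date", 6),
    ("spaces", 6),
    ("time", 4),
    ("duration", 4),
    ("condition_code", 1),
    ("code_dial", 4),
    ("code_used", 4),
    ("dialed_number", 15),
    ("calling_number", 10),
]

def parse_fixed_width(event, start=START, layout=LAYOUT):
    """Slice an event into fields by cumulative width, stripping padding."""
    fields = {}
    pos = start
    for name, width in layout:
        fields[name] = event[pos:pos + width].strip()
        pos += width
    return fields

# Synthetic event built to match LAYOUT exactly (not a real Avaya record).
sample = (
    "#000#000#000"              # 12-character prefix
    + "031419" + " " * 6        # date, spaces
    + "1029" + "0010" + "7"     # time, duration, condition code
    + "   9" + "    "           # code-dial, code used (made-up values)
    + "17145550104".rjust(15)   # dialed number, left-padded with spaces
    + "71234".rjust(10)         # calling number, left-padded with spaces
)

fields = parse_fixed_width(sample)
print(fields["dialed_number"], fields["calling_number"])
```

Once the offsets agree with real events, the same counts can be carried over into a fixed-width regex for the extraction itself.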

Thanks.

Mike

0 Karma