multiple headers

adomila
Explorer

Hi,
I have a couple of comma-separated Cisco log files that are supposed to have different sets of headers (fields). The log files share some common fields, like so:

header#1: Timestamp,RDR_ID,SUBSCRIBER_ID,CLIENT_IP
header#2: Timestamp,RDR_ID,SUBSCRIBER_ID,SKIPPED_SESSIONS,CLIENT_IP

sample data#1:
1361171830137,4042321984,001ffb25b1d1@smartbro.net,192.168.1.1
1361171830473,4042321984,001ffb0f90bb@smartbro.net,192.168.1.2
1361171831107,4042321984,001ffb0f90bb@smartbro.net,192.168.1.3

sample data#2:
1361171830137,4042323000,001ffb25b1d1@smartbro.net,0,192.168.1.1
1361171830473,4042323000,001ffb0f90bb@smartbro.net,1,192.168.1.2
1361171831107,4042323000,001ffb0f90bb@smartbro.net,0,192.168.1.3

my props.conf
[smart_sce_sourcetype]
REPORTS-multi = Transaction_Usage_RDR, Block_RDR

my transforms.conf
[Transaction_Usage_RDR]
REGEX="\W4042323000,"
DELIMS=","
FIELDS="TIMESTAMP","RDR_ID","SUBSCRIBER_ID","CLIENT_IP"

[Block_RDR]
REGEX="\W4042321984,"
DELIMS=","
FIELDS="TIMESTAMP","RDR_ID","SUBSCRIBER_ID","SKIPPED_SESSIONS","CLIENT_IP"

The RDR_ID (2nd column of the actual data) determines which header to use. You'll notice this in my regex. Both sets of sample data are indexed and both headers are generated, but the client_ip data ends up in skipped_sessions. Some of the columns are also missing. I removed the other headers for brevity. Generally speaking, the indexed data is messed up. Kindly advise.


Ayn
Legend

If I understand you correctly, you're trying to create a conditional extraction: when a line matches one regex, one DELIMS-based extraction is applied, and when it matches the other regex, the other extraction is used. It doesn't work that way, and for good reason: which one would Splunk use if both regexes matched?
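To see why the symptom in the question looks the way it does, here is a minimal Python sketch. It assumes the delimiter-based extraction simply assigns field names by position on every line, without any regex guard; that is an assumption made for illustration, not a description of Splunk internals. The helper name by_position is made up.

```python
# Hypothetical illustration: name fields purely by position, the way a
# delimiter-based extraction would if it ran on every line.
FIELDS_4 = ["TIMESTAMP", "RDR_ID", "SUBSCRIBER_ID", "CLIENT_IP"]
FIELDS_5 = ["TIMESTAMP", "RDR_ID", "SUBSCRIBER_ID", "SKIPPED_SESSIONS", "CLIENT_IP"]


def by_position(line, names):
    """Pair each comma-separated value with the field name at the same index."""
    return dict(zip(names, line.split(",")))


# First row of sample data#1 from the question (four columns).
four_col = "1361171830137,4042321984,001ffb25b1d1@smartbro.net,192.168.1.1"

wrong = by_position(four_col, FIELDS_5)  # five names applied to four values
right = by_position(four_col, FIELDS_4)  # matching header

print(wrong)
print(right)
```

With the five-name header applied to a four-column row, the IP address lands in SKIPPED_SESSIONS and CLIENT_IP is missing, which matches what the question describes.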

You can only define one delims-based extraction at a time, so given one sourcetype you can't have multiple extractions like that. What you could do is create two regex-based extractions instead that do the same thing:

[Transaction_Usage_RDR]
REGEX = ^([^,]+),([^,]+),([^,]+),([^,]+)$
FORMAT = TIMESTAMP::$1 RDR_ID::$2 SUBSCRIBER_ID::$3 CLIENT_IP::$4

[Block_RDR]
REGEX = ^([^,]+),([^,]+),([^,]+),([^,]+),([^,]+)$
FORMAT = TIMESTAMP::$1 RDR_ID::$2 SUBSCRIBER_ID::$3 SKIPPED_SESSIONS::$4 CLIENT_IP::$5
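The two patterns can be sanity-checked outside Splunk. The sketch below uses Python's re module as a stand-in for Splunk's regex engine, which should behave the same for patterns this simple. It writes the five-field pattern with a comma between every capture group, and the extract helper is made up for the example. Because each pattern is anchored with ^ and $ and no group can cross a comma, a line can only match the pattern with the right number of columns.

```python
import re

# Four-column and five-column patterns, a comma between every capture group.
FOUR = re.compile(r"^([^,]+),([^,]+),([^,]+),([^,]+)$")
FIVE = re.compile(r"^([^,]+),([^,]+),([^,]+),([^,]+),([^,]+)$")

NAMES_4 = ["TIMESTAMP", "RDR_ID", "SUBSCRIBER_ID", "CLIENT_IP"]
NAMES_5 = ["TIMESTAMP", "RDR_ID", "SUBSCRIBER_ID", "SKIPPED_SESSIONS", "CLIENT_IP"]


def extract(line):
    """Return field name -> value for whichever pattern matches the whole line."""
    for pattern, names in ((FOUR, NAMES_4), (FIVE, NAMES_5)):
        match = pattern.match(line)
        if match:
            return dict(zip(names, match.groups()))
    return {}


# First rows of sample data#1 and sample data#2 from the question.
row4 = "1361171830137,4042321984,001ffb25b1d1@smartbro.net,192.168.1.1"
row5 = "1361171830137,4042323000,001ffb25b1d1@smartbro.net,0,192.168.1.1"

fields4 = extract(row4)
fields5 = extract(row5)
print(fields4)
print(fields5)
```

If the RDR_ID should also decide which extraction applies, the second group could be replaced with the literal ID, but that is a variation not discussed in this thread.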

Ayn
Legend

Just replace the commas with pipes, but you need to escape the pipes used as separators ("\|") because the pipe is a special character (alternation) in regular expressions.
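A quick way to check the pipe version, again using Python's re as a stand-in for Splunk's regex engine. The sample row is made up: it is the first data line from the question with the commas swapped for pipes. Note that only the separator needs the backslash; inside the character class [^|] the pipe is already literal.

```python
import re

# Pipe-delimited variant of the four-field pattern, separators escaped as \|
PIPE_FOUR = re.compile(r"^([^|]+)\|([^|]+)\|([^|]+)\|([^|]+)$")

# Same shape without the escape: the bare | now means "or", so the pattern
# is read as two alternatives instead of two delimited fields.
UNESCAPED = re.compile(r"^([^|]+)|([^|]+)$")

# Made-up sample row (commas from the question's data replaced by pipes).
line = "1361171830137|4042321984|001ffb25b1d1@smartbro.net|192.168.1.1"

fields = PIPE_FOUR.match(line).groups()
partial = UNESCAPED.match(line).group(0)

print(fields)   # all four columns
print(partial)  # only the first column: the first alternative matched
```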


adomila
Explorer

Finally got it working. Many many thanks Ayn ^_^

Just a follow-up question, regarding another project that is similar in nature: if the data delimiter is a pipe character ("|") rather than a comma, would I need to replace the second comma with the pipe character, i.e. ([^,])|([^,])|([^,]*| ...)? Sorry, my regex know-how is a bit shaky. TIA (thanks in advance).


Ayn
Legend

You could just change each + sign to *. + means "1 or more of the preceding" whereas * means "0 or more of the preceding", so a field with nothing in it still matches and the rest of the line is extracted as usual.
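The difference is easy to see on a row with an empty column. This is a sketch using Python's re as a stand-in for Splunk's regex engine; the sample row is made up (the question's first data line with the third column emptied).

```python
import re

# Original style: every field must contain at least one character.
PLUS = re.compile(r"^([^,]+),([^,]+),([^,]+),([^,]+)$")
# Relaxed style: a field may be empty.
STAR = re.compile(r"^([^,]*),([^,]*),([^,]*),([^,]*)$")

# Made-up row with an empty third column (two commas in a row).
line = "1361171830137,4042321984,,192.168.1.1"

plus_match = PLUS.match(line)            # no match: the third group has nothing to consume
star_fields = STAR.match(line).groups()  # matches; the empty column is an empty string

print(plus_match)
print(star_fields)
```

With +, one empty column makes the whole line fail, so none of its fields are extracted; with *, the other columns keep their positions.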


adomila
Explorer

Hi,
This works, except when it encounters a blank field (not a space or other whitespace, just the commas with nothing between them, i.e. null), like so: ,,, data. Then it does not work. The event is still indexed, but some of the fields, even ones that are not blank, are affected and will not be indexed if they fall in the same column as the blank data 😞 I tried adding | (the regex OR character) and then \S to the regex, but it still does not work. Kindly advise.
