Hi,
I have a couple of comma-separated Cisco log files which are supposed to have different sets of headers/fields. The log files have headers like so:
header#1: Timestamp,RDR_ID,SUBSCRIBER_ID,CLIENT_IP
header#2: Timestamp,RDR_ID,SUBSCRIBER_ID,SKIPPED_SESSIONS,CLIENT_IP
sample data#1:
1361171830137,4042321984,001ffb25b1d1@smartbro.net,192.168.1.1
1361171830473,4042321984,001ffb0f90bb@smartbro.net,192.168.1.2
1361171831107,4042321984,001ffb0f90bb@smartbro.net,192.168.1.3
sample data#2:
1361171830137,4042323000,001ffb25b1d1@smartbro.net,0,192.168.1.1
1361171830473,4042323000,001ffb0f90bb@smartbro.net,1,192.168.1.2
1361171831107,4042323000,001ffb0f90bb@smartbro.net,0,192.168.1.3
my props.conf
[smart_sce_sourcetype]
REPORTS-multi = Transaction_Usage_RDR, Block_RDR
my transforms.conf
[Transaction_Usage_RDR]
REGEX="\W4042323000,"
DELIMS=","
FIELDS="TIMESTAMP","RDR_ID","SUBSCRIBER_ID","CLIENT_IP"
[Block_RDR]
REGEX="\W4042321984,"
DELIMS=","
FIELDS="TIMESTAMP","RDR_ID","SUBSCRIBER_ID","SKIPPED_SESSIONS","CLIENT_IP"
The RDR_ID (2nd column of the actual data) determines which header to use; you'll notice this in my regex. The two sample data sets are indexed and both headers are generated, but the client_ip data is going into skipped_sessions, and some of the columns are missing. I removed the other headers for brevity. Generally speaking, the indexed data is messed up. Kindly advise.
If I understand you correctly, you're trying to create a conditional extraction: when a line matches one regex, one delims-based extraction is applied, and if it matches the other regex, the other extraction is used. It doesn't work that way (for good reason - which one would Splunk pick if both regexes matched?).
You can only define one delims-based extraction per sourcetype, so you can't have multiple conditional extractions like that. What you could do instead is create two regex-based extractions that do the same thing:
[Transaction_Usage_RDR]
REGEX = ^([^,]+),([^,]+),([^,]+),([^,]+)$
FORMAT = TIMESTAMP::$1 RDR_ID::$2 SUBSCRIBER_ID::$3 CLIENT_IP::$4
[Block_RDR]
REGEX = ^([^,]+),([^,]+),([^,]+),([^,]+),([^,]+)$
FORMAT = TIMESTAMP::$1 RDR_ID::$2 SUBSCRIBER_ID::$3 SKIPPED_SESSIONS::$4 CLIENT_IP::$5
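If it helps to see why the two extractions can't collide, here's a quick sanity check of the two regexes outside Splunk (a Python sketch purely for illustration - the names four_field_re/five_field_re are mine, not anything Splunk uses):

```python
import re

# The two extraction regexes from the transforms above: one expects
# four comma-separated columns, the other five.
four_field_re = re.compile(r'^([^,]+),([^,]+),([^,]+),([^,]+)$')
five_field_re = re.compile(r'^([^,]+),([^,]+),([^,]+),([^,]+),([^,]+)$')

row4 = "1361171830137,4042321984,001ffb25b1d1@smartbro.net,192.168.1.1"
row5 = "1361171830137,4042323000,001ffb25b1d1@smartbro.net,0,192.168.1.1"

# Because both regexes are anchored with ^...$ and [^,]+ can't span a comma,
# each one only matches rows with exactly its number of columns - so the two
# extractions can never both fire on the same line.
print(four_field_re.match(row4).groups())  # four captured fields
print(five_field_re.match(row5).groups())  # five captured fields
print(four_field_re.match(row5))           # None
print(five_field_re.match(row4))           # None
```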
Just replace the commas with pipes - but you need to escape the pipes ("\|") because pipes are special characters in regular expressions.
Finally got it working. Many many thanks Ayn ^_^
Just a follow-up question, though this is for another project, similar in nature: if the data delimiter is a pipe character ("|") rather than a comma, would I need to replace the separator commas in the regex with that pipe character, i.e. ([^,]+)|([^,]+)|([^,]+)| ...? Sorry, my regex know-how is a bit messy. tia (thanks in advance)
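As noted above, the separator pipes need escaping ("\|") because an unescaped "|" means alternation in a regex; inside a character class like [^|] a pipe is already literal, so no escape is needed there. A minimal sketch of the pipe-delimited variant (Python just for illustration, with a made-up sample line):

```python
import re

# Separator pipes are escaped (\|); the pipe inside [^|] is literal as-is.
pipe_re = re.compile(r'^([^|]+)\|([^|]+)\|([^|]+)\|([^|]+)$')

line = "1361171830137|4042321984|001ffb25b1d1@smartbro.net|192.168.1.1"
print(pipe_re.match(line).groups())
```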
You could just change the + sign to *. + means "1 or more of the preceding" whereas * means "0 or more of the preceding", so if there's no match at all it should work fine anyway.
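To make the + vs * difference concrete, here's a small check (a Python sketch, not Splunk itself) on a row with an empty middle column:

```python
import re

# [^,]+ needs at least one character between delimiters, so an empty
# column kills the whole anchored match; [^,]* also accepts the empty
# string, so rows like "a,,c" still extract.
plus_re = re.compile(r'^([^,]+),([^,]+),([^,]+)$')
star_re = re.compile(r'^([^,]*),([^,]*),([^,]*)$')

line = "1361171830137,,192.168.1.1"  # middle column is blank
print(plus_re.match(line))           # None - the + version fails outright
print(star_re.match(line).groups())  # ('1361171830137', '', '192.168.1.1')
```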
Hi,
This works, except when it encounters a blank field - not a space or whitespace, just consecutive commas (null), like ,,, in the data. It will still index, but some of the fields, although not blank themselves, are affected and don't get extracted if they fall in the same column as the blank data 😞 I tried to include | (the OR regex character) and then \S in the regex, but it's still not working. Kindly advise.