Discarding events using TRANSFORMS-null

AaronAltonKinro
Path Finder

I'm trying to bring in Cisco CDR files for some very basic Splunk searches. The standard CDR format has a header row, then a "datatype" row, then the actual data. The first two rows look something like this:

"cdrRecordType","globalCallID_callManagerId","globalCallID_callId","origLegCallIdentifier","dateTimeOrigination","origNodeId","origSpan","origIpAddr","callingPartyNumber","callingPartyUnicodeLoginUserID","origCause_location","origCause_value","origPrecedenceLevel","origMediaTransportAddress_IP","origMediaTransportAddress_Port","origMediaCap_payloadCapability","origMediaCap_maxFramesPerPacket","origMediaCap_g723BitRate","origVideoCap_Codec","origVideoCap_Bandwidth","origVideoCap_Resolution","origVideoTransportAddress_IP","origVideoTransportAddress_Port","origRSVPAudioStat","origRSVPVideoStat","destLegIdentifier","destNodeId","destSpan","destIpAddr","originalCalledPartyNumber","finalCalledPartyNumber","finalCalledPartyUnicodeLoginUserID","destCause_location","destCause_value","destPrecedenceLevel","destMediaTransportAddress_IP","destMediaTransportAddress_Port","destMediaCap_payloadCapability","destMediaCap_maxFramesPerPacket","destMediaCap_g723BitRate","destVideoCap_Codec","destVideoCap_Bandwidth","destVideoCap_Resolution","destVideoTransportAddress_IP","destVideoTransportAddress_Port","destRSVPAudioStat","destRSVPVideoStat","dateTimeConnect","dateTimeDisconnect","lastRedirectDn","pkid","originalCalledPartyNumberPartition","callingPartyNumberPartition","finalCalledPartyNumberPartition","lastRedirectDnPartition","duration","origDeviceName","destDeviceName","origCallTerminationOnBehalfOf","destCallTerminationOnBehalfOf","origCalledPartyRedirectOnBehalfOf","lastRedirectRedirectOnBehalfOf","origCalledPartyRedirectReason","lastRedirectRedirectReason","destConversationId","globalCallId_ClusterID","joinOnBehalfOf","comment","authCodeDescription","authorizationLevel","clientMatterCode","origDTMFMethod","destDTMFMethod","callSecuredStatus","origConversationId","origMediaCap_Bandwidth","destMediaCap_Bandwidth","authorizationCodeValue","outpulsedCallingPartyNumber","outpulsedCalledPartyNumber","origIpv4v6Addr","destIpv4v6Addr","origVideoCap_Codec_Channel2","origVideoCap_Bandwidt
h_Channel2","origVideoCap_Resolution_Channel2","origVideoTransportAddress_IP_Channel2","origVideoTransportAddress_Port_Channel2","origVideoChannel_Role_Channel2","destVideoCap_Codec_Channel2","destVideoCap_Bandwidth_Channel2","destVideoCap_Resolution_Channel2","destVideoTransportAddress_IP_Channel2","destVideoTransportAddress_Port_Channel2","destVideoChannel_Role_Channel2","IncomingProtocolID","IncomingProtocolCallRef","OutgoingProtocolID","OutgoingProtocolCallRef","currentRoutingReason","origRoutingReason","lastRedirectingRoutingReason","huntPilotPartition","huntPilotDN","calledPartyPatternUsage","IncomingICID","IncomingOrigIOI","IncomingTermIOI","OutgoingICID","OutgoingOrigIOI","OutgoingTermIOI","outpulsedOriginalCalledPartyNumber","outpulsedLastRedirectingNumber","wasCallQueued","totalWaitTimeInQueue","callingPartyNumber_uri","originalCalledPartyNumber_uri","finalCalledPartyNumber_uri","lastRedirectDn_uri","mobileCallingPartyNumber","finalMobileCalledPartyNumber","origMobileDeviceName","destMobileDeviceName","origMobileCallDuration","destMobileCallDuration","mobileCallType","originalCalledPartyPattern","finalCalledPartyPattern","lastRedirectingPartyPattern","huntPilotPattern"
INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,VARCHAR(50),VARCHAR(128),INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,VARCHAR(64),VARCHAR(64),INTEGER,INTEGER,INTEGER,INTEGER,VARCHAR(50),VARCHAR(50),VARCHAR(128),INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,VARCHAR(64),VARCHAR(64),INTEGER,INTEGER,VARCHAR(50),UNIQUEIDENTIFIER,VARCHAR(50),VARCHAR(50),VARCHAR(50),VARCHAR(50),INTEGER,VARCHAR(129),VARCHAR(129),INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,VARCHAR(50),INTEGER,VARCHAR(2048),VARCHAR(50),INTEGER,VARCHAR(32),INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,VARCHAR(32),VARCHAR(50),VARCHAR(50),VARCHAR(64),VARCHAR(64),INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,VARCHAR(32),INTEGER,VARCHAR(32),INTEGER,INTEGER,INTEGER,VARCHAR(50),VARCHAR(50),INTEGER,VARCHAR(50),VARCHAR(50),VARCHAR(50),VARCHAR(50),VARCHAR(50),VARCHAR(50),VARCHAR(50),VARCHAR(50),INTEGER,INTEGER,VARCHAR(255),VARCHAR(255),VARCHAR(255),VARCHAR(255),VARCHAR(50),VARCHAR(50),VARCHAR(129),VARCHAR(129),INTEGER,INTEGER,INTEGER,VARCHAR(50),VARCHAR(50),VARCHAR(50),VARCHAR(50)

I'm trying to discard that second row via the method listed on Splunk's "Route and Filter Data" article, but for some reason it isn't working (which is to say, the second row is being indexed). I suspect a problem with the regex in transforms.conf, but I'm really not sure. Here's what the relevant config files look like:

inputs.conf:
[monitor://C:\Cisco_CDR\*\cdr*]
disabled = false
host_segment = 2
index = cisco_cdr
sourcetype = CiscoCDR

transforms.conf:
[setnull]
REGEX = ^INTEGER.*
DEST_KEY = queue
FORMAT = nullQueue

props.conf:
[CiscoCDR]
HEADER_FIELD_LINE_NUMBER = 1
INDEXED_EXTRACTIONS = csv
KV_MODE = none
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = false
TIMESTAMP_FIELDS = dateTimeOrigination
TIME_FORMAT = %s
category = Structured
description = Cisco Call Detail Record format
disabled = false
pulldown_type = true

[source::C:\Cisco_CDR\*\cdr*]

TRANSFORMS-null = setnull

Any help would be appreciated.

1 Solution

AaronAltonKinro
Path Finder

Well, this is pretty silly. Despite the fact that I had manually deleted and recreated the transforms.conf file more than once, the ACL on the file did not have an entry for "Local System". Splunk wasn't parsing the file at all.

I was thrown off by the fact that the btool diagnostics looked OK, but they were running under the context of my own user account (which was in the ACL).

Thanks to everyone for your help.
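
For anyone hitting the same symptom, the ACL can be inspected with the Windows `icacls` tool. Below is a minimal Python sketch, assuming the Splunk service runs as Local System and that `icacls` reports that account as `NT AUTHORITY\SYSTEM` (true on English-language Windows; the name may differ elsewhere). The function names and the sample output, including the `MYLAPTOP\aaron` account, are hypothetical.

```python
import subprocess

# Assumption: the Splunk service runs as Local System, which icacls lists
# as "NT AUTHORITY\SYSTEM" on English-language Windows.
SERVICE_ACCOUNT = r"NT AUTHORITY\SYSTEM"

def acl_lists_account(icacls_output: str, account: str = SERVICE_ACCOUNT) -> bool:
    """Return True if any line of icacls output has an entry for the account."""
    needle = account.lower() + ":"
    return any(needle in line.lower() for line in icacls_output.splitlines())

def check_file(path: str) -> bool:
    """Run icacls on a file (Windows only) and look for the service account."""
    result = subprocess.run(["icacls", path], capture_output=True, text=True, check=True)
    return acl_lists_account(result.stdout)

# Hypothetical icacls output for a file that is missing the SYSTEM entry.
sample_missing = (
    r"C:\Program Files\Splunk\etc\apps\CiscoCDR\local\transforms.conf "
    r"BUILTIN\Administrators:(F)" "\n"
    r"    MYLAPTOP\aaron:(F)" "\n"
)

print(acl_lists_account(sample_missing))  # False
```

On a real system, `check_file` would be pointed at the app's `transforms.conf`. A `False` result would suggest the service account cannot read the file even though an interactive user, and therefore btool, can.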

maciep
Champion

Glad you were able to resolve and glad it was something silly...thanks for following up!

maciep
Champion

I think your regex is probably fine. And I'm assuming once indexed you verified that the line break works - meaning, the entire second row is one event?

If so, I'm guessing you're never getting to the stanza in transforms. And I'm wondering if it's because you need to double up on your backslashes in your source stanza. This is from the props.conf spec file on docs.

**Considerations for Windows file paths:**

When you specify Windows-based file paths as part of a [source::<source>]
stanza, you must escape any backslashes contained within the specified file
path.

Example: [source::c:\\path_to\\file.txt]

**[<spec>] stanza patterns:**

And you could also probably just call your transforms from the sourcetype stanza too. Not sure if you needed to call it in a source stanza or if that was just from following the docs article you referenced.
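
The backslash rule quoted from the spec file can be illustrated with a small helper. This is only a sketch: `source_stanza` is a made-up name, and it does nothing more than double each backslash in the path pattern.

```python
def source_stanza(path_pattern: str) -> str:
    """Build a [source::...] stanza header with backslashes doubled."""
    escaped = path_pattern.replace("\\", "\\\\")
    return f"[source::{escaped}]"

# The monitor path from inputs.conf in the question.
print(source_stanza(r"C:\Cisco_CDR\*\cdr*"))
# [source::C:\\Cisco_CDR\\*\\cdr*]
```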

AaronAltonKinro
Path Finder

Oh, I didn't know it could be part of the sourcetype stanza. I tried that, too - but still no luck.

In short:

  • Confirmed that the event line breaks are working properly - the entire second row is in one event.
  • Modified props.conf to look like this:

[CiscoCDR]
HEADER_FIELD_LINE_NUMBER = 1
INDEXED_EXTRACTIONS = csv
KV_MODE = none
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = false
TIMESTAMP_FIELDS = dateTimeOrigination
TIME_FORMAT = %s
category = Structured
description = Cisco Call Detail Record format
disabled = false
pulldown_type = true
TRANSFORMS-null = setnull

I stopped Splunk, cleaned the index, and started it again. The second row is still getting indexed. I also tried escaping the backslashes, as you suggested, with no difference. That leads me to believe there may be something wrong with my regex after all, although it checks out just fine on regexr. I've tried the following:

REGEX = ^INTEGER.*
REGEX = INTEGER,INTEGER

(separately, of course)
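
Both patterns can also be checked outside Splunk against the two non-data rows. This is a minimal sketch, assuming Python's `re` module behaves like Splunk's PCRE engine for patterns this simple. The rows are abbreviated from the sample in the question, and `would_be_dropped` is a made-up helper name.

```python
import re

# Abbreviated copies of the first two rows of the CDR file (truncated for the example).
header_row = '"cdrRecordType","globalCallID_callManagerId","globalCallID_callId"'
datatype_row = "INTEGER,INTEGER,INTEGER,INTEGER,VARCHAR(50),VARCHAR(128)"

# The two patterns tried in transforms.conf.
patterns = [r"^INTEGER.*", r"INTEGER,INTEGER"]

def would_be_dropped(pattern: str, event: str) -> bool:
    """Return True if the pattern matches anywhere in the event text."""
    return re.search(pattern, event) is not None

for p in patterns:
    print(p, "| header:", would_be_dropped(p, header_row),
          "| datatype:", would_be_dropped(p, datatype_row))
```

Each pattern matches the datatype row and leaves the header row alone, which is consistent with the regex itself not being the problem.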

maciep
Champion

So if you're still in the testing phase, maybe set REGEX to "." (no quotes). If everything gets discarded, then at least we know we made it to the stanza and it sent info to the null queue as expected. If not, then either the stanza in transforms is wrong or we never get there.

AaronAltonKinro
Path Finder

Good suggestion. Just tried that - it's not working, either (all events are still being indexed). Now, I know that the file is being parsed, because I accidentally used REGEX = * instead of REGEX = . the first time, and the configuration file check threw an error at startup.

Interesting, and perhaps related - I was also working to add a lookup stanza into transforms.conf. Here's the stanza:
[CDRCauseCodeLookup]
filename=causecodes.csv

And then in props, under the sourcetype definition:
lookup_table = CDRCauseCodeLookup cause_code AS destCause_value OUTPUT cause_description AS destCause_description

But when I restarted and launched the search, I got "The lookup table CDRCauseCodeLookup does not exist".

On a hunch, I rewrote my props.conf line as:
lookup_table = causecodes.csv cause_code AS destCause_value OUTPUT cause_description AS destCause_description

And it works. Could there be a similar issue with the [setnull] stanza?

maciep
Champion

Yeah, I feel like there's something odd happening here. Where are you creating your props and transforms? Could you include them both in their entirety?

And you have just the one server acting as the forwarder, indexer and search head? And all of the config is being done on that server?

AaronAltonKinro
Path Finder

They're being done in a custom app, so %SPLUNK_HOME%\etc\apps\MyApp\local. My "test box" is my laptop, so all roles are running on the same machine. I figured I'd get it working in that simple configuration before I figure out how to properly deploy it to our production environment via the deployment server.

Here's the full text of the files in local.

transforms.conf:

[setnull]
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue

[CDRCauseCodeLookup]
filename=causecodes.csv

props.conf:

 [CiscoCDR]
    HEADER_FIELD_LINE_NUMBER = 1
    INDEXED_EXTRACTIONS = csv
    KV_MODE = none
    NO_BINARY_CHECK = true
    SHOULD_LINEMERGE = false
    TIMESTAMP_FIELDS = dateTimeOrigination
    TIME_FORMAT = %s
    category = Structured
    description = Cisco Call Detail Record format
    disabled = false
    pulldown_type = true
    EVAL-dateTimeConnect = if(dateTimeConnect=0,null,strftime(dateTimeConnect,"%m/%d/%y %H:%M:%S"))
    EVAL-dateTimeOrigination =  if(dateTimeOrigination=0,null,strftime(dateTimeOrigination,"%m/%d/%y %H:%M:%S"))
    eval-dateTimeDisconnect = if(dateTimeDisconnect=0,null,strftime(dateTimeDisconnect,"%m/%d/%y %H:%M:%S"))
    lookup_destCause = causecodes.csv cause_code AS destCause_value OUTPUT cause_description AS destCause_description
    lookup_origCause = causecodes.csv cause_code AS origCause_value OUTPUT cause_description AS origCause_description
    TRANSFORMS-null = setnull

inputs.conf:
    [monitor://C:\Cisco_CDR\*\cdr*]
    disabled = false
    host_segment = 2
    index = cisco_cdr
    sourcetype = CiscoCDR

There's also an app.conf file, but it seems to be pretty much empty:

[ui]

[launcher]

AaronAltonKinro
Path Finder

That inputs.conf is a separate file of course, but the formatting buggered up on me.

maciep
Champion

Nothing is jumping out at me here. Just to verify, your lookups and evals from props actually work - you see them working in the web?

What if you run btool for just those stanzas in transforms?

splunk btool transforms list setnull --debug

It might be worth just trying to create a new transforms.conf and typing everything in again. I know that's a long shot, but why not at this point.

AaronAltonKinro
Path Finder

Yep, the props work as they should, and so do the lookups.

Here's the btool output. I recreated the file from scratch, and the btool output was an exact match:
c:\Program Files\Splunk\etc\apps\CiscoCDR\local\transforms.conf [setnull]
c:\Program Files\Splunk\etc\system\default\transforms.conf CAN_OPTIMIZE = True
c:\Program Files\Splunk\etc\system\default\transforms.conf CLEAN_KEYS = True
c:\Program Files\Splunk\etc\system\default\transforms.conf DEFAULT_VALUE =
c:\Program Files\Splunk\etc\apps\CiscoCDR\local\transforms.conf DEST_KEY = queue
c:\Program Files\Splunk\etc\apps\CiscoCDR\local\transforms.conf FORMAT = nullQueue
c:\Program Files\Splunk\etc\system\default\transforms.conf KEEP_EMPTY_VALS = False
c:\Program Files\Splunk\etc\system\default\transforms.conf LOOKAHEAD = 4096
c:\Program Files\Splunk\etc\system\default\transforms.conf MV_ADD = False
c:\Program Files\Splunk\etc\apps\CiscoCDR\local\transforms.conf REGEX = .
c:\Program Files\Splunk\etc\system\default\transforms.conf SOURCE_KEY = _raw
c:\Program Files\Splunk\etc\system\default\transforms.conf WRITE_META = False
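
The `--debug` output is easier to read when grouped by the file that supplied each setting. Below is a minimal Python sketch, assuming every line has the shape shown above (a path ending in `.conf`, a space, then a stanza header or a `key = value` pair). `group_by_file` is a made-up name, and the sample reuses a few lines from the output above.

```python
from collections import defaultdict

def group_by_file(btool_debug_output: str) -> dict:
    """Map each .conf file path to the settings (or stanza headers) it supplied."""
    by_file = defaultdict(list)
    for line in btool_debug_output.splitlines():
        # Paths may contain spaces, so split on the ".conf" suffix, not on whitespace.
        path, sep, setting = line.strip().partition(".conf ")
        if sep:
            by_file[path + ".conf"].append(setting.strip())
    return dict(by_file)

sample = "\n".join([
    r"c:\Program Files\Splunk\etc\apps\CiscoCDR\local\transforms.conf [setnull]",
    r"c:\Program Files\Splunk\etc\system\default\transforms.conf LOOKAHEAD = 4096",
    r"c:\Program Files\Splunk\etc\apps\CiscoCDR\local\transforms.conf DEST_KEY = queue",
    r"c:\Program Files\Splunk\etc\apps\CiscoCDR\local\transforms.conf FORMAT = nullQueue",
    r"c:\Program Files\Splunk\etc\apps\CiscoCDR\local\transforms.conf REGEX = .",
])

for path, settings in group_by_file(sample).items():
    print(path)
    for s in settings:
        print("   ", s)
```

Note that btool reads the files as whichever user runs it, so a clean result here does not by itself show that the Splunk service account can read the same files.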

Is this the kind of thing that support would look at? We do have an enterprise support contract. You've helped confirm that the configuration is correct, and this is looking more and more like a feature that isn't working as designed.

maciep
Champion

Yep, support would be able to help with that. That's probably the right place to go now.

Once you get this resolved, just be sure to update this question with the final answer.

AaronAltonKinro
Path Finder

Alright, thanks for your help maciep!

somesoni2
SplunkTrust

Try this in transforms.conf. Also, I'm assuming the props.conf and transforms.conf files are on the indexer or heavy forwarder and that you restart Splunk whenever they change.

transforms.conf:
[setnull]
REGEX = INTEGER,INTEGER
DEST_KEY = queue
FORMAT = nullQueue

AaronAltonKinro
Path Finder

Thanks for the suggestion. I just tried this, but unfortunately the result was no different.

I'm doing the work on a single-server test instance, and between each test I'm stopping Splunk, cleaning the index, and starting it back up again.

Do you know if it's possible to see when a given transforms.conf stanza is being invoked?

Not sure if it helps, but here's part of the output when I run >splunk cmd btool transforms list

.....
[setnull]
CAN_OPTIMIZE = True
CLEAN_KEYS = True
DEFAULT_VALUE =
DEST_KEY = queue
FORMAT = nullQueue
KEEP_EMPTY_VALS = False
LOOKAHEAD = 4096
MV_ADD = False
REGEX = INTEGER,INTEGER
SOURCE_KEY = _raw
WRITE_META = False
.....
