Getting Data In

CSV with custom quoting

anthonysomerset
Path Finder

Hi There

i have a CSV/UDR without headers with following example rows

session_start,0    ,0    ,2017-03-07 20:00:50 +0200     ,                              ,172.99.99.0~86056588,labtest             ,172.99.99.0       ,       [0],         0,         0,         0,         0
usage_start  ,0    ,0    ,2017-03-07 20:25:37 +0200     ,2017-03-07 20:25:47 +0200     ,172.99.99.0~86056588,labtest             ,172.99.99.0       ,   [27770],     20549,     18187,      2362,         0
usage_int    ,0    ,3    ,2017-03-07 20:33:05 +0200     ,2017-03-07 20:41:54 +0200     ,172.99.99.0~86056588,labtest             ,172.99.99.0       ,   [15457],     54450,     36051,     18399,         0
usage_stop   ,0    ,5    ,2017-03-07 20:46:23 +0200     ,2017-03-07 20:46:23 +0200     ,172.99.99.0~86056588,labtest             ,172.99.99.0       ,    [6322],         0,         0,         0,         0
session_stop ,0    ,59   ,2017-03-07 20:00:50 +0200     ,2017-03-07 20:59:32 +0200     ,172.99.99.0~86056588,labtest             ,172.99.99.0       ,       [0],         0,         0,         0,         0

currently i have a props that looks like this:

[sde_rg_udr]
SHOULD_LINEMERGE=false
KV_MODE = NONE
TIME_FORMAT = %Y-%m-%d %H:%M:%S %z
TIME_PREFIX=(?:.*?,){3}
MAX_TIMESTAMP_LOOKAHEAD = 26
INDEXED_EXTRACTIONS = CSV
FIELD_NAMES = RecordType,RecordStatus,RecordNumber,StartTime,EndTime,AcctSessionId,SubscriberId,FramedIp,[ServiceId],TotalBytes,RxBytes,TxBytes,Time
TZ = Africa/Harare

Unfortunately the ServiceID field is encapsulated in [] braces and i just need the id without the braces - how can i change my props to extract the field at index time without the braces?

0 Karma
1 Solution

anthonysomerset
Path Finder

So i tried @cusello approach of the SEDCMD but it didn't do what i wanted it to 😞

i Ended up down the regex extraction route mainly because of all the additional spaces used for padding - for the benefit of others my final props config was

[sde_rg_udr]
SHOULD_LINEMERGE=false
KV_MODE = NONE
TIME_FORMAT = %Y-%m-%d %H:%M:%S %z
TIME_PREFIX=(?:.*?,){3}
MAX_TIMESTAMP_LOOKAHEAD = 26
EXTRACT-sde_rg_udr = ^(?P<RecordType>[^,]*)\s*,(?P<RecordStatus>[^,]*)\s*,(?P<RecordNumber>[^,]*)\s*,(?P<StartTime>\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\s\+\d{4})\s*,(?P<EndTime>\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\s\+\d{4})\s*,(?P<AcctSessionId>[^,]*)\s*,(?P<SubscriberId>[^,]*)\s*,(?P<FramedIp>[^,]*)\s*,\s*\[(?P<ServiceId>[^,]*)\],\s*(?P<TotalBytes>[^,]*),\s*(?P<RxBytes>[^,]*),\s*(?P<TxBytes>[^,]*),\s*(?P<Time>[^,]*)

View solution in original post

0 Karma

anthonysomerset
Path Finder

So i tried @cusello approach of the SEDCMD but it didn't do what i wanted it to 😞

i Ended up down the regex extraction route mainly because of all the additional spaces used for padding - for the benefit of others my final props config was

[sde_rg_udr]
SHOULD_LINEMERGE=false
KV_MODE = NONE
TIME_FORMAT = %Y-%m-%d %H:%M:%S %z
TIME_PREFIX=(?:.*?,){3}
MAX_TIMESTAMP_LOOKAHEAD = 26
EXTRACT-sde_rg_udr = ^(?P<RecordType>[^,]*)\s*,(?P<RecordStatus>[^,]*)\s*,(?P<RecordNumber>[^,]*)\s*,(?P<StartTime>\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\s\+\d{4})\s*,(?P<EndTime>\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\s\+\d{4})\s*,(?P<AcctSessionId>[^,]*)\s*,(?P<SubscriberId>[^,]*)\s*,(?P<FramedIp>[^,]*)\s*,\s*\[(?P<ServiceId>[^,]*)\],\s*(?P<TotalBytes>[^,]*),\s*(?P<RxBytes>[^,]*),\s*(?P<TxBytes>[^,]*),\s*(?P<Time>[^,]*)
0 Karma

gcusello
SplunkTrust
SplunkTrust

Hi anthonysomerset,
try to insert in your props.conf file the following line to change ",[" and "]," with commas:

SEDCMD-drop1 = s/,\[/,/g
SEDCMD-drop2 = s/\],/,/g

Bye.
Giuseppe

0 Karma

anthonysomerset
Path Finder

So i tried this approach and it did the job in the raw event data however the field still had the braces in 😞

0 Karma
Get Updates on the Splunk Community!

Detecting Remote Code Executions With the Splunk Threat Research Team

WATCH NOWRemote code execution (RCE) vulnerabilities pose a significant risk to organizations. If exploited, ...

Enter the Splunk Community Dashboard Challenge for Your Chance to Win!

The Splunk Community Dashboard Challenge is underway! This is your chance to showcase your skills in creating ...

.conf24 | Session Scheduler is Live!!

.conf24 is happening June 11 - 14 in Las Vegas, and we are thrilled to announce that the conference catalog ...