Getting Data In

CSV with custom quoting

anthonysomerset
Path Finder

Hi There

i have a CSV/UDR without headers with following example rows

session_start,0    ,0    ,2017-03-07 20:00:50 +0200     ,                              ,172.99.99.0~86056588,labtest             ,172.99.99.0       ,       [0],         0,         0,         0,         0
usage_start  ,0    ,0    ,2017-03-07 20:25:37 +0200     ,2017-03-07 20:25:47 +0200     ,172.99.99.0~86056588,labtest             ,172.99.99.0       ,   [27770],     20549,     18187,      2362,         0
usage_int    ,0    ,3    ,2017-03-07 20:33:05 +0200     ,2017-03-07 20:41:54 +0200     ,172.99.99.0~86056588,labtest             ,172.99.99.0       ,   [15457],     54450,     36051,     18399,         0
usage_stop   ,0    ,5    ,2017-03-07 20:46:23 +0200     ,2017-03-07 20:46:23 +0200     ,172.99.99.0~86056588,labtest             ,172.99.99.0       ,    [6322],         0,         0,         0,         0
session_stop ,0    ,59   ,2017-03-07 20:00:50 +0200     ,2017-03-07 20:59:32 +0200     ,172.99.99.0~86056588,labtest             ,172.99.99.0       ,       [0],         0,         0,         0,         0

currently i have a props that looks like this:

[sde_rg_udr]
SHOULD_LINEMERGE=false
KV_MODE = NONE
TIME_FORMAT = %Y-%m-%d %H:%M:%S %z
TIME_PREFIX=(?:.*?,){3}
MAX_TIMESTAMP_LOOKAHEAD = 26
INDEXED_EXTRACTIONS = CSV
FIELD_NAMES = RecordType,RecordStatus,RecordNumber,StartTime,EndTime,AcctSessionId,SubscriberId,FramedIp,[ServiceId],TotalBytes,RxBytes,TxBytes,Time
TZ = Africa/Harare

Unfortunately the ServiceID field is encapsulated in [] braces and i just need the id without the braces - how can i change my props to extract the field at index time without the braces?

0 Karma
1 Solution

anthonysomerset
Path Finder

So i tried @cusello approach of the SEDCMD but it didn't do what i wanted it to 😞

i Ended up down the regex extraction route mainly because of all the additional spaces used for padding - for the benefit of others my final props config was

[sde_rg_udr]
SHOULD_LINEMERGE=false
KV_MODE = NONE
TIME_FORMAT = %Y-%m-%d %H:%M:%S %z
TIME_PREFIX=(?:.*?,){3}
MAX_TIMESTAMP_LOOKAHEAD = 26
EXTRACT-sde_rg_udr = ^(?P<RecordType>[^,]*)\s*,(?P<RecordStatus>[^,]*)\s*,(?P<RecordNumber>[^,]*)\s*,(?P<StartTime>\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\s\+\d{4})\s*,(?P<EndTime>\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\s\+\d{4})\s*,(?P<AcctSessionId>[^,]*)\s*,(?P<SubscriberId>[^,]*)\s*,(?P<FramedIp>[^,]*)\s*,\s*\[(?P<ServiceId>[^,]*)\],\s*(?P<TotalBytes>[^,]*),\s*(?P<RxBytes>[^,]*),\s*(?P<TxBytes>[^,]*),\s*(?P<Time>[^,]*)

View solution in original post

0 Karma

anthonysomerset
Path Finder

So i tried @cusello approach of the SEDCMD but it didn't do what i wanted it to 😞

i Ended up down the regex extraction route mainly because of all the additional spaces used for padding - for the benefit of others my final props config was

[sde_rg_udr]
SHOULD_LINEMERGE=false
KV_MODE = NONE
TIME_FORMAT = %Y-%m-%d %H:%M:%S %z
TIME_PREFIX=(?:.*?,){3}
MAX_TIMESTAMP_LOOKAHEAD = 26
EXTRACT-sde_rg_udr = ^(?P<RecordType>[^,]*)\s*,(?P<RecordStatus>[^,]*)\s*,(?P<RecordNumber>[^,]*)\s*,(?P<StartTime>\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\s\+\d{4})\s*,(?P<EndTime>\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\s\+\d{4})\s*,(?P<AcctSessionId>[^,]*)\s*,(?P<SubscriberId>[^,]*)\s*,(?P<FramedIp>[^,]*)\s*,\s*\[(?P<ServiceId>[^,]*)\],\s*(?P<TotalBytes>[^,]*),\s*(?P<RxBytes>[^,]*),\s*(?P<TxBytes>[^,]*),\s*(?P<Time>[^,]*)
0 Karma

gcusello
SplunkTrust
SplunkTrust

Hi anthonysomerset,
try to insert in your props.conf file the following line to change ",[" and "]," with commas:

SEDCMD-drop1 = s/,\[/,/g
SEDCMD-drop2 = s/\],/,/g

Bye.
Giuseppe

0 Karma

anthonysomerset
Path Finder

So i tried this approach and it did the job in the raw event data however the field still had the braces in 😞

0 Karma
Career Survey
First 500 qualified respondents will receive a $20 gift card! Tell us about your professional Splunk journey.
Get Updates on the Splunk Community!

Introduction to Splunk AI

How are you using AI in Splunk? Whether you see AI as a threat or opportunity, AI is here to stay. Lucky for ...

Splunk + ThousandEyes: Correlate frontend, app, and network data to troubleshoot ...

Are you tired of troubleshooting delays caused by siloed frontend, application, and network data? We've got a ...

Maximizing the Value of Splunk ES 8.x

Splunk Enterprise Security (ES) continues to be a leader in the Gartner Magic Quadrant, reflecting its pivotal ...