Splunk Search

How to edit my regex to extract fields from an imported CSV file with extra commas in some rows?

banderson7
Communicator

So this is fun...
I need to import a CSV on a regular basis, and I have no control over the format or data in the csv. Contents include:

name1-name2-Uptime,N. California [RealBrowser],8419,100,0,23
name1-name2-Uptime,N. Virginia [RealBrowser],10062,100,0,25
name1-name2-Uptime,Chicago, IL [RealBrowser],6882,95.83300018,1,24
name1-name2-Uptime,Dallas, TX [RealBrowser],9558,100,0,23

Some of the second columns have commas, but not all, and my regex is getting thrown by them.

I've tried the following regex, but it combines the second, third and fourth columns.

^(?<check_name>[^,]+),(?<location>.*),(?<avg_response_time>[^,]+)
0 Karma
1 Solution

woodcock
Esteemed Legend

Like this:

^(?<check_name>[^,]+),(?<location>.*),(?<num1>[^,]*),(?<num2>[^,]*),(?<num3>[^,]*),(?<num4>[^,]*)$

View solution in original post

landen99
Motivator

From the data I am seeing, the fields are not separated by commas and need regex that looks for things other than commas:

 name1-name2-Uptime    N. California [RealBrowser]    8419    100    0    23
 name1-name2-Uptime    N. Virginia [RealBrowser]    10062    100    0    25
 name1-name2-Uptime    Chicago, IL [RealBrowser]    6882    95.83300018    1    24
 name1-name2-Uptime    Dallas, TX [RealBrowser]    9558    100    0    23

Based on the form of your regex for the value patterns for each field, I suggest this regex:

(?<check_name>[A-Za-z]\S+)\s+(?<location>[^\]]+\])\s+(?<avg_response_time>\d+)

But I recommend breaking the fields out a little more:

(?<check_name1>[A-Za-z]\w+)\-(?<check_name2>[A-Za-z]\w+)\-(?<state>\w+)\s+(?<location>[^\[]+) \[(?<browser>[^\]]+)\]\s+(?<avg_response_time>\d+)

https://regex101.com/r/zK5yT5/1

0 Karma

woodcock
Esteemed Legend

Like this:

^(?<check_name>[^,]+),(?<location>.*),(?<num1>[^,]*),(?<num2>[^,]*),(?<num3>[^,]*),(?<num4>[^,]*)$

banderson7
Communicator

That did it!
What's the difference between the first field and the rest? What's the * do that the + didn't?

0 Karma

landen99
Motivator

"*" is zero or more allowing the possibility of no matches for the pattern. But your data lacks the commas for those regex to work so I don't think it should work. As if you and Gregg are seeing different data than what myself and others are seeing. I'll post mine below, just in case.

0 Karma

richgalloway
SplunkTrust
SplunkTrust

Is your sample data before or after it's been indexed by Splunk? If the latter, please post the data in it's original format. I ask because your regex is expecting a comma between fields, but the data has white space between fields.

---
If this reply helps you, Karma would be appreciated.
0 Karma
Got questions? Get answers!

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Meet up IRL or virtually!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Get Updates on the Splunk Community!

[Puzzles] Solve, Learn, Repeat: Matching cron expressions

This puzzle (first published here) is based on matching timestamps to cron expressions.All the timestamps ...

Design, Compete, Win: Submit Your Best Splunk Dashboards for a .conf26 Pass

Hello Splunkers,  We’re excited to kick off a Splunk Dashboard contest! We know that dashboards are a primary ...

May 2026 Splunk Expert Sessions: Security & Observability

Level Up Your Operations: May 2026 Splunk Expert Sessions Whether you are refining your security posture or ...