Importing CSV files with numbers in German locale formatting (e.g. 1.340.000,39): how to configure SEDCMD in props.conf to swap commas and decimal points?
Hey everybody,
We recently got a request to import CSV files into Splunk. However, the files include numbers formatted with the German locale, where decimal points and commas are swapped. A typical value looks like this: 1.340.000,39. Such a number is currently interpreted as a string in Splunk.
I have seen others with similar problems, but no solution that avoids fixing the numbers at search time:
http://answers.splunk.com/answers/52965/splunk-and-localization-of-numbers.html
http://answers.splunk.com/answers/42814/decimal-how-to-use-vs.html
I did my best and implemented some SEDCMD rules to remove the dots within numbers and replace the decimal commas with dots, but it seems they are executed on _raw only after the CSV fields have already been extracted.
Is there a way to run the SEDCMDs before the CSV extraction, or to make Splunk aware of differently localized numbers?
My props.conf is:
[csv-test]
SEDCMD-1-removePoints = s/([0-9])\.([0-9])/\1\2/g
SEDCMD-2-decimalToPoint = s/([0-9])\,([0-9])/\1.\2/g
INDEXED_EXTRACTIONS = csv
KV_MODE = none
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = false
CHARSET = Latin-1
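For reference, this is what the two rules are intended to do to the sample value, step by step:
1.340.000,39  ->  1340000,39   (SEDCMD-1-removePoints strips the dots between digits)
1340000,39    ->  1340000.39   (SEDCMD-2-decimalToPoint turns the decimal comma into a dot)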
Thanks for your kind help,
Benjamin

Try removing INDEXED_EXTRACTIONS from your props.conf and moving the field extraction rules to props & transforms.
That is worth a try.
In that case, however, I need to provide the column headings of my CSV for Splunk to recognize the fields, right?
That was my reason for using INDEXED_EXTRACTIONS.

If you choose to remove INDEXED_EXTRACTIONS, you will also have to remove the "header" row from _raw and do multiple field extraction. See here for details/ideas.
Thanks for the link, sundareshr. My approach would be something like the following:
props.conf:
[someSourcetype]
TRANSFORMS-first = NoHeader
# DELIMS/FIELDS extraction is applied at search time, so it is referenced with REPORT- rather than TRANSFORMS-
REPORT-second = csv-fieldextraction
transforms.conf:
[NoHeader]
REGEX = "Time","IOPS","Latency(ms)","BW (MBps)"
DEST_KEY = queue
FORMAT = nullQueue
[csv-fieldextraction]
DELIMS=","
FIELDS="Time","IOPS","Latency","BW (MBps)"
Still, the configuration is bound to a specific set of fields. Did you mean that, or do you have something different in mind?
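As a side note, the values extracted this way would still carry the German formatting, so they would still need a conversion somewhere. A minimal sketch of doing that at search time with a calculated field in props.conf (Latency is taken from the example above; Latency_num is just a made-up name for the converted field, not a tested configuration):
[someSourcetype]
# strip the thousands separators, then turn the decimal comma into a dot
EVAL-Latency_num = tonumber(replace(replace(Latency, "[.]", ""), ",", "."))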
That's a good point. If so, it would not work for me because my fields vary a lot...
Hi Benjamin,
I have the same problem. At the moment I have to do a rex on every field I want to use and then convert it:
| rex field="FieldA" mode=sed "s/,/./"
| rex field="FieldB" mode=sed "s/,/./"
| convert num(Field*) | eval SumFields=FieldA+FieldB
Perhaps there is at least a way to reduce the number of regexes needed, because I have many fields and they vary... (see the sketch below).
Best regards,
Thomas
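One way to cut down on the per-field rex calls might be SPL's foreach command, which applies the same template to every field matching a wildcard. A minimal sketch, assuming the fields share a common prefix such as Field* (the prefix is only taken from the example above):
... | foreach Field* [ eval <<FIELD>> = tonumber(replace(replace(<<FIELD>>, "[.]", ""), ",", ".")) ]
The inner replace strips the thousands separators, the outer one turns the decimal comma into a dot, and tonumber then yields a numeric value, so a single command covers all matching fields.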
