I have a CSV with numerous fields with bad field names. They have spaces and special characters such as up and down arrows.
I don't know ahead of time what the field names will be.
How do I locate and rename all of them to more "safe" Splunk field names that work easily in all Splunk commands without funky syntax?
Use the power of foreach to loop through the different fields, combined with | eval {a}=b syntax to create a field whose name is the value of a and value is the value of b.
Here is my search. As all my problem fields had spaces in them, I aimed foreach just at field names with spaces ("* *").
This copy has comments to explain each part, and the one below is comment-free:
``` in every event/result row, loop over all field names with spaces in them ```
| foreach "* *" [
``` create a temporary string with the name of the field to be renamed ```
eval x="<<FIELD>>",
``` tidy up the field name by doing string substitution - put any regular expression in the center. (This one drops all characters that are not spaces or in Splunk's set of allowable characters.) First it does a regex replace, then trims leading/trailing spaces, then replaces all remaining spaces with underscores ```
y=replace(trim(replace(x, "[^A-Za-z0-9_ ]", "")), " ", "_"),
``` create a new field with the tidied name and the value of the original - use single quotes because <<FIELD>> has spaces and special characters, and we need to get its value (single quotes), not its name (double quotes). ```
{y}='<<FIELD>>'
``` remove the temporary fields and the poorly-named original field ```
| fields - x y "<<FIELD>>" ]
Clean version:
| foreach "* *" [ eval x="<<FIELD>>", y=replace(trim(replace(x, "[^A-Za-z0-9_ ]", "")), " ", "_"), {y}='<<FIELD>>' | fields - x y "<<FIELD>>" ]
Another improvement/variant you can do is: before running the inner "replace all special characters with nothing", you could "replace all dot and dash to underscore" - then continue with the rest as written.
Use the power of foreach to loop through the different fields, combined with | eval {a}=b syntax to create a field whose name is the value of a and value is the value of b.
Here is my search. As all my problem fields had spaces in them, I aimed foreach just at field names with spaces ("* *").
This copy has comments to explain each part, and the one below is comment-free:
``` in every event/result row, loop over all field names with spaces in them ```
| foreach "* *" [
``` create a temporary string with the name of the field to be renamed ```
eval x="<<FIELD>>",
``` tidy up the field name by doing string substitution - put any regular expression in the center. (This one drops all characters that are not spaces or in Splunk's set of allowable characters.) First it does a regex replace, then trims leading/trailing spaces, then replaces all remaining spaces with underscores ```
y=replace(trim(replace(x, "[^A-Za-z0-9_ ]", "")), " ", "_"),
``` create a new field with the tidied name and the value of the original - use single quotes because <<FIELD>> has spaces and special characters, and we need to get its value (single quotes), not its name (double quotes). ```
{y}='<<FIELD>>'
``` remove the temporary fields and the poorly-named original field ```
| fields - x y "<<FIELD>>" ]
Clean version:
| foreach "* *" [ eval x="<<FIELD>>", y=replace(trim(replace(x, "[^A-Za-z0-9_ ]", "")), " ", "_"), {y}='<<FIELD>>' | fields - x y "<<FIELD>>" ]
Another improvement/variant you can do is: before running the inner "replace all special characters with nothing", you could "replace all dot and dash to underscore" - then continue with the rest as written.