Attempting to send a CSV file, but it's a bit messy. I need to remove some entries that aren't formatted correctly, delete the header row, and replace it with my own (hence FIELD_NAMES). Data is on a UF and goes to my IDX. I'm not using INDEXED_EXTRACTIONS on the UF because the .csv file isn't clean/properly formatted, so I have my IDX doing the work.
Working:
- event breaking
- removing improperly formatted entries
- removing the original header

Not working:
- my field names (nothing is parsed when searching; my FIELD_NAMES are missing)
Based on this, I'm thinking that the old header isn't stripped until it reaches the typingQueue (where TRANSFORMS run), but my FIELD_NAMES is being applied earlier at the aggQueue, so it isn't working... but I'm not sure. How can I fix this?
UF inputs
[monitor://C:\test\testfile_*.csv]
index = main
sourcetype = test
crcSalt = <SOURCE>
queue = parsingQueue
disabled = 0
IDX props
[test]
SHOULD_LINEMERGE = false
FIELD_NAMES = contentID,moduleName,levelName,date,loginID,last,first,var1,var2,var3,var4
FIELD_DELIMITER = ,
TIME_FORMAT = %F %T.%3Q
TZ = UTC
TRANSFORMS-null_hdr_and_nonevt = del_hdr,del_nonevt
IDX transforms
[del_hdr]
REGEX = ^ContentID.*
DEST_KEY = queue
FORMAT = nullQueue
[del_nonevt]
REGEX = ^(?!\d+,).*
DEST_KEY = queue
FORMAT = nullQueue
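Outside Splunk, the two nullQueue regexes can be sanity-checked against sample lines. This is only a sketch with made-up rows (the real file's contents aren't shown in the thread), but it illustrates which lines each transform would route to the nullQueue:

```python
import re

# The two transform regexes, verbatim from transforms.conf.
del_hdr = re.compile(r"^ContentID.*")    # matches the original header row
del_nonevt = re.compile(r"^(?!\d+,).*")  # matches any line NOT starting with "digits,"

# Hypothetical sample lines for illustration only.
lines = [
    "ContentID,ModuleName,LevelName,Date,LoginID,Last,First,Var1,Var2,Var3,Var4",  # header
    "123,mod1,lvl2,2018-01-02 03:04:05.678,jdoe,Doe,Jane,a,b,c,d",                 # good event
    "garbage line without a leading numeric id",                                    # malformed
]

# Only lines matched by neither regex survive to be indexed.
kept = [line for line in lines if not (del_hdr.match(line) or del_nonevt.match(line))]
for line in kept:
    print(line)
```

Only the well-formed event (the line beginning with a numeric contentID followed by a comma) survives both filters.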
The FIELD_NAMES and FIELD_DELIMITER attributes only apply when INDEXED_EXTRACTIONS is set.
Are you absolutely sure that you have exactly 11 fields? I think not; where, for example, is the time field? You must list them all.
@woodcock yes, the timestamp is in the date field (it follows my TIME_FORMAT entry). I already tried using TIMESTAMP_FIELDS = date, but that messed everything up because I'm not using INDEXED_EXTRACTIONS. How can I set these up so everything is parsed?
You should also be using TIME_PREFIX, but that should not have anything to do with why the fields are not working. I would open a support case.
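For reference, a minimal sketch of what TIME_PREFIX might look like for this data, assuming the timestamp really is the fourth comma-separated field (as the FIELD_NAMES order above suggests):

```
# props.conf ([test]) -- assumption: date is the 4th CSV field
TIME_PREFIX = ^(?:[^,]*,){3}
TIME_FORMAT = %F %T.%3Q
```

The regex skips the first three comma-delimited values so timestamp extraction begins at the start of the date field.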
The FIELD_NAMES and FIELD_DELIMITER attributes only apply when INDEXED_EXTRACTIONS is set.
@richgalloway Please post your comment as an answer so I can accept it since it does explain why my FIELD_NAMES isn't working. I think I will have to use search time field extractions to get my data parsed (I don't see any other alternatives). Should be simple since it is comma separated.
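For the search-time route mentioned above, a delimiter-based extraction is typically simpler than a regex. A sketch, assuming these go on the search head (the stanza names are made up):

```
# props.conf
[test]
REPORT-test_fields = test_csv_fields

# transforms.conf
[test_csv_fields]
DELIMS = ","
FIELDS = contentID,moduleName,levelName,date,loginID,last,first,var1,var2,var3,var4
```

DELIMS/FIELDS in a search-time transform split each event on commas and assign the names in order, with no indexed fields required.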
There may be another option. Try adding a third transform that parses the CSV.
Props.conf
...
TRANSFORMS-null_hdr_and_nonevt = del_hdr, del_nonevt, parse_test
Transforms.conf
...
[parse_test]
REGEX = ([^,]+),([^,]+),([^,]+),([^,]+),([^,]+),([^,]+),([^,]+),([^,]+),([^,]+),([^,]+),([^,]+)
DEST_KEY = _raw
FORMAT = contentID=$1,moduleName=$2,levelName=$3,date=$4,loginID=$5,last=$6,first=$7,var1=$8,var2=$9,var3=$10,var4=$11
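Outside Splunk, the effect of this transform on a single event can be sketched in Python. The sample row is made up, but the capture groups and the key=value rewrite mirror what REGEX and FORMAT do to _raw:

```python
import re

# Hypothetical sample event matching the 11-field CSV layout.
raw = "123,mod1,lvl2,2018-01-02 03:04:05.678,jdoe,Doe,Jane,a,b,c,d"

# Eleven capture groups of [^,]+, comma-separated, like the parse_test REGEX.
pattern = re.compile(r"([^,]+)," * 10 + r"([^,]+)")
names = ("contentID,moduleName,levelName,date,loginID,last,first,"
         "var1,var2,var3,var4").split(",")

m = pattern.match(raw)
# Rebuild the event as key=value pairs, as FORMAT does with $1..$11.
new_raw = ",".join(f"{name}={value}" for name, value in zip(names, m.groups()))
print(new_raw)
```

The printed line begins `contentID=123,moduleName=mod1,...`, which is the shape the transform writes back into _raw before indexing.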
@richgalloway that worked perfectly! The docs on how to use REGEX/DEST_KEY/FORMAT are not that great; your write-up makes much more sense. Thank you.