I'm trying to index a CSV file, but it's a bit messy. I need to drop some entries that aren't formatted correctly, delete the header row, and replace it with my own field names (hence FIELD_NAMES). The data is on a UF and goes to my IDX. I'm not using INDEXED_EXTRACTIONS on the UF because the .csv file isn't clean/properly formatted, so I have my IDX doing the work.
What works:
- event breaking
- removing improperly formatted entries
- removing the original header
What doesn't work:
- my field names (nothing is parsed at search time; the fields from FIELD_NAMES are missing)
Based on this, I'm thinking that the old header isn't stripped until the event reaches the typingQueue (TRANSFORMS), but FIELD_NAMES is applied earlier, at the aggQueue, so it isn't working... but I'm not sure. How can I fix this?
[monitor://C:\test\testfile_*.csv]
index = main
sourcetype = test
crcSalt = <SOURCE>
queue = parsingQueue
disabled = 0

[test]
SHOULD_LINEMERGE = false
FIELD_NAMES = contentID,moduleName,levelName,date,loginID,last,first,var1,var2,var3,var4
FIELD_DELIMITER = ,
TIME_FORMAT = %F %T.%3Q
TZ = UTC
TRANSFORMS-null_hdr_and_nonevt = del_hdr,del_nonevt

[del_hdr]
REGEX = ^ContentID.*
DEST_KEY = queue
FORMAT = nullQueue

[del_nonevt]
REGEX = ^(?!\d+,).*
DEST_KEY = queue
FORMAT = nullQueue
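To see which lines the two nullQueue transforms above would discard, here is a minimal Python sketch that applies the same two patterns to a few lines. The sample lines are hypothetical (not taken from the real file), and Python's re module is only standing in for Splunk's regex engine; the helper name routed_to_null_queue is made up for illustration.

```python
import re

# Patterns copied from the [del_hdr] and [del_nonevt] stanzas.
DEL_HDR = re.compile(r"^ContentID.*")
DEL_NONEVT = re.compile(r"^(?!\d+,).*")


def routed_to_null_queue(event: str) -> bool:
    """Return True if either pattern matches, i.e. the event would be dropped."""
    return bool(DEL_HDR.search(event) or DEL_NONEVT.search(event))


# Hypothetical sample lines, invented for this illustration.
sample_lines = [
    "ContentID,ModuleName,LevelName,Date,LoginID,Last,First,Var1,Var2,Var3,Var4",
    "Report generated by export tool",
    "1001,Intro,Basic,2020-01-15 08:30:00.123,jdoe,Doe,John,a,b,c,d",
]

# Only lines that start with digits followed by a comma survive.
kept = [line for line in sample_lines if not routed_to_null_queue(line)]
```

With these patterns the header line also fails the "starts with digits and a comma" test, so del_nonevt on its own would appear to catch it as well.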
The FIELD_NAMES and FIELD_DELIMITER attributes only apply when INDEXED_EXTRACTIONS is set.
@richgalloway Please post your comment as an answer so I can accept it since it does explain why my FIELD_NAMES isn't working. I think I will have to use search time field extractions to get my data parsed (I don't see any other alternatives). Should be simple since it is comma separated.
There may be another option. Try adding a third transform that parses the CSV.
TRANSFORMS-null_hdr_and_nonevt = del_hdr,del_nonevt,parsetest

[parsetest]
REGEX = ([^,]+),([^,]+),([^,]+),([^,]+),([^,]+),([^,]+),([^,]+),([^,]+),([^,]+),([^,]+),([^,]+)
DEST_KEY = _raw
FORMAT = contentID=$1,moduleName=$2,levelName=$3,date=$4,loginID=$5,last=$6,first=$7,var1=$8,var2=$9,var3=$10,var4=$11
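To make the REGEX/FORMAT pairing more concrete, here is a rough Python sketch of what such a transform does to _raw. It assumes each capture group is meant to be [^,]+ (everything up to the next comma), uses a hypothetical sample row, and only imitates the rewrite with Python's re module; rewrite_raw is an illustrative name, not anything Splunk provides.

```python
import re

# Field names from the FORMAT line, in capture-group order ($1 .. $11).
FIELDS = ["contentID", "moduleName", "levelName", "date", "loginID",
          "last", "first", "var1", "var2", "var3", "var4"]

# Eleven capture groups of non-comma characters, separated by commas
# (assumed intent of the REGEX in the transform).
CSV_REGEX = re.compile(",".join(["([^,]+)"] * len(FIELDS)))


def rewrite_raw(raw: str) -> str:
    """Imitate DEST_KEY = _raw: replace the event text with key=value pairs."""
    match = CSV_REGEX.search(raw)
    if match is None:
        # No match: the transform would not apply, so the event is unchanged.
        return raw
    return ",".join(f"{name}={value}"
                    for name, value in zip(FIELDS, match.groups()))


# Hypothetical event, invented for this illustration.
event = "1001,Intro,Basic,2020-01-15 08:30:00.123,jdoe,Doe,John,a,b,c,d"
rewritten = rewrite_raw(event)
```

Because every group requires at least one character, a row with an empty column would not match in this sketch and would be left as plain CSV.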
@richgalloway that worked perfectly! The docs on how to use REGEX/DEST_KEY/FORMAT are not that great; your write-up makes much more sense. Thank you.
Are you absolutely sure that you have exactly 11 fields? I think not; where, for example, is the time field? You must list them all.
@woodcock yes, the timestamp is in the date field (it follows my TIME_FORMAT entry). I already tried using TIMESTAMP_FIELDS = date, but that messed everything up because I'm not using INDEXED_EXTRACTIONS. How can I set these up so everything is parsed?
You should also be using TIME_PREFIX, but that should not have anything to do with why the fields are not working. I would open a support case.