So I have data in TSV format that I am indexing. Some of the fields are arrays in the format of ['23458567','234523456978090','234568957078654'] if the array is empty its simply filled with . When we do searches we have to
join tables and so some searches contain several joins to follow id's through the data flow. These ids are in the array format above and we sed out the single quotes, and the brackets, to get just values the mvexpand and join.
The problem I have is that when we do the sed, it removes the records that contain the empty array value  but those are valid values as well.
I was trying to do a conditional eval with a macro but that won't work or is not valid. Something like:
|eval RS=if(related_vendors == "", "", `fp_mvexpand(related_vendors)`)
This is what the macro does:
rex mode=sed field="$arg1$" "s///g" | rex mode=sed field="$arg1$" "s/'//g" | makemv delim="," $arg1$
We do this so we can join on the array values like:
|`init("assessments")` | fields id,info_subType,related_vendors,info_severity | dedup id | `fp_mvexpand(related_vendors)` | eval RV = mvindex(related_vendors,0) |join type=left RV [ `init("vendors")` |fields id infor_name |rename id as RV info_name as Vendor| fillnull value="none" Vendor| dedup Vendor] | stats count(Vendor) by info_subType
In this example, related_vendors in the the assessments table is the same as the id in the vendors table. So we strip out the brackets and single quotes and mvexpand, then mvindex and
join to vendors
But I don't get records where related_vendors =  , and I assume it's because we stripped out the  .
Any thoughts on how I could accomplish this?
Thanks for all the help everyone!
If you're doing a join based on RV, and if RV is null (blank as there are no ID in relatedvendors), then you wouldn't receive any matched values from relatedvendor tables as there is nothing matching. What do you want to do when there is no matching ID?
Did the answer below solve your problem? If so, please resolve this post by approving it!
If your problem is still not solved, keep us updated so that someone else can help ya.
Thanks for posting!
Just put another sed in front of the first, that changes the empty square braces to a placeholder value like
"!!!!". We often use that value for delimiters, since four exclamation points together almost never appear in typical business data.
something like that should change your "empty" square braces into square braces that contain a single flag value. Run that through the rest of your routine, including the join, and then at the end,
sed away the dummy value.
By the way, you can do multiple seds in one rex command, just put a space between them. Try this:
rex mode=sed field="$arg1$" "s/\[\]/\[!!!!\]/g s///g s/'//g" | makemv delim="," $arg1$
Then remember to sed it again at the end after the join, to kill the dummy value.
rex mode=sed field="$arg1$" "s/!!!!//g"