About johnsmith454

johnsmith454 · ‎09-21-2020

I have tried the answers to similar questions on here, but I'm still stuck. I want to create a column holding one of 4 labels. However, when I assign the later labels, they overwrite some of the rows I already gave the first label. For example, I am looking to create a label column like this:

Gene    Feature1  Feature2  Feature3  ...  label
Gene1   1         3         1              most likely
Gene2   0         0         1              probable
Gene3   NA        NA        NA             unknown
Gene4   0         0         0              unlikely

My data is imported from a big data analysis, so my real features are not the ones shown here, but these 4 labels are what I'm trying to get. I try to code this with:

    df$label[(df$Mechanism == 1)|(df$med >= 3) |(df$OMIM == 1)] <- "most likely"
    df$label[is.na(df$label) & (df$med <= 2 )|(df$SideeffectFreq>=1) |(df$MGI_Gene==1) |(df$model_Gene==1) |(df$Rank>=1) ] <- "probable"
    df$label[(df$Causality == 'least likely')] <- "least likely"
    df$label[is.na(df$label)] <- "unknown"

When I run the first line to create the "most likely" label, it labels 50 genes (which is what I expected), but running the second line for "probable" re-labels some of the "most likely" genes, leaving only 34 of them. I thought using is.na(df$label) or (df$label != 'most likely') would resolve this, but neither does.

Is there a better way to go about creating a labels column like this? I am new to coding, so if anyone can explain why is.na(df$label) or (df$label != 'most likely') does not work as I expected, that would also be really helpful.
Edit: Example where the 'most likely' label gets overwritten:

    #Input data:
    dput(dt)
    structure(list(Gene = c("gene1", "gene2", "gene3", "gene4"),
        F1 = c(1L, 0L, 0L, 1L), F2 = c(3L, 0L, 0L, 1L),
        F3 = c("1", "1", "1", "least likely"),
        label = c(NA, NA, NA, NA)), row.names = c(NA, -4L),
        class = c("data.table", "data.frame"))

    dt$label[(dt$F1 == 1)|(dt$F2 >= 3) |(dt$F1 == 1)] <- "most likely"
    dt$label[(dt$label != 'most likely') & (dt$F1 == 2)|(dt$F2 == 0) |(dt$F1 == 1)] <- "probable"
    dt$label[(dt$F1 == 0)|(dt$F2 == 0)] <- "unlikely"
    dt$label[(dt$F3 == 'least likely')] <- "unknown"
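One likely cause of the re-labeling is operator precedence: in R, `&` binds more tightly than `|`, so a guard such as `is.na(label) & A | B | C` is read as `(is.na(label) & A) | B | C`, and the guard then only protects the first condition. Separately, `label != 'most likely'` evaluates to `NA`, not `TRUE`, for rows whose label is still `NA`. The sketch below is a minimal illustration on the toy columns from the example above, rebuilt as a plain data.frame; the names `unguarded` and `guarded` are made up for illustration, and the conditions are simplified rather than being the real feature rules.

```r
# Toy data based on the example in the post (F3 omitted, plain data.frame).
dt <- data.frame(
  Gene = c("gene1", "gene2", "gene3", "gene4"),
  F1 = c(1L, 0L, 0L, 1L),
  F2 = c(3L, 0L, 0L, 1L),
  stringsAsFactors = FALSE
)
dt$label <- NA_character_

# First label: rows where either condition holds.
dt$label[dt$F1 == 1 | dt$F2 >= 3] <- "most likely"

# Without outer parentheses, `&` is evaluated before `|`, so this means
# (is.na(label) & F1 == 2) | F2 == 0 | F1 == 1
# and rows that are already labeled can still be selected.
unguarded <- is.na(dt$label) & dt$F1 == 2 | dt$F2 == 0 | dt$F1 == 1

# Wrapping the whole OR group in parentheses applies the is.na() guard
# to every condition, so only unlabeled rows are selected.
guarded <- is.na(dt$label) & (dt$F1 == 2 | dt$F2 == 0 | dt$F1 == 1)

dt$label[guarded] <- "probable"

# Anything still unlabeled falls through to the default.
dt$label[is.na(dt$label)] <- "unknown"

print(dt)
```

Comparing `unguarded` with `guarded` shows which already-labeled rows the original form would have overwritten. Assigning labels from highest to lowest priority, each time guarded by `is.na(dt$label) & ( ... )`, keeps earlier labels intact.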

Posts	1
Solutions	0
Karma Given	1
Karma Received	1
Member Since	‎09-21-2020

Online Status	Offline
Date Last Visited	‎09-23-2020 04:36 AM

How to create a column labeling rows based on cond...
