R - How to make a sublist/extract expression data of candidate genes from a normalized microarray list
I have several processed microarray data sets (normalized, .txt files) from which I want to extract a list of 300 candidate genes (ILMN_IDs). I need in the output not only the gene names, but also the expression values and statistics info (already present in the original file). I have 2 data frames:
normalizeddata: identifiers (gene names) in the first column, named "name".
candidategenes: a single column named "name", containing the identifiers.
I've tried:
1)
all = normalizeddata
subset = candidategenes
x = all %in% subset
2)
all[which(all$gene_id %in% subset)]  # (as suggested in another bioinformatics forum)
but it returns a data frame with 0 columns and >4000 rows. That is not correct, since normalizeddata has 24 columns, and when I try to compare them I get an error.
The key is to be able to compare the first column of all ("name") with subset. Here is the info for all:
> class(all)
[1] "data.frame"
> dim(all)
[1] 4312 24
> str(all)
'data.frame': 4312 obs. of 24 variables:
 $ name: factor w/ 4312 levels "ilmn_1651253": 3401..
 $ meanbgt: num 0 ..
 $ meanbgc: num ..
 $ cvt: num 0.11 ..
 $ cvc: num 0.23 ..
 $ meant: num 4618 ..
 $ stderrt: num 314.6 ..
 $ meanc: num 113.8 ...
 $ stderrc: num 15.6 ...
 $ ratio: num 40.6 ...
 $ ratiose: num 6.21 ...
 $ logratio: num 5.34 ...
 $ tp: num 1.3e-04 ...
 $ t2p: num 0.00476 ...
 $ wilcoxonp: num 0.0809 ...
 $ tq: num 0.0256 ...
 $ t2q: num 0.165 ...
 $ wilcoxonq: num 0.346 ...
 $ limmap: num 4.03e-10 ...
 $ limmapa: num 4.34e-06 ...
 $ symbol: factor w/ 3696 levels "","a2ld1",..
 $ ensembl: factor w/ 3143 levels "ensg00000000003",..
And here is the info for subset:
> class(subset)
[1] "data.frame"
> dim(subset)
[1] 328 1
> str(subset)
'data.frame': 328 obs. of 1 variable:
 $ v1: factor w/ 328 levels "ilmn_1651429",..: 177 286 47 169 123 109 268 284 234 186 ...
I would appreciate any help!
What you need is
all[all$name %in% subset$v1, ]
When using a data.frame, it's important to drill down to the correct column that has the data you want to use. You need to know which columns have the matching IDs. That is how this solution differs from the other suggestion and from the other things you've tried.
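To make the point about picking the right column concrete, here is a minimal sketch of why the second attempt can come back with 0 columns. The two small data frames are made-up stand-ins for the ones in the question (only the column names name and v1 are taken from the str() output); the values are invented for illustration.

```r
# Toy stand-ins for the two data frames in the question (values are made up).
# Note: naming objects `all` and `subset` shadows base R functions of the same
# name; it works, but other names are less confusing in real code.
all <- data.frame(name = c("ilmn_1", "ilmn_2", "ilmn_3"),
                  ratio = c(40.6, 1.2, 0.8))
subset <- data.frame(v1 = c("ilmn_3", "ilmn_1"))

# There is no gene_id column here, so `$` returns NULL.
no_such_column <- is.null(all$gene_id)

# NULL %in% subset is an empty logical vector, which() turns it into an empty
# index, and indexing without a comma selects columns: zero of them.
empty <- all[which(all$gene_id %in% subset)]
dim(empty)

# Comparing the actual ID columns gives one TRUE/FALSE per row of `all`.
keep <- all$name %in% subset$v1
keep
```

The last line is the comparison that matters: one column of IDs against another column of IDs, rather than a whole data frame against a whole data frame.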
It's important to note that when subsetting a data.frame by rows, you need to use the [,] syntax, where the vector before the comma indicates rows and the vector after it indicates columns. Here, since you want all columns, we leave it empty.
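The [rows, columns] form can be tried on a small example. This is only a sketch: the data frames below are invented stand-ins shaped like the ones in the question, and the ID "ilmn_9" is a hypothetical candidate that is absent from the data, included to show how to check for unmatched IDs.

```r
# Toy stand-ins shaped like the data in the question (values are made up).
all <- data.frame(name = c("ilmn_1", "ilmn_2", "ilmn_3", "ilmn_4"),
                  ratio = c(40.6, 1.2, 0.8, 5.3),
                  tp = c(1.3e-04, 0.2, 0.6, 0.01),
                  stringsAsFactors = TRUE)
subset <- data.frame(v1 = c("ilmn_3", "ilmn_1", "ilmn_9"),
                     stringsAsFactors = TRUE)

# Rows before the comma, nothing after it: keep matching rows and all columns.
hits <- all[all$name %in% subset$v1, ]
dim(hits)

# A vector after the comma restricts the columns as well.
few_cols <- all[all$name %in% subset$v1, c("name", "ratio")]

# Candidate IDs that were not found in the data at all.
missing_ids <- setdiff(as.character(subset$v1), as.character(all$name))
```

The matching rows come back in the order they appear in all, not in the order of the candidate list. The result can then be written out with write.table() if a file is needed.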