reshape - Reproduce a dataset in a different format in R


I have a dataset, data, shown below:

dput(data)
structure(list(fn = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
    .Label = "20131202-0985 ", class = "factor"),
    values = structure(c(1L, 8L, 7L, 6L, 5L, 9L, 2L, 4L, 3L),
    .Label = c("|639778|21|nanyang circle|103.686721631628|1.34640300329567",
    "|8121|b01|somerset stn", "|96942883", "|sn30|smrt\n", "central",
    "four seasons hotel", "hotel", "ikea", "nanyang avenue"), class = "factor"),
    ind = structure(c(4L, 1L, 1L, 1L, 1L, 6L, 3L, 2L, 5L),
    .Label = c("bn", "br", "bs", "loc", "pn", "rn"), class = "factor")),
    .Names = c("fn", "values", "ind"), class = "data.frame", row.names = c(NA, -9L))

I want the above dataset converted into the data frame format below (out_data). At present, data has 3 columns; I need to convert these into 16 columns in the format below. I need to reshape the input given in the screenshot data frame. I cannot change the structure below:

colnames(out_data) <- c("fn","h_blk","s_n/r_n","b_n","fl_n","u_n","pc","xc","yc","bs","brf","lct_dec","brn","bo","pn","s_ty_cd")


The multi-value entries in the input are in the format below:

  • |639778|21|nanyang circle|103.686721631628|1.34640300329567 -> |pc|h_blk|s_n/r_n|xc|yc
  • |8121|b01|somerset stn -> |bs|brf|lct_dec
  • |sn30|smrt -> |brn|bo

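The columns to fill depend on ind:

  • ind = loc: |pc|h_blk|s_n/r_n|xc|yc should be updated, with s_ty_cd = loc
  • ind = bn: the b_n column should be updated, with s_ty_cd = bn
  • ind = rn: the s_n/r_n column should be updated, with s_ty_cd = rn
  • ind = bs: |bs|brf|lct_dec should be updated, with s_ty_cd = bs
  • ind = br: |brn|bo should be updated, with s_ty_cd = br
  • ind = pn: the pn column should be updated, with s_ty_cd = pn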

Is there an efficient way of doing this?

Here's one method of transformation. First, define helper functions for the various sub-problems.

# define the output columns
outcols <- c("fn", "h_blk", "s_n/r_n", "b_n", "fl_n", "u_n", "pc",
    "xc", "yc", "bs", "brf", "lct_dec", "brn", "bo", "pn", "s_ty_cd")

# identify the parts of each compound value
namevals <- function(ind, vals) {
    names <- if (ind=="loc") {
        c("pc","h_blk","s_n/r_n","xc","yc")
    } else if (ind=="bn") {
        c("b_n")
    } else if (ind=="rn") {
        c("s_n/r_n")
    } else if (ind=="bs") {
        c("bs","brf","lct_dec")
    } else if (ind=="br") {
        c("brn","bo")
    } else if (ind=="pn") {
        c("pn")
    }
    # fail early if the value has an unexpected number of parts
    stopifnot(length(names)==length(vals))
    stopifnot(all(names %in% outcols))
    names(vals) <- names
    vals
}

# add missing values to the row
fillrow <- function(nvals) {
    r <- rep(NA, length(outcols))
    r[match(names(nvals), outcols)] <- nvals
    r
}
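The fill step relies on match() to place each named value at its column position. As a rough illustration only, here is the same idea on a hypothetical three-column layout (cols and nvals are made up for this sketch and shortened from the real 16 columns):

```r
# Hypothetical three-column layout, shortened from the answer's 16 columns
cols <- c("fn", "bs", "brf")

# Named values for a single row; "bs" is deliberately absent
nvals <- c(brf = "b01", fn = "20131202-0985")

# match() gives each name's position in cols; slots with no value stay NA
row <- rep(NA_character_, length(cols))
row[match(names(nvals), cols)] <- nvals
print(row)
```

The order of the named values does not matter, since each one is placed by name rather than by position.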

Now apply these to each row of data with mapply, returning a character vector for each row. Here we make sure to split the "values" column on the pipe and remove the leading pipe.

# combine the rows into a character matrix
dt <- mapply(function(fn, vals, ind) {
    x <- c(fn=fn, namevals(ind, vals), "s_ty_cd"=ind)
    fillrow(x)
},
as.character(data$fn),
strsplit(gsub("^\\|", "", as.character(data$values)), "|", fixed=TRUE),
as.character(data$ind)
)
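To see the splitting step in isolation, here is a small sketch using one compound value from the sample data. The variable names (value, trimmed, parts) are only for illustration:

```r
# One compound value taken from the sample data in the question
value <- "|8121|b01|somerset stn"

# Drop the leading pipe first; otherwise strsplit() yields an empty first field
trimmed <- sub("^\\|", "", value)

# fixed = TRUE treats "|" as a literal character rather than regex alternation
parts <- strsplit(trimmed, "|", fixed = TRUE)[[1]]
print(parts)
```

Values without any pipe, such as "ikea", pass through this step unchanged as a single-element vector.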

Finally, tidy up the data so it can be written out to a file with write.table. Note that the missing values are true R NA values. In write.table, you can set na = "" if you'd rather print blank values instead of the default "NA" value.

# turn the matrix into a data.frame with proper names
dd <- data.frame(unname(t(dt)), stringsAsFactors=FALSE)
names(dd) <- outcols
dd
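As a sketch of the write-out step, the example below uses a hypothetical two-row data frame (dd_demo) in place of dd. The pipe separator, the temporary file and the other write.table options are assumptions for illustration, not something the question specifies:

```r
# Hypothetical two-row data frame standing in for the answer's dd
dd_demo <- data.frame(fn = c("a", "b"), pn = c("96942883", NA),
                      stringsAsFactors = FALSE)

# Write to a temporary file; na = "" prints blanks instead of "NA"
out_file <- tempfile(fileext = ".txt")
write.table(dd_demo, out_file, sep = "|", na = "",
            quote = FALSE, row.names = FALSE)

# Read the file back to inspect what was written
lines <- readLines(out_file)
print(lines)
```

With na = "", the missing pn in the second row is written as an empty field rather than the string "NA".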
