Beginner R: Using Incident Data, Create a Series of New Dataframes with Sums of Categorical Variables
I have a set of incident data in the following rough format:
incident # | date | year | state | criminal offense | location
There are 155k incidents. I want to create a new series of dataframes that group the ungrouped data (i.e., the opposite of the first step in this link: http://ww2.coastal.edu/kingw/statistics/r-tutorials/descriptive.html). For each year and each state, I want the totals of each category in each of the last 2 columns above, "offense" and "location", with only ever 1 row per year-state combination, as 2 separate dataframes:
year | state | sum of criminal offense 1 | sum of criminal offense 2 | sum of criminal offense 3
and
year | state | sum of location 1 | sum of location 2 | sum of location 3
The goal is to compare incident counts by state over time, or to make time-series predictions of the total incidents of a crime type in a state. Or should I keep the data ungrouped? Is there a resource, or a brief set of rules, on which analyses/predictive approaches work best or most practically with grouped versus ungrouped data?
Here is an approach using table (or xtabs) to count the entities, and reshape to put the counts into the desired form.
Fake data:
    d <- data.frame(incident=1:4, year=c(1,1,2,2), state=c('al','mn','al','mn'),
                    offense=c(1,1,1,2), location=c(1,2,2,2))
    d
    ##   incident year state offense location
    ## 1        1    1    al       1        1
    ## 2        2    1    mn       1        2
    ## 3        3    2    al       1        2
    ## 4        4    2    mn       2        2
Locations:
    dl <- as.data.frame(xtabs(~year+state+location, data=d))
    # equivalent: dl <- as.data.frame(table(year=d$year, state=d$state, location=d$location))
    reshape(dl, direction='wide', timevar='location', idvar=c('year', 'state'))
    ##   year state Freq.1 Freq.2
    ## 1    1    al      1      0
    ## 2    2    al      0      1
    ## 3    1    mn      0      1
    ## 4    2    mn      0      1
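To make the two steps for locations easier to follow, here is a small self-contained sketch that keeps the intermediate long-format counts and checks that no incident is lost or double counted. The variable name wide and the stopifnot checks are additions for illustration, not part of the original answer.

```r
# Same fake data as in the answer.
d <- data.frame(incident = 1:4,
                year     = c(1, 1, 2, 2),
                state    = c('al', 'mn', 'al', 'mn'),
                offense  = c(1, 1, 1, 2),
                location = c(1, 2, 2, 2))

# Long format: one row per year/state/location combination, including
# combinations that never occur in the data (their Freq is 0).
dl <- as.data.frame(xtabs(~ year + state + location, data = d))

# Wide format: one row per year/state pair, one Freq column per location.
wide <- reshape(dl, direction = 'wide', timevar = 'location',
                idvar = c('year', 'state'))

# Sanity checks (illustrative): the counts add up to the number of incidents.
stopifnot(nrow(dl) == 2 * 2 * 2)    # 2 years x 2 states x 2 locations
stopifnot(sum(dl$Freq) == nrow(d))  # every incident counted exactly once
stopifnot(nrow(wide) == 4)          # one row per year/state pair
```

Because xtabs crosses all observed levels, year-state pairs with no incidents of a given location still get a 0 rather than a missing row, which is what the "one row per year-state combination" requirement needs.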
Offenses:
    do <- as.data.frame(xtabs(~year+state+offense, data=d))
    # equivalent: do <- as.data.frame(table(year=d$year, state=d$state, offense=d$offense))
    reshape(do, direction='wide', timevar='offense', idvar=c('year', 'state'))
    ##   year state Freq.1 Freq.2
    ## 1    1    al      1      0
    ## 2    2    al      1      0
    ## 3    1    mn      1      0
    ## 4    2    mn      0      1
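Since the location and offense steps differ only in the column being counted, they could be wrapped in one function. The following is a minimal sketch under that assumption; the helper name count_wide and the renaming of the Freq.* columns are hypothetical additions, not part of the original answer.

```r
# Same fake data as in the answer.
d <- data.frame(incident = 1:4,
                year     = c(1, 1, 2, 2),
                state    = c('al', 'mn', 'al', 'mn'),
                offense  = c(1, 1, 1, 2),
                location = c(1, 2, 2, 2))

# Hypothetical helper: count the levels of one categorical column per
# year/state and return one row per year/state pair.
count_wide <- function(data, var) {
  f    <- as.formula(paste('~ year + state +', var))
  long <- as.data.frame(xtabs(f, data = data))
  wide <- reshape(long, direction = 'wide', timevar = var,
                  idvar = c('year', 'state'))
  # Rename Freq.1, Freq.2, ... to e.g. offense.1, offense.2, ...
  names(wide) <- sub('^Freq', var, names(wide))
  rownames(wide) <- NULL
  wide
}

by_offense  <- count_wide(d, 'offense')
by_location <- count_wide(d, 'location')
```

The two results share the year and state columns, so they can be combined with merge(by_offense, by_location) if a single table is more convenient.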