
Extracting a subset
Another typical task is to first extract the rows that share a particular characteristic (such as patients whose disposition is "Expired"), and then perform a similar grouping to understand where the differences lie. In this next example, we again use the dplyr package, first to extract those patients who were admitted to a hospital and died, and then to sum Total.Costs into TotalCosts for each of the major diagnostic classifications. Finally, we sort the results in descending order of cost, so that the most expensive diagnoses come first, and keep the top rows with head(). As the results show, infectious diseases have the highest associated costs:
df %>%
  filter(as.character(Patient.Disposition) == "Expired") %>%
  group_by(APR.MDC.Description) %>%
  summarize(total.count = n(), TotalCosts = sum(Total.Costs)) %>%
  arrange(desc(TotalCosts)) %>%
  head()
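
To try the same filter, group, summarize, and sort pattern without the full hospital dataset, here is a minimal sketch on a small made-up data frame. The data frame toy and every value in it are invented for illustration only. Only the column names are taken from the example above.

```r
library(dplyr)

# Hypothetical stand-in for df: the column names mirror the example above,
# but every value here is invented.
toy <- data.frame(
  Patient.Disposition = factor(c("Expired", "Expired", "Home or Self Care",
                                 "Expired", "Expired")),
  APR.MDC.Description = c("Infectious Diseases", "Circulatory System",
                          "Circulatory System", "Infectious Diseases",
                          "Respiratory System"),
  Total.Costs = c(50000, 20000, 8000, 70000, 15000),
  stringsAsFactors = FALSE
)

expired_costs <- toy %>%
  # Keep only the patients who died
  filter(as.character(Patient.Disposition) == "Expired") %>%
  # One group per major diagnostic classification
  group_by(APR.MDC.Description) %>%
  # Count the patients and add up their costs within each group
  summarize(total.count = n(), TotalCosts = sum(Total.Costs)) %>%
  # Most expensive classification first
  arrange(desc(TotalCosts))

print(expired_costs)
```

Note that sum() requires a numeric column. If Total.Costs has been read in as text in your copy of the data, it would need to be converted before summarizing.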
