R Tips-n-Tricks
Show ALL duplicate values
|>
df group_by(ID) |>
filter(
duplicated(COV) |
duplicated(COV, fromLast = TRUE)
)
Find first occurence of a non-NA
$col1[which.max(!is.na(col2))] df
Pipe into View(), with title
|>
df |>
filter |>
group_by View("df")
Rmarkdown
- ref.label=c()
- https://bookdown.org/yihui/rmarkdown-cookbook/reuse-chunks.html#ref-label
Style
- 80 characters per line
- 150 lines per script
Misc
- Use sink() to record output of script
- Use locate() instead of select(col, everything())
- R-tips
- https://paulvanderlaken.com/2018/05/21/r-tips-and-tricks/
- https://guiastrennec.gitbooks.io/r-notes/content/
- latticeExtra (package)
- naniar (package)
- gg_miss_var()
- gg_miss_upset() (instead of venn diagrams)
- geom_miss_point() (jitter of missing)
- bind_shadow()
- vis_miss (Credit: Iris Minichmayr)
- txtProgressBar
- (single=3)
- If write files on cluster for use in next step
- sys.sleep(3)
- Bang Bang – How to program with dplyr
- https://www.statworx.com/de/blog/bang-bang-how-to-program-with-dplyr/
- Fundamentals of Data Visualization, Claus O. Wilke
- https://serialmentor.com/dataviz/
- David Robinson:
- libr
- ggplot:
- theme_set(theme_bw())
- mutate(a_col = fct_reorder(a_col, by)) %>% ggplot()
- ggplot(aes(fill = x)) + theme(legend.position = “none”)
- expand_limits(y = 0) #to include 0 on the y axis
- scale_y_continuous(labels = scales::dollar_format())
- scales:: percent_format()
- ggplot(label=a_column) plotly::ggplotly(ggplot_object)
- library(Hmisc, include.only = ‘%nin%’)
- use %in% to make 1==NA equal FALSE and not NA
- Also check for NAs
- any: NA %in% vector
- sum: vector %in% NA %>% sum