Combine rows in a data frame containing NA to make a complete row

Sometimes the information for a single record is scattered across several rows, each partly filled with NA, which is really annoying for downstream analysis. I couldn’t come up with an elegant way to combine such rows until I found an answer on Stack Overflow (T 2017). To save myself from struggling to find that answer again, I decided to make a note here.

Below is a minimal example.

library(tidyverse)
df <- data.frame(A = c(1,1,2,2,2),
                 B = c(NA,2,NA,4,4),
                 C = c(3,NA,NA,5,NA),
                 D = c(NA,2,3,NA,NA),
                 E = c(5,NA,NA,4,4))
df
##   A  B  C  D  E
## 1 1 NA  3 NA  5
## 2 1  2 NA  2 NA
## 3 2 NA NA  3 NA
## 4 2  4  5 NA  4
## 5 2  4 NA NA  4

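We group by A and apply na.omit(unique(.)) to every other column with summarise_all. Within each group, unique drops the duplicated values in a column and na.omit drops the NA, leaving the single non-missing value, much like a coalesce across rows. This assumes each group has exactly one distinct non-missing value per column.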

df %>%
  group_by(A) %>%
  summarise_all(list(~ na.omit(unique(.))))
## # A tibble: 2 x 5
##       A     B     C     D     E
##   <dbl> <dbl> <dbl> <dbl> <dbl>
## 1     1     2     3     2     5
## 2     2     4     5     3     4
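To see what happens inside each group, and as a hedged alternative for newer dplyr versions (1.0.0 or later, where summarise_all is superseded by across), here is a minimal sketch. The helper first_non_na is a hypothetical name introduced only for this illustration; it is not part of dplyr or of the Stack Overflow answer. Unlike na.omit(unique(.)), it returns NA when a column is entirely missing within a group, and it silently keeps only the first value when a group holds several distinct ones.

```r
library(dplyr)

df <- data.frame(A = c(1, 1, 2, 2, 2),
                 B = c(NA, 2, NA, 4, 4),
                 C = c(3, NA, NA, 5, NA),
                 D = c(NA, 2, 3, NA, NA),
                 E = c(5, NA, NA, 4, 4))

# The per-column step on its own: column B for the rows where A == 2.
x <- c(NA, 4, 4)
step <- as.numeric(na.omit(unique(x)))  # unique() gives NA, 4; na.omit() keeps 4

# Hypothetical helper (an assumption, not a dplyr function):
# first non-missing value of x, or NA if every value is missing.
first_non_na <- function(x) {
  x[!is.na(x)][1]
}

# Same merge written with across(); grouping column A is excluded automatically.
merged <- df %>%
  group_by(A) %>%
  summarise(across(everything(), first_non_na))

merged
```

On the example data this should give the same two rows as the summarise_all call above.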

Now we’re done.

T, Jerry. 2017. “Combine Rows in Data Frame Containing NA to Make Complete Row.” Stack Overflow. https://stackoverflow.com/a/47563618/10538503.