Apply a counting summary function, such as dplyr::n_distinct() or count_na(),
to every column of a data frame and return each count along with that count
as a proportion of the number of rows.
col_stats(data, fun, print = TRUE)
glimpse_fun(data, fun, print = TRUE)
data: A data frame to glimpse.
fun: A function to map to each column.
print: Logical; should all columns be printed as rows?
A tibble with one row for every column of data, giving the column name (col), its class, the count returned by fun (n), and that count as a proportion (p).
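To make the shape of the return value concrete, the following is a minimal base R sketch of the same idea, not the package's actual implementation. The name col_stats_sketch() is hypothetical, it assumes p is the count divided by nrow(data) (which matches the example output), and it reports the first element of class() rather than the abbreviated type shown in the tibble output.

```r
# Hypothetical helper for illustration only; not part of campfin.
col_stats_sketch <- function(data, fun) {
  # Apply the counting function to each column; each call must return one number.
  n <- vapply(data, fun, numeric(1))
  data.frame(
    col = names(data),
    # First class of each column (assumption: the real output abbreviates types).
    class = vapply(data, function(x) class(x)[1], character(1)),
    n = n,
    # Assumed definition of p: the count as a share of the number of rows.
    p = n / nrow(data),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# Count missing values per column in a small toy data frame.
toy <- data.frame(x = c(1, 1, NA), y = c("a", "b", "c"))
res <- col_stats_sketch(toy, function(x) sum(is.na(x)))
print(res)
```

Any function that reduces a vector to a single number can be passed as fun in this sketch, for example function(x) length(unique(x)) for a distinct count.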
col_stats(dplyr::storms, dplyr::n_distinct)
#> # A tibble: 13 × 4
#>    col                          class     n p
#>    <chr>                        <chr> <int> <dbl>
#>  1 name                         <chr>   258 0.0135
#>  2 year                         <dbl>    47 0.00247
#>  3 month                        <dbl>    10 0.000524
#>  4 day                          <int>    31 0.00163
#>  5 hour                         <dbl>    24 0.00126
#>  6 lat                          <dbl>   550 0.0288
#>  7 long                         <dbl>  1000 0.0524
#>  8 status                       <fct>     9 0.000472
#>  9 category                     <dbl>     6 0.000315
#> 10 wind                         <int>    33 0.00173
#> 11 pressure                     <int>   129 0.00677
#> 12 tropicalstorm_force_diameter <int>   139 0.00729
#> 13 hurricane_force_diameter     <int>    42 0.00220
col_stats(dplyr::storms, campfin::count_na)
#> # A tibble: 13 × 4
#>    col                          class     n p
#>    <chr>                        <chr> <int> <dbl>
#>  1 name                         <chr>     0 0
#>  2 year                         <dbl>     0 0
#>  3 month                        <dbl>     0 0
#>  4 day                          <int>     0 0
#>  5 hour                         <dbl>     0 0
#>  6 lat                          <dbl>     0 0
#>  7 long                         <dbl>     0 0
#>  8 status                       <fct>     0 0
#>  9 category                     <dbl> 14382 0.754
#> 10 wind                         <int>     0 0
#> 11 pressure                     <int>     0 0
#> 12 tropicalstorm_force_diameter <int>  9512 0.499
#> 13 hurricane_force_diameter     <int>  9512 0.499