@DavisVaughan
Member

Closes r-lib/vctrs#892

I've reworked new_tibble() on top of new_data_frame(). It is now faster than before in some cases, partly because it avoids calling setdiff() on the class when we don't need it.
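To illustrate the setdiff() point, here is a minimal base R sketch of the idea. The helper name `build_class()` and the `tibble_class` constant are hypothetical and only stand in for the class handling inside new_tibble(); this is not the actual code from the PR.

```r
# Hypothetical stand-in for the base classes every tibble carries
tibble_class <- c("tbl_df", "tbl", "data.frame")

# Hypothetical helper: compute the full class vector for a new tibble
build_class <- function(class = NULL) {
  if (is.null(class)) {
    # Fast path: no subclass was supplied, so there is nothing to
    # de-duplicate and setdiff() can be skipped entirely
    return(tibble_class)
  }
  # Slow path: drop base classes the caller repeated, then append them once
  c(setdiff(class, tibble_class), tibble_class)
}

build_class()
build_class(c("my_tbl", "tbl_df"))
```

The fast path matters because the common call, with no subclass, would otherwise pay for a setdiff() that cannot change the result.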

library(tibble)

# empty
y <- list()
y_nrow <- 0L

bench::mark(
  tbl = new_tibble(y, nrow = y_nrow),
  iterations = 100000
)

# master
#> # A tibble: 1 x 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 tbl          11.7µs   16.8µs    56017.    27.9KB     61.7

# this PR
#> # A tibble: 1 x 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 tbl          8.76µs   11.5µs    77859.    32.9KB     17.9

# mtcars
x <- unclass(mtcars)
nrow <- 32L

bench::mark(
  tbl = new_tibble(x, nrow = nrow),
  iterations = 100000
)

# master
#> # A tibble: 1 x 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 tbl          10.8µs   14.2µs    69187.        0B     27.0

# this PR
#> # A tibble: 1 x 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 tbl           9.4µs   12.6µs    78395.        0B     15.7

# mtcars with extra attributes
bench::mark(
  tbl = new_tibble(x, nrow = nrow, foo = "bar", foo2 = 1, foo3 = "x"),
  iterations = 100000
)

# master
#> # A tibble: 1 x 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 tbl          12.5µs   16.9µs    59008.        0B     27.7

# this PR
#> # A tibble: 1 x 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 tbl          12.7µs   17.2µs    57369.        0B     18.4
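
As a rough way to read the three benchmarks together, the median timings reported above can be turned into speedup ratios. The numbers are copied from the output in this comment; the vector names are only labels chosen for this sketch.

```r
# Median timings in microseconds, copied from the bench::mark() output above
median_master <- c(empty = 16.8, mtcars = 14.2, mtcars_attrs = 16.9)
median_pr     <- c(empty = 11.5, mtcars = 12.6, mtcars_attrs = 17.2)

# Ratio > 1 means this PR is faster than master for that case
speedup <- round(median_master / median_pr, 2)
speedup
#>        empty       mtcars mtcars_attrs
#>         1.46         1.13         0.98
```

So the empty and plain mtcars cases improve, while the case with extra attributes is roughly unchanged.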

@krlmlr
Member

krlmlr commented Mar 7, 2020

Thanks. I arrived at similar code when trying to use new_data_frame(). At this time I feel it's too much extra code; I think we can do much better with an improved version of new_data_frame() (r-lib/vctrs#892).

If the computation of the "class" attribute becomes an issue, we can move to C.

Happy to take the tests. Do they pass with dev tibble?

@krlmlr krlmlr added this to the 3.0.2 milestone Jul 6, 2020
@krlmlr krlmlr modified the milestones: 3.0.2, 3.1.0 Sep 27, 2020
@krlmlr krlmlr modified the milestones: 3.1.0, 3.1.1 Feb 23, 2021
@krlmlr krlmlr modified the milestones: 3.1.1, 3.1.2 Apr 16, 2021
@krlmlr krlmlr modified the milestones: 3.1.2, 3.1.3 Jun 24, 2021
@mgirlich mgirlich mentioned this pull request Jul 5, 2021
@krlmlr krlmlr modified the milestones: 3.1.3, 3.1.4 Jul 17, 2021
@krlmlr krlmlr force-pushed the tbl-new-data-frame branch from af2b1c8 to 6c179fb Compare July 29, 2021 09:28
@krlmlr krlmlr merged commit 6adac52 into tidyverse:master Jul 29, 2021
@krlmlr
Member

krlmlr commented Jul 29, 2021

Thanks!

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jul 30, 2022
Development

Successfully merging this pull request may close these issues.

Tweak new_data_frame() for better tibble support