---
title: "FAQ"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{FAQ}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

# Is `tidypolars` slower than `polars`?

No, or just marginally. The objective of `tidypolars` is *not* to modify the
data, simply to translate the `tidyverse` syntax to `polars` syntax. `polars`
is still in charge of doing all the data manipulations under the hood.

Therefore, there might be minor overhead because we still need to parse the 
expressions and rewrite them in `polars` syntax (see the [Parsing expressions](https://tidypolars.etiennebacher.com/articles/parsing-expressions.html)
vignette) but this should be marginal. Here's a small benchmark to compare the
performance of `polars` and `tidypolars`:

```{r}
library(polars)
library(tidypolars)
library(dplyr, warn.conflicts = FALSE)

pl_test <- pl$DataFrame(
  grp = sample(letters, 2*1e7, TRUE),
  val1 = sample(1:1000, 2*1e7, TRUE),
  val2 = sample(1:1000, 2*1e7, TRUE)
)

bench::mark(
  polars = pl_test$
    group_by("grp")$
    agg(
      pl$col("val1")$mean()$alias("x"), 
      pl$col("val2")$sum()$alias("y"),
      pl$col("val1")$median()$alias("z")
    ),
  tidypolars = pl_test |> 
    group_by(grp) |> 
    summarize(
      x = mean(val1),
      y = sum(val2),
      z = median(val1)
    ),
  check = FALSE,
  iterations = 15
)

bench::mark(
  polars = pl_test$
    filter(pl$col("grp") == "a" | pl$col("grp") == "b"),
  tidypolars = pl_test |> 
    filter(grp == "a" | grp == "b"),
  check = FALSE,
  iterations = 15
)
```

# Am I stuck with `tidypolars`?

No, `tidypolars` will always return `DataFrame`s, `LazyFrame`s or `Series`.
Therefore, if at some point you want to use `polars` because you need more
control or because you want to reduce your number of dependencies, you can
easily do so.

# Do I still need to load `polars`?

Yes, because `tidypolars` doesn't provide any functions to create `polars`
`DataFrame` or `LazyFrame` or to read data. You'll still need to use `polars`
for this.


# Can I see some benchmarks with other tools?

Making accurate benchmarks of data wrangling tools is difficult and I won't try
to do it here. You should refer to [DuckDB benchmarks](https://duckdblabs.github.io/db-benchmark/).