Description
While we want to migrate from the `arrow2` crate to `arrow` (#3741), it is a big task that we would rather punt on right now. It is technical debt, but the debt is not going to grow significantly. The gains don't justify the potential rabbit hole of pain it could turn into.
One of the major reasons to migrate away from `arrow2` is that `DataType` has a huge overhead, especially when cloned.
We have a PR to fix it (jorgecarleitao/arrow2#1469), but it remains unmerged because `arrow2` is unmaintained.
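To illustrate why cloning a nested datatype can be costly, here is a minimal sketch. `ToyType`, `ToyField`, and `wide_struct` are hypothetical stand-ins, not the real `arrow2` definitions, and sharing through `Arc` is shown only as one common way to make clones cheap. It is not a description of what the PR actually changes.

```rust
use std::sync::Arc;

// Hypothetical, simplified stand-in for a nested datatype.
// This is NOT arrow2's actual `DataType` definition.
#[derive(Clone, Debug, PartialEq)]
enum ToyType {
    Int64,
    Utf8,
    Struct(Vec<ToyField>),
}

// Hypothetical field: an owned name plus a (possibly nested) type.
#[derive(Clone, Debug, PartialEq)]
struct ToyField {
    name: String,
    ty: ToyType,
}

// Build a struct type with `n` fields, alternating between two leaf types.
fn wide_struct(n: usize) -> ToyType {
    ToyType::Struct(
        (0..n)
            .map(|i| ToyField {
                name: format!("field_{i}"),
                ty: if i % 2 == 0 { ToyType::Int64 } else { ToyType::Utf8 },
            })
            .collect(),
    )
}

fn main() {
    let ty = wide_struct(1_000);

    // A derived `Clone` is a deep copy: it allocates a new `Vec`
    // and a new `String` for every field name.
    let deep = ty.clone();
    assert_eq!(deep, ty);

    // Wrapping the type in an `Arc` makes a clone a reference-count
    // increment; both handles point at the same allocation.
    let shared = Arc::new(ty);
    let cheap = Arc::clone(&shared);
    assert!(Arc::ptr_eq(&shared, &cheap));

    println!("strong count: {}", Arc::strong_count(&shared));
}
```

The deep clone scales with the number of fields and the length of their names, while the `Arc` clone costs the same regardless of how large the type is.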
So: we fork `arrow2` as `re_arrow2`, merge our PR, and solve our immediate memory issue.
Since `polars` requires `arrow2`, we need to stop using `polars`. We only use it in a few tests.
We should revisit migrating away from `arrow2` when we start exposing arrow things to users (e.g. supporting data queries in the SDK) and/or when we want to interface with some data frame crate.