You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
**Background:**
The slow step of LSI is computing the SVD (singular value decomposition)
of a matrix. Even with a relatively small collection of documents (say,
about 20 blog posts), the native ruby implementation is too slow to be
usable (taking hours to complete).
To work around this problem, classifier-reborn allows you to optionally
use the `gsl` gem to make use of the [Gnu Scientific
Library](https://www.gnu.org/software/gsl/) when performing matrix
calculations. Computations with this gem perform orders of magnitude
faster than the ruby-only matrix implementation, and they're fast enough
that using LSI with Jekyll finishes in a reasonable amount of time
(seconds).
Unfortunately, [rb-gsl](https://github.com/SciRuby/rb-gsl) is
unmaintained -- there's a commit on main that makes it compatible with
Ruby 3, but nobody has released the gem so the only way to use rb-gsl
with Ruby 3 right now is to specify the git hash in your Gemfile. See
SciRuby/rb-gsl#67. This will be increasingly
problematic because Ruby 2.7 is now in [security
maintenance](https://www.ruby-lang.org/en/news/2022/04/12/ruby-2-7-6-released/)
and will become end of life in less than a year.
Notably, `rb-gsl` depends on the
[narray](https://github.com/masa16/narray#new-version-is-under-development---rubynumonarray)
gem. `narray` is deprecated, and the readme suggests using
`Numo::NArray` instead.
**Changes:**
In this PR, my goal is to provide an alternative matrix implementation
that can perform singular value decomposition quickly and works with
Ruby 3. Doing so will make classifier-reborn compatible with Ruby 3
without depending on the unmaintained/unreleased gsl gem. There aren't
many gems that provide fast matrix support for ruby, but
[Numo](https://github.com/ruby-numo) seems to be more actively
maintained than rb-gsl, and Numo has a working Ruby 3 implementation
that can perform a singular value decomposition, which is exactly what
we need. This requires
[numo-narray](https://github.com/ruby-numo/numo-narray) and
[numo-linalg](https://github.com/ruby-numo/numo-linalg).
My goal is to allow users to (optionally) use classifier-reborn with
Numo/Lapack the same way they'd use it with GSL. That is, the user
should install the `numo-narray` and `numo-linalg` gems (with their
required C libraries), and classifier-reborn will detect and use these
if they are found.
0 commit comments