
Improve documentation about what models can be used #11

@do-me

Description


Hi, thank you so much for open-sourcing this repo. It's amazing and super useful for quickly getting some insight into your data.
I created this Foursquare POI app for Italy with 3M points (GitHub) and it really helps me understand the data better! It's operating at the limit of GitHub Pages' free hosting with a 93 MB data file :D

I was just trying to figure out what kinds of models can be used in embedding-atlas, since the API docs only mention the somewhat dated all-MiniLM-L6-v2. I haven't dug into the code much yet, but at first glance it seems that only the API version (and not the frontend) supports inference, right?

So it would be great if you could:

  • add some docs about which models work and which backend is used for inference in the API version (e.g. to understand whether it is already optimized for MPS; see the sketch after this list for what I mean)
  • add support for frontend inference, e.g. with transformers.js, which already supports WebGPU. Just note that batch size is absolutely crucial for speed; have a look at this demo I created to test batching. For reference, indexing the whole Bible with a batch size of 128 takes about 35 seconds for 95605 chunks (M3 Max):
Batch size: 128
Chunks: 95605
Time passed: 35336.00 ms
Embeddings per second: 2705.60

Batch sizes of 64 or 256, by contrast, roughly double the time needed on my device.
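
To illustrate the first point, here is a minimal sketch of what picking a different model and targeting MPS could look like on the backend, assuming the API version wraps sentence-transformers under the hood (I haven't verified that; the model name, device and batch size below are just examples):

    from sentence_transformers import SentenceTransformer

    # Assumption: the API version wraps a SentenceTransformers model, so any model
    # from that ecosystem could in principle be swapped in. "mps" targets Apple
    # Silicon GPUs; use "cuda" or "cpu" as appropriate.
    model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2", device="mps")

    texts = ["first document", "second document"]  # placeholder data

    # As in the browser, batch size strongly affects throughput here as well.
    embeddings = model.encode(texts, batch_size=128, normalize_embeddings=True)
    print(embeddings.shape)  # (2, 384) for all-MiniLM-L6-v2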

Apart from these standard models, I also wanted to propose @MinishLab's static models (ideally via model2vec-rs), as they are much faster on CPU and even rank higher on MTEB :)
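
Just to show how little code that path needs, here is a sketch with the Python model2vec package (model2vec-rs exposes the same idea in Rust); the checkpoint name is one of MinishLab's published models and is only meant as an example:

    from model2vec import StaticModel

    # Static embeddings: essentially a token lookup plus pooling, with no
    # transformer forward pass, which is why these models are so fast on CPU.
    model = StaticModel.from_pretrained("minishlab/potion-base-8M")

    embeddings = model.encode(["first document", "second document"])
    print(embeddings.shape)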
