Bug: unsigned uint8 misbehaves when building an index #595

@liquidcarbon

Describe the bug

Why do the index contents and the distance calculations become all zeroes?

Steps to reproduce

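import numpy as np
import pandas as pd
from usearch.index import Index

index = Index(ndim=3)
a = np.uint8([
    [0, 0, 1],
    [0, 1, 2],
    [1, 2, 3],
])
index.add([0, 1, 2], a)
# Print the vectors as stored in the index
for i in range(3):
    print(index[i])
# Search for the same vectors and tabulate the matches
pd.DataFrame([r for r in index.search(a, 4)])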

Expected behavior

If you do this with DuckDB:

import duckdb

df = pd.DataFrame({"idx": [0, 1, 2], "vec": [v for v in a]})
# Cross join to get the distance between every pair of vectors
duckdb.sql("""
SELECT a.idx, b.idx, LIST_DISTANCE(a.vec, b.vec)
FROM df a JOIN df b ON 1=1
""").df()

USearch version

2.17.7

Operating System

Amazon Linux

Hardware architecture

x86

Which interface are you using?

Python bindings

Labels: bug (Something isn't working), v3 (Breaking changes planned for v3)
