
float16 support in web runtimes #9758

@josephrocca

Description


Is your feature request related to a problem? Please describe.
It's quite common for models to use float16, and the web runtimes don't support that data type. It's possible to convert float16 values to float32, and this works fine for most models, but it results in .onnx files that are twice as large as they need to be.

I'm not sure how the tensor data types map to actual wasm/JS data types, but even if float16 were "emulated" in some sense, that would still be useful for reducing .onnx file sizes, even if it had no impact on performance. It would also make the web runtime compatible with more models so that they work out of the box, instead of requiring a tedious (for me, as a newcomer) process of converting float16 to float32.

I haven't tested it, but I think TF.js can do a sort of "rehydration" of float16 values to float32 after downloading (at least, it seems that way going by a pull request I came across, though that one is aimed at post-training quantization). If that is indeed the case, then perhaps their implementation choices would be a useful reference.
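For what it's worth, the kind of "rehydration" I mean could be sketched in plain JavaScript. This is just an illustration, not how TF.js or ONNX Runtime actually does it: it decodes IEEE 754 half-precision bit patterns into a Float32Array, so the file on the wire stays half the size and the expansion happens client-side after download.

```javascript
// Decode one IEEE 754 half-precision (float16) bit pattern into a float32 value.
function float16ToFloat32(bits) {
  const sign = (bits & 0x8000) ? -1 : 1;
  const exponent = (bits >> 10) & 0x1f; // 5 exponent bits
  const fraction = bits & 0x03ff;       // 10 fraction bits
  if (exponent === 0) {
    // Subnormal or zero: no implicit leading 1.
    return sign * fraction * Math.pow(2, -24);
  }
  if (exponent === 0x1f) {
    // All-ones exponent encodes Infinity (fraction 0) or NaN.
    return fraction ? NaN : sign * Infinity;
  }
  // Normal number: implicit leading 1, exponent bias of 15.
  return sign * (1 + fraction / 1024) * Math.pow(2, exponent - 15);
}

// "Rehydrate" a downloaded buffer of packed float16 values into a Float32Array.
function rehydrateFloat16(buffer) {
  const halfs = new Uint16Array(buffer);
  const out = new Float32Array(halfs.length);
  for (let i = 0; i < halfs.length; i++) {
    out[i] = float16ToFloat32(halfs[i]);
  }
  return out;
}
```

A real implementation would presumably use a lookup table or SIMD rather than a per-element loop, but it shows that the decoding itself is cheap and purely client-side.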

Here are the currently-supported data types for the wasm runtime:
[Screenshot from 2021-11-15: table of data types currently supported by the wasm runtime]

System information

Thanks!

Labels: platform:web (issues related to ONNX Runtime Web), stale (issues that have not been addressed in a while; categorized by a bot)