
0 points · yavorgiv · 2y ago · 0 comments

I am the creator of the repo.

It depends on the machine, the number of threads selected and the model checkpoint used (ViT-B, ViT-L or ViT-H). The attached video demo is running on an Apple M2 Ultra with the ViT-B model. There, generating the image embedding takes ~1.9s and each subsequent mask segmentation takes ~45ms.

However, I am now focusing on improving the inference speed by making better use of ggml and trying out quantization. Once I make some progress in this direction I will compare to other SAM alternatives and benchmark more thoroughly.
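To put the two figures quoted above in perspective, here is a rough back-of-the-envelope sketch. It assumes the image embedding is computed once per image and every mask afterwards costs a fixed ~45ms; the function names and the fixed-cost model are illustrative assumptions, not part of the repo or a benchmark.

```python
# Rough timing model built only from the two numbers quoted in this comment
# (ViT-B on an Apple M2 Ultra). The constant-cost-per-mask model is an assumption.
EMBEDDING_S = 1.9   # one-time image embedding cost, in seconds
MASK_S = 0.045      # cost of each subsequent mask segmentation, in seconds


def total_seconds(num_masks: int) -> float:
    """Estimated time to embed one image and then segment num_masks masks."""
    if num_masks < 0:
        raise ValueError("num_masks must be non-negative")
    return EMBEDDING_S + num_masks * MASK_S


def average_ms_per_mask(num_masks: int) -> float:
    """Average cost per mask, in milliseconds, with the embedding amortized."""
    if num_masks <= 0:
        raise ValueError("num_masks must be positive")
    return 1000.0 * total_seconds(num_masks) / num_masks


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, round(total_seconds(n), 3), round(average_ms_per_mask(n), 1))
```

Under this model the embedding dominates for a handful of clicks and is amortized as more masks are requested on the same image.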


billrobertson42 · 2y ago

This is amazing. Thank you!

