
0 points · ein0p · 1y ago · 0 comments

Sometimes TPS (tokens per second) doesn't matter. I've generated textual descriptions for 100K or so images in my photo archive, some of which I have absolutely no interest in uploading to someone else's computer. This works pretty well with Gemma. I use local LLMs all the time for things where privacy is even remotely important. I estimate this constitutes easily a quarter of my LLM usage.


7 comments · 2 top-level

lodovic · 1y ago · 4 in thread

This is a really cool idea. Do you pretrain the model so it can tag people? I have so many photos that it seems impossible to ever categorize them; using a workflow like yours might help a lot.

ein0p (OP) · 1y ago

No, tagging of people is already handled by another model. Gemma just describes what's in the image and produces a comma-separated list of keywords. No additional training is required besides a few tweaks to the prompt so that it outputs just the description, without any "fluff". E.g. it normally prepends such outputs with "Here's a description of the image:" unless you really insist that it should output only the description. I suppose I could use constrained decoding into JSON or something to achieve the same, but I didn't mess with that.
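For illustration only, here is a minimal sketch of a post-processing fallback for cases where the model still prepends a preamble despite the prompt. The function name `clean_output`, the preamble pattern, and the assumption that the last non-empty line holds the comma-separated keywords are all hypothetical; the thread does not say how the actual pipeline lays out its output.

```python
import re

# Hypothetical pattern for lead-ins such as "Here's a description of the image:".
_PREAMBLE = re.compile(r"^\s*here(?:['’]s| is)\b[^:\n]*:\s*", re.IGNORECASE)


def clean_output(raw: str) -> tuple[str, list[str]]:
    """Split raw model text into (description, keywords).

    Assumes, for illustration only, that the final non-empty line is the
    comma-separated keyword list and everything before it is the description.
    """
    # Drop a single leading preamble, if there is one.
    text = _PREAMBLE.sub("", raw.strip(), count=1)
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        return "", []
    if len(lines) == 1:
        # Only a description came back; no keyword line to parse.
        return lines[0], []
    description = " ".join(lines[:-1])
    keywords = [k.strip().lower() for k in lines[-1].split(",") if k.strip()]
    return description, keywords
```

Constrained decoding into JSON, as mentioned above, would make this kind of cleanup unnecessary; the sketch only covers the plain-text route.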

On some images where Gemma 3 struggles, Mistral Small produces better descriptions, BTW. But it seems harder to make it follow my instructions exactly.

I'm looking forward to the day when I can also do this with videos, a lot of which I also have no interest in uploading to someone else's computer.

mentalgear · 1y ago

Since you already seem to have done some impressive work on this for your personal use, would you mind open sourcing it?

ethersteeds · 1y ago

> No, tagging of people is already handled by another model.

As an aside, what model/tools do you prefer for tagging people?

fer · 1y ago

How do you use the keywords afterwards? I have Immich running, which does some analysis, but the querying is a bit hit and miss.


starik36 · 1y ago · 1 in thread

I was thinking of doing the same, but I would like to include people's names in the description. For example: "Jennifer looking out in the desert sky."

As it stands, Gemma will just say "Woman looking out in the desert sky."

ein0p (OP) · 1y ago

Most search rankers do not consider word order, so if you could also append the person's name at the end of the text description, it'd probably work well enough for retrieval and ranking at least.
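To make the word-order point concrete, here is a toy sketch of a bag-of-words match. The helper names (`tokens`, `with_names`, `score`) are made up for this example, and the overlap count is a deliberately crude stand-in for a real ranker, which would normally add term weighting on top.

```python
import re
from collections import Counter


def tokens(text: str) -> Counter:
    """Lower-cased bag of words: counts are kept, word order is discarded."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def with_names(description: str, names: list[str]) -> str:
    """Append known names (e.g. from a separate face-tagging step) to the text."""
    return description if not names else f"{description} {' '.join(names)}"


def score(query: str, document: str) -> int:
    """Toy overlap score: number of distinct query terms found in the document."""
    doc = tokens(document)
    return sum(1 for term in tokens(query) if term in doc)


plain = "Woman looking out in the desert sky."
tagged = with_names(plain, ["Jennifer"])

# The name matches once it is appended, wherever it sits in the text,
# and the order of the query terms makes no difference.
print(score("Jennifer desert", plain))   # 1
print(score("Jennifer desert", tagged))  # 2
print(score("desert Jennifer", tagged))  # 2
```

The description itself still reads "Woman looking out...", but for retrieval the appended name is enough for a query on "Jennifer" to hit.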

If you want natural language to resolve the names, that'd at a minimum require bounding boxes of the faces and their corresponding names. It'd also require either preprocessing, or specialized training, or both. To my knowledge, no locally hostable model can do that as of today. I don't know if any proprietary models can do this either, but it's certainly worth a try - they might just do it. The vast majority of the things they can do are emergent, meaning they were never specifically trained to do them.
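As a rough idea of what the preprocessing half could look like, the sketch below turns face bounding boxes and names into a text hint that could be added to a prompt. Everything here is an assumption for illustration: the `Face` type, the pixel box layout, the left/center/right split, and the hint wording. Whether any model would actually use such a hint reliably is untested.

```python
from dataclasses import dataclass


@dataclass
class Face:
    name: str
    # Bounding box in pixels: left, top, right, bottom (hypothetical layout).
    box: tuple[int, int, int, int]


def face_hints(faces: list[Face], image_width: int) -> str:
    """Describe who is where, left to right, as plain text for a prompt."""
    parts = []
    for face in sorted(faces, key=lambda f: f.box[0]):
        left, _, right, _ = face.box
        # Horizontal center of the face as a fraction of the image width.
        center = (left + right) / 2 / image_width
        side = "left" if center < 1 / 3 else "right" if center > 2 / 3 else "center"
        parts.append(f"{face.name} ({side})")
    return "People in the image: " + ", ".join(parts) if parts else ""


faces = [
    Face("Jennifer", (800, 100, 900, 220)),
    Face("Sam", (50, 120, 150, 240)),
]
print(face_hints(faces, image_width=1000))
# People in the image: Sam (left), Jennifer (right)
```

The boxes and names would have to come from a separate face-tagging step; this only covers turning them into text.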
