Author here. I was looking for a Docker-based server that exposes a simple endpoint for generating vector embeddings of documents. It had to handle long documents that exceed the 512-token limit of E5 models. To embed such documents effectively, they need to be chunked intelligently, ideally at sentence boundaries, and the chunk vectors then averaged into a single embedding. I couldn't find anything that met these criteria, so I built this setup myself.
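
To make the chunk-then-average idea concrete, here is a rough sketch in Python. It is not the project's actual code. The names `chunk_by_sentence`, `mean_vector`, and `embed_document` are hypothetical, the regex sentence splitter is deliberately naive, and `count_tokens` counts whitespace-separated words as a stand-in for the model's real tokenizer. The `embed` argument is assumed to be any function that maps a string to a list of floats.

```python
import re
from typing import Callable, List, Sequence

MAX_TOKENS = 512  # the E5 limit mentioned above


def count_tokens(text: str) -> int:
    # Assumption: whitespace word count as a crude proxy for the tokenizer.
    return len(text.split())


def chunk_by_sentence(text: str, max_tokens: int = MAX_TOKENS,
                      count: Callable[[str], int] = count_tokens) -> List[str]:
    """Greedily pack whole sentences into chunks of at most max_tokens."""
    # Naive split: break after '.', '!' or '?' followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: List[str] = []
    current: List[str] = []
    current_len = 0
    for sentence in sentences:
        n = count(sentence)
        # Start a new chunk if this sentence would overflow the current one.
        if current and current_len + n > max_tokens:
            chunks.append(" ".join(current))
            current, current_len = [], 0
        # Note: a single sentence longer than max_tokens is kept whole here;
        # a real implementation would have to split or truncate it.
        current.append(sentence)
        current_len += n
    if current:
        chunks.append(" ".join(current))
    return chunks


def mean_vector(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise arithmetic mean of equally sized vectors."""
    if not vectors:
        raise ValueError("no vectors to average")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def embed_document(text: str, embed: Callable[[str], Sequence[float]],
                   max_tokens: int = MAX_TOKENS) -> List[float]:
    """Embed each sentence-aligned chunk, then average the chunk vectors."""
    chunks = chunk_by_sentence(text, max_tokens)
    return mean_vector([embed(chunk) for chunk in chunks])
```

This uses an unweighted mean, so a short trailing chunk counts as much as a full one. Weighting by chunk length is a possible variation, but the sketch keeps to the plain mean described above.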