I'm sure the first step is taking a burst of short exposures, which stay sharp (as opposed to a single longer exposure, which blurs), then doing some tensor magic to align and merge them into a single image.
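Object detection alone won't give you sharp text in low light. You need a minimum number of photons hitting each pixel, because photon shot noise puts a floor on the signal-to-noise ratio no matter how clever the software is.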
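To make the photon argument concrete, here is a rough simulation sketch in Python with NumPy. It is not how any particular camera pipeline works. It assumes the frames are already perfectly aligned, the scene is a flat gray patch, and the only noise is Poisson shot noise. The names `simulate_burst`, `stack`, and `snr` are made up for this illustration.

```python
import numpy as np

def simulate_burst(scene, n_frames, rng):
    """Simulate n_frames short exposures of `scene`.

    `scene` holds the expected photon count per pixel per frame;
    each observed count is drawn from a Poisson distribution (shot noise).
    """
    return rng.poisson(scene, size=(n_frames,) + scene.shape).astype(np.float64)

def stack(frames):
    """Merge pre-aligned frames by averaging them pixel by pixel."""
    return frames.mean(axis=0)

def snr(image, scene):
    """Mean signal divided by the standard deviation of the error."""
    return scene.mean() / (image - scene).std()

def demo():
    rng = np.random.default_rng(0)
    # Assumed low-light scene: about 4 photons per pixel per short exposure.
    scene = np.full((64, 64), 4.0)
    frames = simulate_burst(scene, n_frames=16, rng=rng)
    print("single frame SNR:", snr(frames[0], scene))
    print("16-frame stack SNR:", snr(stack(frames), scene))

if __name__ == "__main__":
    demo()
```

With shot noise only, SNR grows roughly as the square root of the total photons collected, so averaging 16 frames should give about 4x the SNR of one frame. Real bursts also need frame registration before merging, and read noise per frame eats into the gain, which this sketch ignores.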