I have a set of PDFs that are bank statements. The formats differ by bank, but there is a limited number of them (<15). What is the current best approach for extracting tabular data from PDFs? I tried writing custom logic based on pdfplumber and similar libraries, but it is very fragile and full of ad-hoc rules, so maintenance is costly. Are there small models that can run, preferably on CPUs alone, and that I could fine-tune for this task? Any guides or pointers? I see a lot of available models, but as someone with no ML background, I find them difficult to navigate.
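To make the maintenance problem concrete, here is a minimal sketch of the kind of per-bank pdfplumber logic I mean. It is an illustration, not my actual code: the bank names, column positions, and date formats in `LAYOUTS` are made up, and `BankLayout`, `parse_rows`, and `extract_rows` are hypothetical helpers. Only `pdfplumber.open`, `pdf.pages`, and `page.extract_tables()` are real pdfplumber calls.

```python
from dataclasses import dataclass
from datetime import datetime
from decimal import Decimal


@dataclass(frozen=True)
class BankLayout:
    """Hypothetical description of where each field sits in a table row."""
    date_col: int
    desc_col: int
    amount_col: int
    date_format: str


# Made-up layouts for illustration; real statements will differ.
LAYOUTS = {
    "bank_a": BankLayout(date_col=0, desc_col=1, amount_col=3, date_format="%d/%m/%Y"),
    "bank_b": BankLayout(date_col=1, desc_col=2, amount_col=4, date_format="%Y-%m-%d"),
}


def extract_rows(pdf_path):
    """Collect every table row from every page of a PDF using pdfplumber."""
    import pdfplumber  # third-party; imported here so parse_rows works without it

    rows = []
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            for table in page.extract_tables():
                rows.extend(table)
    return rows


def parse_rows(rows, layout):
    """Turn raw rows (lists of cell strings or None) into transaction dicts.

    Rows that are too short, or whose date or amount cell does not parse
    (headers, totals, blank lines), are skipped.
    """
    needed = max(layout.date_col, layout.desc_col, layout.amount_col)
    transactions = []
    for row in rows:
        if row is None or len(row) <= needed:
            continue
        try:
            date_text = (row[layout.date_col] or "").strip()
            date = datetime.strptime(date_text, layout.date_format).date()
            amount_text = (row[layout.amount_col] or "").replace(",", "").strip()
            amount = Decimal(amount_text)
        except (ValueError, ArithmeticError):
            # Not a transaction row for this layout.
            continue
        transactions.append({
            "date": date,
            "description": (row[layout.desc_col] or "").strip(),
            "amount": amount,
        })
    return transactions
```

Usage would be something like `parse_rows(extract_rows("statement.pdf"), LAYOUTS["bank_a"])`. Even in this tidy form, every bank needs its own entry, and any layout change (a new column, a multi-line description, a different thousands separator) means another special case.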
I manage a Windows server machine for a small business. I do it part time for my brother's business. I've enabled the usual protections suggested by Microsoft, such as regular scans, backups, and write protection on certain folders with a whitelist. Are there FOSS tools I can use to manage, monitor, and secure a Windows machine? His office also has 7-8 Windows laptops connecting to this "server", which is just a powerful AMD desktop-class machine with good storage and memory. How can I manage and secure this small fleet as well?