Demo vid 1: https://youtu.be/F6k2PC-7WQw
Demo vid 2: https://youtu.be/-N2Qftl26Mw
Project repository: https://github.com/allisterb/OLAF
The raw data OLAF uses is already logged in disparate places in one form or another (for example, in your PC, ISP, or web server logs); OLAF just tries to analyze the available data in real time, in one place. No PII is ever intentionally logged, and it is up to the organization to link PC user accounts and identities to people.
Of course, privacy is a big concern here. Organizations using a tool like OLAF have to walk the line between protecting their users' privacy and adhering to the relevant laws on one side, and being able to quickly detect and investigate potentially serious threats to people's safety on the other. Guidelines like those from NISO (https://www.niso.org/publications/privacy-principles) should be followed as much as possible.
Libraries currently used:
Tesseract.net: https://github.com/tvncosine/tesseract.net
VADERSharp: https://github.com/codingupastorm/vadersharp
Accord.NET: https://github.com/Accord-net/framework/
Azure Cognitive Services Computer Vision API
Azure Cognitive Services Text Analytics API
Let me know what you guys think, and feel free to share any comments, suggestions, or criticism you may have.