vixsomnis on Hacker News

I remember reading an article a year or so ago about (the NSA) identifying users based on how they write: vocabulary, spelling mistakes, grammar, dialect, and so on.

This is interesting to me because it is extremely difficult to change the vocabulary I use in writing and speaking. Being able to estimate the degree of similarity between two pieces of text would be useful.

The closest I can think of right now would be the proprietary algorithms used to check for plagiarism (for schools and universities, for instance).

Are there any publicly available algorithms for this? Where can I go to learn more? (Academic journals?) Am I just DDGing the wrong search terms?
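As a rough illustration of what "estimating similarity between two pieces of text" could look like, here is a minimal sketch that compares character trigram counts with cosine similarity. This is only one simple, commonly used baseline, not the method from the article mentioned above and not what plagiarism checkers necessarily do. The function names (`char_ngrams`, `cosine_similarity`, `text_similarity`) and the choice of n = 3 are assumptions made for the example.

```python
import math
from collections import Counter


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams after lowercasing and collapsing whitespace."""
    normalized = " ".join(text.lower().split())
    return Counter(normalized[i:i + n] for i in range(len(normalized) - n + 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def text_similarity(x: str, y: str, n: int = 3) -> float:
    """Hypothetical helper: similarity in [0, 1] between two texts via n-gram profiles."""
    return cosine_similarity(char_ngrams(x, n), char_ngrams(y, n))
```

Scores near 1 mean the two texts share most of their trigram profile and scores near 0 mean they share almost none. A score like this measures surface overlap only, so on its own it says little about who wrote a text.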

92 vixsomnis 11y ago 41

8

Bootstrapped Privacy Services: The One True Path (basically)

(cryptostorm.org)

1 vixsomnis 11y ago 0

vixsomnis

Recent submissions

Story of a File (2007)

Economics of investing in information security

Ted Turner (2004) – “My Beef with Big Media”

How the way you type can shatter anonymity–even on Tor

Computers Can Now Replicate Handwriting

A better command line tool for the todo.txt format

Ask HN: Algorithms for text fingerprinting?

Bootstrapped Privacy Services: The One True Path (basically)
