1. Our evaluation of OpenAI's GPT-5.5 cyber capabilities (aisi.gov.uk) | 2 points by Cynddl | 1mo ago | 0 comments
2. Making AI chatbots friendly leads to mistakes and support of conspiracy theories (theguardian.com) | 93 points by Cynddl | 1mo ago | 80 comments
3. UK Biobank health data keeps ending up on GitHub (biobank.rocher.lc) | 197 points by Cynddl | 2mo ago | 57 comments
4. ChatGPT Edu feature reveals researchers' project metadata across universities (fastcompany.com) | 2 points by Cynddl | 3mo ago | 0 comments
5. AI no better than other methods for patients seeking medical advice, study shows (reuters.com) | 3 points by Cynddl | 4mo ago | 0 comments
6. AI chatbots pose 'dangerous' risk when giving medical advice, study suggests (bbc.co.uk) | 4 points by Cynddl | 4mo ago | 2 comments
7. Show HN: Small, anonymous app for teams to do retrospective sessions (retrospective.rocher.lc) | 1 point by Cynddl | 4mo ago | 0 comments
8. Measuring What Matters: Construct Validity in Large Language Model Benchmarks (arxiv.org) | 1 point by Cynddl | 7mo ago | 0 comments
9. AI Capabilities May Be Overhyped on Bogus Benchmarks, Study Finds (gizmodo.com) | 43 points by Cynddl | 7mo ago | 17 comments
10. AI's capabilities may be exaggerated by flawed tests, according to new study (nbcnews.com) | 3 points by Cynddl | 7mo ago | 0 comments
11. Experts find flaws in tests that check AI safety and effectiveness (theguardian.com) | 3 points by Cynddl | 7mo ago | 0 comments
12. Measuring What Matters: Construct Validity in Large Language Model Benchmarks (oxrml.com) | 3 points by Cynddl | 7mo ago | 2 comments
14. Facial recognition works better in the lab than on the street, researchers show (theregister.com) | 4 points by Cynddl | 10mo ago | 1 comment
15. We Shouldn't Trust Facial Recognition's Glowing Test Scores (techpolicy.press) | 2 points by Cynddl | 10mo ago | 0 comments