@perl4ever is describing something much more explicit: stress caused directly by the exam setup making it very clear that race is being considered when grading.
Those are wildly different.
My presumption about stereotype threat is that it isn’t “real”. These studies find statistical evidence based on “priming”, which was super hot for a decade or two before it was discovered to be complete nonsense (it was a significant contributor to the start of the “replication crisis” in psych+soc).
E.g., if your study finds statistical significance based on assuming priming works, but we know priming doesn’t, then your study, your statistics, or both are bad.