Every successive incarnation of dataflow scheduling uses a matching store and synchronization tokens for long-latency operations. You find them in Tomasulo's reservation stations at a very small scale (if you treat functional-unit latencies as unpredictable), in the MTA throughout the memory system, in the D-RISC core of my research group, and in quite a few other designs.
It's quite a common and recurrent idea, really. However, promises, dataflow tokens, I-structures and the like all share a common problem: when several completions arrive simultaneously, which of the newly ready tasks do you schedule first? This choice is highly non-obvious and has a tremendous impact on data locality.
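
To make the locality point concrete, here is a toy sketch in Python. Everything in it is an assumption made up for illustration: the task names, the tokens `t0`..`t3`, the one-line "cache", and the two policies. It is not how Tomasulo, the MTA or D-RISC actually pick the next task. It only shows that the same batch of simultaneous completions can cost a different number of misses depending on issue order.

```python
from collections import deque

# Hypothetical tasks: name -> (token it waits on, cache line it touches when run).
TASKS = {
    "a0": ("t0", "A"),
    "b0": ("t1", "B"),
    "a1": ("t2", "A"),
    "b1": ("t3", "B"),
}

def wake(completed):
    """Matching-store step: every task whose token has arrived becomes ready."""
    return deque(name for name, (tok, _) in TASKS.items() if tok in completed)

def fifo(ready, last_line):
    """Issue the oldest ready task, ignoring locality."""
    return ready[0]

def same_line_first(ready, last_line):
    """Prefer a task touching the most recently used line, else the oldest."""
    for name in ready:
        if TASKS[name][1] == last_line:
            return name
    return ready[0]

def run(policy, completed):
    """Drain the ready set under `policy`; count misses in a one-line cache."""
    ready = wake(completed)
    order, misses, last_line = [], 0, None
    while ready:
        name = policy(ready, last_line)
        ready.remove(name)
        line = TASKS[name][1]
        if line != last_line:
            misses += 1  # the task touches a line other than the cached one
        last_line = line
        order.append(name)
    return order, misses

batch = {"t0", "t1", "t2", "t3"}  # four completions arriving in the same cycle
print(run(fifo, batch))             # (['a0', 'b0', 'a1', 'b1'], 4)
print(run(same_line_first, batch))  # (['a0', 'a1', 'b0', 'b1'], 2)
```

In this toy the scheduler is told which line each task will touch. Real hardware usually does not know that ahead of time, which is a large part of why the choice is so hard.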