Cool game though. I am still puzzled by how the probabilities are arrived at. Random?
Probabilities are mostly randomized during board generation but skewed in a way to make gameplay feel a bit better. There's a cap on the likelihood of the neutral event, and a bias towards the good event rather than a bad one.
To be specific, you could probably even leave the roll time as-is, to give you that suspense, but the time it takes to move the die to the center, flash it, and flash the result, is too long and gets irritating.
Or to turn this into a different game: the d20 stops fast by default, but extra click to cheat and keep it rolling if you feel that it's about to stop on an unfavorable face.
There are cycles in the board state graph, although they are of a very specific form (the only kind of cycle that exists is for board B with O and X alternating turns). So it is probably possible to make a completely deterministic and optimal algorithm for this probabilistic game, but it does sound complicated. You can't naively apply expectiminimax.
However after marking the winning board states as 0 or 1 respectively if O or X wins I would expect value iteration to very quickly converge to an optimal strategy here.
IDK if it's just me, but I went 6-0. Is something wrong with the computer player logic?
W(S, A) = P(good) * (1 - W(S_good)) + P(bad) * (1 - W(S_bad)) + (1 - P(good) - P(bad)) * (1 - W(S))
And obvisouly:
W(S) = max(W(S, A), foreach A in Actions)
max() is troublesome, but we can replace it with a >= sign:
W(S) >= (W(S, A), forall A in Actions)
And then we can expand W(S, A) and move W(S) in each of the inequalities to the left hand side. After all that we will have a bunch of linear inequalities which can be optimized with linear programming. I think the objective would just be:
<del>maximize</del> minimize W(empty_board)
Since chance is involved, you will basically never want to do anything but the greediest highest value next action. Sometimes more than half the board has net value of 0 or less which makes them very easy to ignore.
Since passing is not an option, you can't ignore a net value of 0 or less, because all options might have a net value of 0 or less.
I also managed to get the die stuck on a side with an edge pointing up, to where the game couldn't choose a face. I thought it was going to brick the game, but it detected this and re-rolled the die. Nice!
> Well, in any given game of Probabilistic Tic-Tac-Toe you can do everything right and still lose (or do everything wrong and win.) However, the better player always rises to the top over time.
> Bad breaks are inevitable, but good judgment is always rewarded (eventually, and given enough chances.)
This assumes that everyone is on a level playing field with only non-compounding randomness preventing the better player from winning. But as you point out, luck does compound over time:
>The parents we’re born to, societal power structures... so many past events have an invisible impact on each new action we take
This is commonly known as the rich get richer and the poor get poorer, and to economists as the Matthew Effect[1].
You could try to model this in the game by having wins skew the odds of the next game in your favor. It's harder to model in a simple two person game like this... You have to persist state for a population of players over time.
I've wanted to publish alternate rules for Monopoly, where at the start of the game players don't get the same amount of cash. Cash is instead distributed according to real statistics for "birth wealth". Alternatively, your cash at the end of a game roles over into the next game.
I'd love to discuss this with you if you are interested. We might even collaborate on a future project.
---
Old HN thread: https://news.ycombinator.com/item?id=12932183
Quick feature request: the die-roll is really cool, but can you make a lower-latency version so I can play more games in less time?
UI suggestion: show the probabilities for a move as a point in a triangle, with your outcome labels on the vertices. (Or maybe as red/green/neutral colors in the triangle's interior.) This representation is called the "probability simplex". It would look less busy, quicker to scan, I think.
The do nothing move is a nice touch
I think the AI should be optimized to not make plays that look obviously bad. It doesn't really need to be any harder, but it kinda ruins it when it makes a play that seems really obviously bad to me.
Also does it simply never play the center? It seems center is never an outlier probability but also feels like the AI should play it sometime. (edit: After 20 or so games it finally did. Maybe I was just overvaluing it? Although I'm winning about 80%.)
These suggestions are all about the feel of playing the AI rather than difficulty.
Link to HN submission: https://news.ycombinator.com/item?id=40653736
Any recommendations for creating simple 3d visualizations of orbiting spheres? Something like the one from the link, but more web-native (instead of python)?: https://trinket.io/glowscript/a90cba5f40
Also reminds me of how I was playing Senet last night. I controlled the game until the very end, where by chance, I kept rolling "bad" numbers and my opponent kept rolling "good" numbers.
Reminded me a bit of the Quantum Tic Tac Toe[1] game that I stumbled over some time ago.
For example, O should choose the lower right because it gives them a greater than 50% chance of winning, whereas choosing another spot gives them a greater than 50% chance of loosing
X X O
X _ O
_ X _
The dice roll animation is :chefkiss:
Got the same in chrome but it eventually loaded
Another interesting variant is incomplete information Tic Tac Toe which was posted by SMBC: https://www.smbc-comics.com/comic/incomplete
After thinking about it last night, I made a quick version this morning, and I think it's fun to play as well: https://eapl.me/incomplete/
See the following for a really nice tutorial for a slightly more advanced but more technically correct algorithm, Monte Carlo graph search (MCGS). This exploits the fact that some nodes in the game tree might are identical positions on the board and can be merged.
For your setup he could easily do either one, but the graph search might give you more mileage in the future:
github.com/lightvector/KataGo/blob/master/docs/GraphSearch.md
Once your scores have converged on the entire game tree, you can print out a crib sheet visually showing each position and the correct move. That might be the closest we can get to a human executable strategy. But the crib sheet might have strategic principles or hard rules that humans can identify
I found that once you naively search the game tree beyond a depth of 8, it more or less converges on a particular choice. The presence of a neutral outcome (i.e. neither player claims the selected square) means the tree depth is technically infinite, but feasible to search thoroughly once the first few moves have been played.
Can't you just multiply all the percentages and just get an expected value for each field? Why the random walk? Can't you just calculate this exhaustively?
Of course, you can use a number of algorithms to calculate the value, and if you beat me to the punch and it's correct, how about this I'll buy you a burger. But which specific algorithm are you proposing, and where is its pseudocode and correctness proof?
And is it simpler? If so I'll implement that instead of the random walk!
I picked the random walk algorithm because it is much easier to implement than any other game tree evaluation algorithm I know.