Player of Games (opens in new tab)

(arxiv.org)

364 pointsvatueil4y ago231 comments

231 comments

105 comments · 22 top-level

captn3m04y ago· 14 in thread

If you are interested in this, I maintain a list of boardgame-solving related research at https://github.com/captn3m0/boardgame-research, with sections for specific games.

This looks really interesting. It would be a good project to test this against a general card-playing framework to easily test it on a variety of imperfect-information games based on playing cards.

fho4y ago

I tried my hand once or twice at (re-)implementing board games [0], so that I could run some common "AI" algorithms on the game trees.

What tripped me up every time is that most board games have a lot of "if this happens, there is this specific rule that applies". Even relatively simple games (like Homeworlds) are pretty hard to nail down perfectly due to all the special cases.

Do you, or somebody else, have any recommendations on how to handle this?

[0] Dominion, Homeworlds and the battle part of Eclipse iirc.

anonymoushn4y ago

Dominion and Homeworlds are pretty complicated! Maybe you can start with a simpler game like Splendor.

In my 2-player Splendor rules engine, the following actions are possible:

1. Purchase a holding. (90 possible actions, one for each holding)

2. If you do not have 3 reserved cards, reserve a card and take a gold chip if possible. (93 possible actions, one for each holding and one for each deck of facedown cards)

3. If there are 4 chips of the same color in a pile, take 2 chips of that color. (5 possible actions)

4. Take 3 chips of different colors, or 2 chips of different colors if only 2 are available, or 1 chip if only 1 is available. (25 possible actions)

5. If after any action you have at least 11 chips, return 1 chip. (6 possible actions which are never legal at the same time as any other actions)

This still doesn't correctly implement the rules though. In the actual game, you'd be allowed to spend gold chips when you don't need to, which would make purchasing holdings contain extra decisions after you pick which holding to purchase about which chips you'd like to keep.

1 more reply

captn3m04y ago

+1 to boardgame.io. It provides very good abstractions for turns, phases, players, and partial information. I’ve implemented small games with a few hours of effort, and that includes a UI.

1 more reply

iwd4y ago

If you’re doing it for fun, one option is to start with a simplified version of the game. It’s faster to implement and faster to run. And you’ll get insights you can apply to the full game.

That’s what I did when I applied RL to Dominion, because the complexity of the game depends heavily on the cards you include! See part 3 of https://ianwdavis.com/dominion.html

LeifCarrotson4y ago

> What tripped me up every time is that most board games have a lot of "if this happens, there is this specific rule that applies". Even relatively simple games (like Homeworlds) are pretty hard to nail down perfectly due to all the special cases.

The key is to build a data-driven state machine, rather than writing logic with a bunch of 'if' statements.

2 more replies

nicolodavis4y ago

You could consider using a library like boardgame.io for this.

1 more reply

mathgladiator4y ago

I'd appreciate you checking out my language and providing feedback. An element that helps is building a stateful server and using streams where the people behave like servers:

http://www.adama-lang.org/

JoeDaDude4y ago

Thank you for posting! Maybe you can include the game of Arimaa [1]. Arimaa was designed to be hard(er) for computers and level the playing field for humans. Algorithms were developed eventually, though I have not kept up to know where that stands today.

[1]. https://en.wikipedia.org/wiki/Arimaa

captn3m04y ago

Arima has enough research that it’s covered in the Wikipedia section[0] as well as the Chess Programming Wiki[1], which is linked in the README. I’m specifically trying to collect research on contemporary games, which are not so easily available. Chess/Go and alike games are very covered already, however imperfect information games are much rarer for eg.

[0]: https://en.m.wikipedia.org/wiki/Computer_Arimaa

[1]: https://www.chessprogramming.org/Arimaa

mathgladiator4y ago

Thanks for this! I'm currently designing a language for complex board games like Battlestar Galactica: http://www.adama-lang.org/

Something that I found amazing was inverting the flow control such that the server asks players questions with a list of possible choices simplifies the agent design tremendously. As I'm looking to retire to work on this project, I can generate the agent code and then hand-craft an AI. However, some AIs are soooo hard to even conceptualize.

majani4y ago

Imperfect information games will always have a luck element that gives casual players an edge. That's basically the appeal of card games over board games.

ketzo4y ago

And why so many board games incorporate decks/hands of cards.

mathgladiator4y ago

Not just luck but deception as well which takes some games to new levels.

alper1114y ago

This looks very good, thanks.

sfkgtbor4y ago· 14 in thread

I really like seeing references to the Culture series when naming things:

https://en.m.wikipedia.org/wiki/The_Player_of_Games

CobrastanJorji4y ago

Allusions are fun and all, but I disagree. These are important problems that a lot of people have put their whole careers into researching. Silly names like these lack gravitas.

sjg17294y ago

Always sad to see these projects suffer from A Shortfall of Gravitas

3 more replies

moritonal4y ago

Sorry, to explain the joke. The ships name themselves, and when they pick jokey names they're often mocked by the humans (which are in every way essentially ants to the spaceships) for not having enough gravitas. So the ships start naming themeselves things like the "Death-ray 9000 super-killer deluxe", to essentially take the piss.

Funnily enough you can see the exact same effect in principal game-engineers or computer-hacking.

1 more reply

ZeroGravitas4y ago

Very little Gravitas Indeed.

1 more reply

0_gravitas4y ago

indeed

gremloni4y ago

If anything the caliber and lore of the series gives the project an incredible amount of gravitas. Plus the scheme is just plain beautiful in my opinion.

1 more reply

doctor_eval4y ago

I suppose it's better than "Use of Weapons".

OneTimePetes4y ago

Why not have a seat, take that chair over there.

1 more reply

dane-pgp4y ago

I think it is also a reference to "PogChamp", although it's disappointing that PoG apparently wasn't evaluated against the Arcade Learning Environment (ALE) corpus of Atari 2600 games.

abledon4y ago

much more refined to think a spam of "POG!" stands for Player of Games when reading twitch chat

Borrible4y ago

Banks should have named one of Culture's General System Vehicles 'Don't be Evil'.

https://theculture.fandom.com/wiki/List_of_spacecraft

hoseja4y ago

Kinda ironic since in the novel, a human player is better than the strong AI (albeit a little inexplicably).

pharmakom4y ago

No he is not, but AIs are not allowed in the competition the story centers around.

1 more reply

7thaccount4y ago

I thought the protagonist wasn't nearly as talented as the culture AIs (even the ones that are not all that powerful)?

2 more replies

sdenton44y ago· 12 in thread

This is clearly part of DeepMind's long-game plan to achieve world domination through board game mastery. Naming the new algorithm after the book is a real tip of their hand...

https://en.wikipedia.org/wiki/The_Player_of_Games

sillysaurusx4y ago

The abbreviation is PoG too. I bet that was totally on purpose. At least one person in Brain is a dota player, so you better believe they watch twitch.

Funny that most of the comments are about the name. What an excellent choice.

chrisweekly4y ago

PSA: The "Culture" novels by Iain M Banks are fantastic and can be read in any order. "Player of Games" was the 1st one I read and still probably my favorite.

bduerst4y ago

Player of Games is the second book, and the one I recommend people start The Culture series with.

The first book Consider Phlebas isn't bad, but it isn't as well developed as the rest of the series IMO.

1 more reply

bewaretheirs4y ago

I keep hearing recommendations for the Culture books so I tried reading it recently and it just didn't work for me -- I gave up on it halfway through, which is rare for me.

3 more replies

kmtrowbr4y ago

Yes! I love this one. It's my favorite too.

7thaccount4y ago

Pretty amazing book. I wish I could play a board game like that as well.

automatic61314y ago

I always imagine the board game as essentially being SM's Civilisation but really, really good in an indescribable way - with some card games inbetween.

1 more reply

stavros4y ago

I second this, it was excellent. I've only read a few Banks books, but this was my favorite.

1 more reply

WithinReason4y ago

"In 2015, two SpaceX autonomous spaceport drone ships—Just Read the Instructions and Of Course I Still Love You—were named after ships in the book, as a posthumous tribute to Banks by Elon Musk"

omnicognate4y ago

Shame they didn't go with Pure Big Mad Boat Man.

65104y ago

The end game is pinball and we are the balls.

zeristor4y ago

We are the pins.

tsbinz4y ago· 11 in thread

Comparing against Stockfish 8 in a paper released today and labeling it as "Stockfish" is bordering on being dishonest. The current stockfish version (14) would make AlphaZero look bad, so they don't include it ...

dontreact4y ago

The name of the game here is generality. For a really general agent, they are looking to have superhuman performance, not get state of the art on every individual task. Beating stockfish 8 convinces me that it would be superhuman at chess.

remram4y ago

They could still be honest that it's Stockfish 8, not the Stockfish everyone has. Your product having genuine value does not excuse lying about that value.

2 more replies

ShamelessC4y ago

The first mention says "Stockfish 8, level 20" in the paper. This isn't a blog post that you can skim, you need to read the whole thing before critiquing.

karpierz4y ago

That's actually the second mention, the first is when they introduce the games in section 4:

> Today, computer- playing programs remain consistently super-human, and one of the strongest and most widely-used programs is Stockfish.

They also go back to referring to it as Stockfish for the rest of the paper.

An analogous situation in my mind would be if AMD released a new CPU and benchmarked it against an Intel CPU, only mentioning once, somewhere in the middle of the paper, that it was a Pentium 4.

4 more replies

tsbinz4y ago

I obviously read it, otherwise I wouldn't have known which version they are using. They are banking on others, that do just skim the figures and tables, not noticing their usage of outdated baselines.

1 more reply

david_draco4y ago

Isn't the point comparing traditional heuristic techniques against DNN-learned techniques? I understand the latest Stockfish is etching quite close to AlphaZero techniques, but maybe I am wrong.

tsbinz4y ago

It does have the option to use a neural network (nnue) in its evaluation, but it is very different from what AlphaZero/Lc0 do. You can choose not to use it, so you still could have a "traditional" evaluation (which would still blow Stockfish 8 out of the water). Also, Stockfish 8 isn't the last version before they merged nnue ...

moondistance4y ago

The abstract clearly states that the best chess and Go bots are not beaten: "Player of Games reaches strong performance in chess and Go, beats the strongest openly available agent in heads-up no-limit Texas hold’em poker (Slumbot)..."

nixed4y ago

the same goes for slumbot in poker, its super old like 2013, the game is played completely different now and current bots would destroy it.

bluecalm4y ago

The problem with poker is that there is money to be made from having a strong AI so there is 0 incentive to release it. What's publicly available are solvers (which solve game abstractions similar to the full game but don't play themselves) and shitty bots.

scrozart4y ago

As a commenter above noted, this work is about generality, being able to play every game, and not being the best at every game.

2 more replies

BeenChilling4y ago· 8 in thread

I want to see deepmind make a bot to play team based first person shooters like csgo and rainbow6 siege, to stack up five of them against a team of professional players.

fho4y ago

Honestly that probably won't be too interesting as (a) one AI could perfectly control several agents (ie perfect coordination of global strategies) and (b) an AI has low to no reaction times and perfect aim (aimbots already have that) so I would expect that would quickly result in a slaughterfest.

arlort4y ago

What would be interesting would be 5 independent AIs (even just different instances of the same AI of course) using the same interface as human players, so the same controls and the same video output

I am pretty sure aimbots access the internals of the game rather than reading the video output to identify the silhouette of the enemy.

ausbah4y ago

IIRC multi-agent domains are in their own category specifically because a single agent posing as "multiple agents" usually can't solve such environments, you need multiple agents with varying degrees of dependence

gverrilla4y ago

Same applies to dota2, and it was very interesting what they did there. But yeah first they would need to simulate how human players react and aim, or it would be impossible to play against.

LudwigNagasena4y ago

(a) make them independent (b) add 100-200ms delay

arethuza4y ago

"...such consummate skill, such ability, such adaptability, such numbing ruthlessness, such a use of weapons when anything could become weapon..."

ausbah4y ago

that's what OpenAI did a couple yewrs ago with Dota 2

https://openai.com/five/

mensetmanusman4y ago

They probably won’t for publicity reasons.

mudlus4y ago· 6 in thread

Yawn, show me a computer that game make fun games

TaupeRanger4y ago

You're getting downvotes but honestly I agree. Who cares about board games? We should've moved on from this once we "solved" chess and Go. There are more important things and it's not remotely surprising that a computer can beat a human when there's a simple, abstract optimization problem to throw computing power at. Make it creative...now that's a challenge worthy of the top AI talent.

newswasboring4y ago

I agree. I have always wondered if I can feed GPT-3 a bunch of rule books and ask it to generate game rules.

kadoban4y ago

You haven't seen AlphaGo play Go then, it plays creatively as hell at points.

1 more reply

Buttons8404y ago

Solving the game comes before solving for fun. If we create an AI that can win, then we can hamper the AI in fun ways, or give it an altered objective function that maximizes the players fun.

mbrodersen4y ago

Yes indeed. AI research will only take a real step forward when it learns how to be creative instead of just very good at optimising simple formal systems like board games.

baq4y ago

if making games is a game...

fxtentacle4y ago· 5 in thread

This is a great result, but you can see that it's more of a theoretical case because of this: "converging to perfect play as available computation time and approximation capacity increases." That is true for pretty much all current deep reinforcement learning algorithms.

The practical question is: How much computation do you need to get useful results? Alpha Go Zero is impressive mathematics, but who is willing to spend $1mio daily for months to train it? IMPALA (another Google one) can learn almost all Atari games, but you need a head node with 256 TPU cores and 1000+ evaluation workers to replicate the timings from the paper.

sillysaurusx4y ago

You often don't need anywhere near the amount of compute in these papers to get similar performance.

Suppose you're a business that needs to play games. Most people seem to think that it's a matter of plugging in the settings from the paper, buying the same hardware, then clicking a button and waiting.

It's not. The specific settings matter a lot.

But my main point is that you'll get most of your performance pretty rapidly. The only reason to leave it running for so long is to get that last N%, which is nice for benchmarks but not for business.

DeepMind overspends. Actually, they don't; they're not paying anywhere close to the price of a 256 core TPU. (Many external companies aren't, either, and you can get a good deal by negotiating with the Cloud TPU team.)

But you don't need a 256 core TPU. Lots of times, these algorithms simply do not require the amount of compute that people throw at the problem.

On the other hand, you can also usually get access to that kind of compute. A 256 core TPU isn't beyond reach. I'm pretty sure I could create one right now. It's free, thanks to TFRC, and you yourself can apply (and be approved). I was. https://sites.research.google/trc/

It kills me that it's so hard to replicate these papers, which is most of the motivation for my comment here. Ultimately, you're right: "How much compute?" is a big unknown. But the lower bound is much lower than most people realize (and most researchers).

fxtentacle4y ago

My personal experience was the opposite. I'm currently trying different approaches for building a Bomberman AI for the Bomberland competition that was discussed here on HN a few weeks ago.

"IMPALA with 1 learner takes only around 10 hours to reach the same performance that A3C approaches after 7.5 days." says the paper, but I can run A3C on a cheap CPU-only server but to get that IMPALA timing, I need to spend a lot of money. But my biggest roadblock so far is that I need compute far exceeding what the papers claim.

The diagrams for IMPALA show good performance starting at 1e8 environment frames and excellent performance at 1e9 frames. By now, I'm at 2.5e9 frames and performance is still bad. In my opinion, the reason is that the sequence lengths for Bomberland are quite long. To clear a path, you place a bomb, wait 5 ticks for it to become detonatable, then detonate it, then wait 10 ticks for the fire to clear. With 7 possible actions per tick, the chance of randomly executing this 17 tick sequence becomes (1/7)^17 = 4e-15. If I calculate optimistically that all moves are valid, too, while we wait, then I can get up to (1/7)(5/7)^5(1/7)*(5/7)^10 = 1e-4. But that still means that at 1e8 env steps, I only have 1000 successful executions to learn from.

3 more replies

loxias4y ago

My thoughts, not being in the field, are parallel to the parent post. "It's nice and all that we're achieving better and better computer performance at things that used to require the human brain, but it seems we're doing so by building larger and larger computers."Not to detract from that achievement, I love large computers in their own right!

I'm a dabbler in Go, and "somewhere below professional" at the game of poker. I've followed the advances in the latter for more than a decade, eagerly reading every paper the CPRG publishes. They use a LOT of compute power!

I know from experience that "The specific settings matter a lot.". For several years, I made my living "implementing papers for hire". It's real work, no argument there. Sometimes the settings are the solution, and heck, sometimes the published algorithm is outright wrong, and you only discover so when trying to implement it.

But the second part of your point, that it's not simply achieving more performance by throwing more transistors at it, I don't have experience with, and I sorta don't believe you. :)

Your comment is quite well written, making me (irrationally?) predisposed to suspect you're correct on factual matters, or at least more of a domain expert than I. Can you cite sources, or simply elaborate more?

1 more reply

gwern4y ago

> That is true for pretty much all current deep reinforcement learning algorithms.

Is that true? I was unaware that PPO, SAC, DQN, Impala, MuZero/AlphaZero etc would all automatically Just Work™ for hidden information games. Straight MCTS-inspired algorithms seem like they'd fail for reasons discussed in the paper, and while PPO/Impala work reasonably well in DoTA2/SC2, it's not obvious they'd converge to perfect play.

fxtentacle4y ago

You can mathematically prove for a lot of different algorithms (including PPO, DQN, IMPALA) that given enough experience with the game world, they will eventually converge to the optimal policy. It's just that the "enough experience" part might be so large that it's practically useless.

If I remember correctly, the DeepMind x UCL RL Lecture Series proves the underlying Bellman equation in this video: https://www.youtube.com/watch?v=zSOMeug_i_M

As for "hidden information" games, I thought the trick was to concatenate the current state with all past states and treat that as the new state, thereby making it an MDP.

1 more reply

WilliamDampier4y ago· 4 in thread

so this is what Grimes latest song is about?

3234y ago

> All the lyrical evidence that Grimes’ new song ‘Player of Games’ is about ex Elon Musk

> Grimes seemingly makes multiple, thinly veiled references to Musk in the song

https://www.independent.co.uk/arts-entertainment/music/news/...

cwkoss4y ago

SpaceX's landing pad barges are also named after Culture series starships

junon4y ago

Yeah wtf, was my first thought. This is mind blowing if true.

junon4y ago

Actually she probably got the name from the sci-fi novel this is named after:

https://en.m.wikipedia.org/wiki/The_Player_of_Games

wly_cdgr4y ago· 3 in thread

The future is so depressing

wetpaws4y ago

Fun fact: The consensus between professional go and chess players is that all new AI systems (alphago, etc) have really revitalised the game and introduced incredible amount of new strategies and depth.

loxias4y ago

I wish alphago was more "democratized" -- that is to say, I have many questions and experiments I'd love to run on it (a friend of mine and I have frequently pondered Go played in various different topological spaces, and I'd love to see an AI's result, for example).

2 more replies

jart4y ago

Sad fact: Lee Sedol retired after AlphaGo defeated him.

2 more replies

bkartal4y ago· 2 in thread

Impressive work! Most authors, if not all, are from DeepMind Edmonton office.

cmauniada4y ago

I didn’t even know that they had an office in Edmonton...

bkartal4y ago

Edmonton is one of the best places for RL research & ecosystem, both DeepMind and University of Alberta are there.

1 more reply

SuoDuanDao4y ago· 1 in thread

I didn't even know about the book until I read the comments here, I thought it was a reference to the Grimes song. Funny coincidence the song and the engine would appear so close in time to one another.

Severian4y ago

The Grimes song is a reference to the book too. She also has Marain subtitles in her video for "Idoru", which is the language used in The Culture. Weird mix of two author's (Idoru being William Gibson) works to be sure.

ArtWomb4y ago· 1 in thread

This seems like a significant milestone in AI. I mean what can't an agent with mastery of "guided search, learning, and game-theoretic reasoning" accomplish?

ausbah4y ago

modeling every task as a game seems like a big hurdle, or even just getting a working "environment"

pixelpoet4y ago· 1 in thread

Anyone else surprised to see that Demis Hassabis didn't have a hand in this research? Given his background as a player of many games, and involvement in a lot of their research.

thomasahle4y ago

I'm more surprised David Silver isn't on it, since his background is in imperfect information games, with papers such as https://arxiv.org/abs/1603.01121 He did multiple poker papers before he was the main author of Alpha Zero.

skinner_4y ago· 1 in thread

It would be awesome to have two interacting communities: AI experts building open source general game playing engines, and gaming fans writing pluggable rule specifications and UIs for popular games.

A bit of googling shows that there is a General Game Playing AI community with their own Game Description Language. I never really encountered them before, and the DeepMind paper does not cite them, either.

dpflug4y ago

Last I looked, the GGP community is focused on perfect information games currently. I had the same thought, though.

hervature4y ago

I think this is a good step forward that generalizes an algorithm to play both perfect and imperfect information games. However, table 9 shows (I believe it shows, it is not the most intuitive form), that other AIs (Deepstack, ReBeL, and Supremus) eat its lunch at poker. It also performs worse than AlphaZero at perfect information games. So, while a nice generalizing framework, probably will not be what you use in practice.

cab4044y ago

SCP-like name for SCP-like neural network.

"SCP-29123 Player Of Games"

wiz21c4y ago

Couldn't resist :

https://www.youtube.com/watch?v=-1F7vaNP9w0

antonpuz4y ago

Anyone knows whether the agent is publicly available?

simonebrunozzi4y ago

Can this be realistically used by game companies to provide a much better AI experience for strategy games?

crhutchins4y ago

I'll try to look into a brighter light into this one.

RivieraKid4y ago

Wow, it can beat a good poker bot, that is impressive.

loxias4y ago

Psh, wake me when it can play Mao. ;)

j / k navigate · click thread line to collapse

231 comments

105 comments · 22 top-level

captn3m04y ago· 14 in thread

If you are interested in this, I maintain a list of boardgame-solving related research at https://github.com/captn3m0/boardgame-research, with sections for specific games.

This looks really interesting. It would be a good project to test this against a general card-playing framework to easily test it on a variety of imperfect-information games based on playing cards.

fho4y ago

I tried my hand once or twice at (re-)implementing board games [0], so that I could run some common "AI" algorithms on the game trees.

Do you, or somebody else, have any recommendations on how to handle this?

[0] Dominion, Homeworlds and the battle part of Eclipse iirc.

anonymoushn4y ago

Dominion and Homeworlds are pretty complicated! Maybe you can start with a simpler game like Splendor.

In my 2-player Splendor rules engine, the following actions are possible:

1. Purchase a holding. (90 possible actions, one for each holding)

2. If you do not have 3 reserved cards, reserve a card and take a gold chip if possible. (93 possible actions, one for each holding and one for each deck of facedown cards)

3. If there are 4 chips of the same color in a pile, take 2 chips of that color. (5 possible actions)

4. Take 3 chips of different colors, or 2 chips of different colors if only 2 are available, or 1 chip if only 1 is available. (25 possible actions)

5. If after any action you have at least 11 chips, return 1 chip. (6 possible actions which are never legal at the same time as any other actions)

1 more reply

captn3m04y ago

+1 to boardgame.io. It provides very good abstractions for turns, phases, players, and partial information. I’ve implemented small games with a few hours of effort, and that includes a UI.

1 more reply

iwd4y ago

If you’re doing it for fun, one option is to start with a simplified version of the game. It’s faster to implement and faster to run. And you’ll get insights you can apply to the full game.

That’s what I did when I applied RL to Dominion, because the complexity of the game depends heavily on the cards you include! See part 3 of https://ianwdavis.com/dominion.html

LeifCarrotson4y ago

The key is to build a data-driven state machine, rather than writing logic with a bunch of 'if' statements.

2 more replies

nicolodavis4y ago

You could consider using a library like boardgame.io for this.

1 more reply

mathgladiator4y ago

I'd appreciate you checking out my language and providing feedback. An element that helps is building a stateful server and using streams where the people behave like servers:

http://www.adama-lang.org/

JoeDaDude4y ago

[1]. https://en.wikipedia.org/wiki/Arimaa

captn3m04y ago

[0]: https://en.m.wikipedia.org/wiki/Computer_Arimaa

[1]: https://www.chessprogramming.org/Arimaa

mathgladiator4y ago

Thanks for this! I'm currently designing a language for complex board games like Battlestar Galactica: http://www.adama-lang.org/

majani4y ago

Imperfect information games will always have a luck element that gives casual players an edge. That's basically the appeal of card games over board games.

ketzo4y ago

And why so many board games incorporate decks/hands of cards.

mathgladiator4y ago

Not just luck but deception as well which takes some games to new levels.

alper1114y ago

This looks very good, thanks.

sfkgtbor4y ago· 14 in thread

I really like seeing references to the Culture series when naming things:

https://en.m.wikipedia.org/wiki/The_Player_of_Games

CobrastanJorji4y ago

Allusions are fun and all, but I disagree. These are important problems that a lot of people have put their whole careers into researching. Silly names like these lack gravitas.

sjg17294y ago

Always sad to see these projects suffer from A Shortfall of Gravitas

3 more replies

moritonal4y ago

Funnily enough you can see the exact same effect in principal game-engineers or computer-hacking.

1 more reply

ZeroGravitas4y ago

Very little Gravitas Indeed.

1 more reply

0_gravitas4y ago

indeed

gremloni4y ago

If anything the caliber and lore of the series gives the project an incredible amount of gravitas. Plus the scheme is just plain beautiful in my opinion.

1 more reply

doctor_eval4y ago

I suppose it's better than "Use of Weapons".

OneTimePetes4y ago

Why not have a seat, take that chair over there.

1 more reply

dane-pgp4y ago

I think it is also a reference to "PogChamp", although it's disappointing that PoG apparently wasn't evaluated against the Arcade Learning Environment (ALE) corpus of Atari 2600 games.

abledon4y ago

much more refined to think a spam of "POG!" stands for Player of Games when reading twitch chat

Borrible4y ago

Banks should have named one of Culture's General System Vehicles 'Don't be Evil'.

https://theculture.fandom.com/wiki/List_of_spacecraft

hoseja4y ago

Kinda ironic since in the novel, a human player is better than the strong AI (albeit a little inexplicably).

pharmakom4y ago

No he is not, but AIs are not allowed in the competition the story centers around.

1 more reply

7thaccount4y ago

I thought the protagonist wasn't nearly as talented as the culture AIs (even the ones that are not all that powerful)?

2 more replies

sdenton44y ago· 12 in thread

This is clearly part of DeepMind's long-game plan to achieve world domination through board game mastery. Naming the new algorithm after the book is a real tip of their hand...

https://en.wikipedia.org/wiki/The_Player_of_Games

sillysaurusx4y ago

The abbreviation is PoG too. I bet that was totally on purpose. At least one person in Brain is a dota player, so you better believe they watch twitch.

Funny that most of the comments are about the name. What an excellent choice.

chrisweekly4y ago

PSA: The "Culture" novels by Iain M Banks are fantastic and can be read in any order. "Player of Games" was the 1st one I read and still probably my favorite.

bduerst4y ago

Player of Games is the second book, and the one I recommend people start The Culture series with.

The first book Consider Phlebas isn't bad, but it isn't as well developed as the rest of the series IMO.

1 more reply

bewaretheirs4y ago

I keep hearing recommendations for the Culture books so I tried reading it recently and it just didn't work for me -- I gave up on it halfway through, which is rare for me.

3 more replies

kmtrowbr4y ago

Yes! I love this one. It's my favorite too.

7thaccount4y ago

Pretty amazing book. I wish I could play a board game like that as well.

automatic61314y ago

I always imagine the board game as essentially being SM's Civilisation but really, really good in an indescribable way - with some card games inbetween.

1 more reply

stavros4y ago

I second this, it was excellent. I've only read a few Banks books, but this was my favorite.

1 more reply

WithinReason4y ago

"In 2015, two SpaceX autonomous spaceport drone ships—Just Read the Instructions and Of Course I Still Love You—were named after ships in the book, as a posthumous tribute to Banks by Elon Musk"

omnicognate4y ago

Shame they didn't go with Pure Big Mad Boat Man.

65104y ago

The end game is pinball and we are the balls.

zeristor4y ago

We are the pins.

tsbinz4y ago· 11 in thread

dontreact4y ago

remram4y ago

They could still be honest that it's Stockfish 8, not the Stockfish everyone has. Your product having genuine value does not excuse lying about that value.

2 more replies

ShamelessC4y ago

The first mention says "Stockfish 8, level 20" in the paper. This isn't a blog post that you can skim, you need to read the whole thing before critiquing.

karpierz4y ago

That's actually the second mention, the first is when they introduce the games in section 4:

> Today, computer- playing programs remain consistently super-human, and one of the strongest and most widely-used programs is Stockfish.

They also go back to referring to it as Stockfish for the rest of the paper.

An analogous situation in my mind would be if AMD released a new CPU and benchmarked it against an Intel CPU, only mentioning once, somewhere in the middle of the paper, that it was a Pentium 4.

4 more replies

tsbinz4y ago

1 more reply

david_draco4y ago

Isn't the point comparing traditional heuristic techniques against DNN-learned techniques? I understand the latest Stockfish is etching quite close to AlphaZero techniques, but maybe I am wrong.

tsbinz4y ago

moondistance4y ago

nixed4y ago

the same goes for slumbot in poker, its super old like 2013, the game is played completely different now and current bots would destroy it.

bluecalm4y ago

scrozart4y ago

As a commenter above noted, this work is about generality, being able to play every game, and not being the best at every game.

2 more replies

BeenChilling4y ago· 8 in thread

I want to see deepmind make a bot to play team based first person shooters like csgo and rainbow6 siege, to stack up five of them against a team of professional players.

fho4y ago

arlort4y ago

What would be interesting would be 5 independent AIs (even just different instances of the same AI of course) using the same interface as human players, so the same controls and the same video output

I am pretty sure aimbots access the internals of the game rather than reading the video output to identify the silhouette of the enemy.

ausbah4y ago

gverrilla4y ago

Same applies to dota2, and it was very interesting what they did there. But yeah first they would need to simulate how human players react and aim, or it would be impossible to play against.

LudwigNagasena4y ago

(a) make them independent (b) add 100-200ms delay

arethuza4y ago

"...such consummate skill, such ability, such adaptability, such numbing ruthlessness, such a use of weapons when anything could become weapon..."

ausbah4y ago

that's what OpenAI did a couple yewrs ago with Dota 2

https://openai.com/five/

mensetmanusman4y ago

They probably won’t for publicity reasons.

mudlus4y ago· 6 in thread

Yawn, show me a computer that game make fun games

TaupeRanger4y ago

newswasboring4y ago

I agree. I have always wondered if I can feed GPT-3 a bunch of rule books and ask it to generate game rules.

kadoban4y ago

You haven't seen AlphaGo play Go then, it plays creatively as hell at points.

1 more reply

Buttons8404y ago

Solving the game comes before solving for fun. If we create an AI that can win, then we can hamper the AI in fun ways, or give it an altered objective function that maximizes the players fun.

mbrodersen4y ago

Yes indeed. AI research will only take a real step forward when it learns how to be creative instead of just very good at optimising simple formal systems like board games.

baq4y ago

if making games is a game...

fxtentacle4y ago· 5 in thread

sillysaurusx4y ago

You often don't need anywhere near the amount of compute in these papers to get similar performance.

It's not. The specific settings matter a lot.

But you don't need a 256 core TPU. Lots of times, these algorithms simply do not require the amount of compute that people throw at the problem.

fxtentacle4y ago

My personal experience was the opposite. I'm currently trying different approaches for building a Bomberman AI for the Bomberland competition that was discussed here on HN a few weeks ago.

3 more replies

loxias4y ago

But the second part of your point, that it's not simply achieving more performance by throwing more transistors at it, I don't have experience with, and I sorta don't believe you. :)

1 more reply

gwern4y ago

> That is true for pretty much all current deep reinforcement learning algorithms.

fxtentacle4y ago

If I remember correctly, the DeepMind x UCL RL Lecture Series proves the underlying Bellman equation in this video: https://www.youtube.com/watch?v=zSOMeug_i_M

As for "hidden information" games, I thought the trick was to concatenate the current state with all past states and treat that as the new state, thereby making it an MDP.

1 more reply

WilliamDampier4y ago· 4 in thread

so this is what Grimes latest song is about?

3234y ago

> All the lyrical evidence that Grimes’ new song ‘Player of Games’ is about ex Elon Musk

> Grimes seemingly makes multiple, thinly veiled references to Musk in the song

https://www.independent.co.uk/arts-entertainment/music/news/...

cwkoss4y ago

SpaceX's landing pad barges are also named after Culture series starships

junon4y ago

Yeah wtf, was my first thought. This is mind blowing if true.

junon4y ago

Actually she probably got the name from the sci-fi novel this is named after:

https://en.m.wikipedia.org/wiki/The_Player_of_Games

wly_cdgr4y ago· 3 in thread

The future is so depressing

wetpaws4y ago

loxias4y ago

2 more replies

jart4y ago

Sad fact: Lee Sedol retired after AlphaGo defeated him.

2 more replies

bkartal4y ago· 2 in thread

Impressive work! Most authors, if not all, are from DeepMind Edmonton office.

cmauniada4y ago

I didn’t even know that they had an office in Edmonton...

bkartal4y ago

Edmonton is one of the best places for RL research & ecosystem, both DeepMind and University of Alberta are there.

1 more reply

SuoDuanDao4y ago· 1 in thread

Severian4y ago

ArtWomb4y ago· 1 in thread

This seems like a significant milestone in AI. I mean what can't an agent with mastery of "guided search, learning, and game-theoretic reasoning" accomplish?

ausbah4y ago

modeling every task as a game seems like a big hurdle, or even just getting a working "environment"

pixelpoet4y ago· 1 in thread

Anyone else surprised to see that Demis Hassabis didn't have a hand in this research? Given his background as a player of many games, and involvement in a lot of their research.

thomasahle4y ago

skinner_4y ago· 1 in thread

It would be awesome to have two interacting communities: AI experts building open source general game playing engines, and gaming fans writing pluggable rule specifications and UIs for popular games.

dpflug4y ago

Last I looked, the GGP community is focused on perfect information games currently. I had the same thought, though.

hervature4y ago

cab4044y ago

SCP-like name for SCP-like neural network.

"SCP-29123 Player Of Games"

wiz21c4y ago

Couldn't resist :

https://www.youtube.com/watch?v=-1F7vaNP9w0

antonpuz4y ago

Anyone knows whether the agent is publicly available?

simonebrunozzi4y ago