Csaba's book is the most up-to-date RL book I know of; Sutton and Barto is quite old by now. On the POMDP side I don't know of any recent books, but
http://www.cs.mcgill.ca/~jpineau/files/sross-jair08.pdf is a recent enough survey.
A related book is "Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems", which you can find at http://www.princeton.edu/~sbubeck/index.html
The bandit problem is very closely related to the reinforcement learning problem, so you'll get some mileage out of studying bandits. Be aware that this area is very maths-heavy, which is good or bad depending on your background. If you like this stuff, also check out "Prediction, Learning, and Games", which deals more with the "adversarial" setup.
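
To make the bandit/RL connection a bit more concrete: a stochastic bandit can be viewed as an RL problem with a single state, so all that is left is the exploration/exploitation trade-off. Below is a minimal sketch, assuming Bernoulli rewards and a plain epsilon-greedy strategy. The strategy is my own pick for illustration rather than something taken from the references above, and the names `run_epsilon_greedy`, `regret` and `arm_means` are made up for this example.

```python
import random


def run_epsilon_greedy(arm_means, steps=5000, epsilon=0.1, seed=0):
    """Play a Bernoulli bandit with epsilon-greedy.

    arm_means holds the (hidden) success probability of each arm.
    Returns (estimates, pulls, total_reward).
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    estimates = [0.0] * n_arms  # running mean reward per arm
    pulls = [0] * n_arms        # number of times each arm was played
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the best current estimate.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Sample a 0/1 reward from the chosen arm.
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0

        # Incremental update of the running mean for that arm.
        pulls[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]
        total_reward += reward

    return estimates, pulls, total_reward


def regret(arm_means, total_reward, steps):
    """Regret relative to the expected reward of always playing the best arm."""
    return max(arm_means) * steps - total_reward


if __name__ == "__main__":
    means = [0.2, 0.5, 0.8]
    est, pulls, total = run_epsilon_greedy(means)
    print("estimates:", est)
    print("pulls:", pulls)
    print("regret:", regret(means, total, 5000))
```

The quantity returned by `regret` is the kind of thing the bandit literature tries to bound; with a fixed epsilon it grows linearly in the number of steps, which is one reason the books spend their time on smarter strategies.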