I'm no expert, but there are approaches to reinforcement learning where you build a table of the expected payoff for each scenario, and these can recursively reference other entries in the table. Take a look at Q-learning, for example.
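To make the "table whose entries reference other entries" idea concrete, here is a minimal sketch of tabular Q-learning. The environment is a made-up five-state chain (not anything from this thread), and `step`, `train`, `ALPHA`, and `GAMMA` are names and values I picked for illustration only.

```python
import random

N_STATES = 5        # hypothetical toy chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)    # 0 = step left, 1 = step right
ALPHA, GAMMA = 0.5, 0.9   # learning rate and discount, chosen arbitrarily

def step(state, action):
    """Toy environment: reward 1 only when the last state is reached."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # q[state][action]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.choice(ACTIONS)        # purely random exploration
            nxt, reward, done = step(state, action)
            # The entry for (state, action) is updated toward a target built
            # from the best entry of the next state: the recursive reference.
            target = reward + (0.0 if done else GAMMA * max(q[nxt]))
            q[state][action] += ALPHA * (target - q[state][action])
            state = nxt
    return q

q_table = train()
# Greedy action per non-terminal state according to the learned table.
greedy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
```

In this toy setup the greedy policy read off the table should be "step right" in every non-terminal state, since that is the only way to reach the reward.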
Dynamic programming is one of the most widely used techniques in reinforcement learning. In fact, solving tic-tac-toe with dynamic programming is a classic demo in reinforcement learning classes and textbooks.
Dynamic programming solves a problem defined by a recurrence by caching the results of its subproblems, typically filling in a table from the bottom up.
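As a rough illustration of the bottom-up caching, here is a small sketch on a game simpler than tic-tac-toe. The game is my own stand-in, not something from this thread: players alternately take 1 to 3 stones, and whoever takes the last stone wins. The function name `winning_positions` and its parameters are hypothetical.

```python
def winning_positions(max_stones, max_take=3):
    """Return a table where win[n] is True if the player to move
    with n stones left can force a win."""
    # win[0] is False: no stones left means the player to move has already lost.
    win = [False] * (max_stones + 1)
    for n in range(1, max_stones + 1):
        # Each entry only looks at smaller, already-filled entries:
        # a position is winning if some move leads to a losing position.
        win[n] = any(not win[n - k] for k in range(1, min(max_take, n) + 1))
    return win

table = winning_positions(12)
losing = [n for n, w in enumerate(table) if not w]
```

Here `losing` comes out as the multiples of 4, which matches the well-known result for this game. The same table could be filled by a recursive function with a cache; the bottom-up loop just fixes the order in which the entries are computed.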
Apples and oranges