Softmax Strategy
1. epsilon-greedy strategy 11111 2. UCB strategy 222 3. Softmax strategy 333 4. Gradient strategy 444 References
7月前510
1. epsilon-greedy strategy 11111 2. UCB strategy 222 3. Softmax strategy 333 4. Gradient strategy 444 References
0x0. 背景 随着年纪越来越大,读代码越来越困难,如果你发现看不懂同事写的代码应该怎么办呢?不要担心,大语言模型的时代了来了,
