Docking path prediction, Policy network, Protein complex structure prediction, Reinforcement learning