Group Relative Policy Optimization