This paper aims at developing a new feedback guidance algorithm for docking maneuvers in the cislunar environment. In particular, the goal is to create an algorithm that is lightweight, closed-loop, and capable of taking path constraints into account. The problem is addressed starting from the well-known Zero-Effort-Miss/Zero-Effort-Velocity (ZEM/ZEV) guidance, using machine learning to improve its capabilities and widen its field of application. The algorithm has been developed in the circular restricted three-body problem (CRTBP) framework for Near Rectilinear Orbits (NRO) in the Earth-Moon system, but the results can be readily generalized to many other guidance problems. The results are satisfactory and show that reinforcement learning can be effectively used to solve constrained relative spacecraft guidance problems.