Autonomous close-proximity operations (including hovering and landing) in the low-gravity environment exhibited by asteroids are particularly challenging. Current approaches to this problem require knowledge of the environmental dynamics in the asteroid's vicinity. Acquiring this knowledge is costly in both time and money. This paper uses reinforcement learning (RL) to develop a novel non-linear hovering controller with sufficient robustness to allow precision hovering in unknown environments, limited only by the maximum thrust requirements imposed by the environment. We demonstrate the robustness of the controller by simulating precision hovering in multiple environments that were unknown during the policy optimization. The environments are modeled using non-uniform rotation and a non-uniform gravity field. Simulations were also run using a shape model of the asteroid Itokawa. Performance is compared to that of an RL-derived optimal linear PD controller and an LQR controller. Since the hovering controller requires an estimate of the spacecraft's state relative to a landmark on the asteroid's surface, we also introduce an optical-seeker-based navigation approach that accurately estimates the spacecraft's current state using only a single camera and a laser range finder.