Reconfigurable antennas (RAs) have emerged as a promising antenna technology that can adapt to channel variations and enhance wireless link capacity. To take full advantage of RAs, the optimal antenna state must be selected on the fly, yet the channel statistics are unknown a priori. Multi-armed bandit (MAB) algorithms have been adopted to cope with this challenge, but the main drawback of existing approaches is that their regret scales linearly with the number of candidate antenna states and that they converge slowly over time. In this paper, we propose a novel Hierarchical Thompson Sampling (HTS) algorithm. HTS divides the arms into multiple clusters; in each round it first uses Thompson Sampling (TS) to select a cluster and then uses TS again to select an individual arm inside that cluster. We then apply HTS to antenna state selection and propose a K-means-based antenna state clustering strategy that exploits the correlation between antenna radiation patterns. Simulation results using the radiation patterns of a real-world RA show that HTS substantially improves the convergence rate and achieves much lower expected regret than existing schemes, especially when the number of antenna states is large.
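
The two-level sampling step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes Bernoulli rewards with Beta(1, 1) priors at both the cluster and the arm level, and the function name `hts_bernoulli`, its parameters, and the example values are hypothetical choices made only for illustration. The clusters are supplied by hand here; in the paper's setting they would come from clustering the antenna radiation patterns.

```python
import random


def hts_bernoulli(true_means, clusters, horizon, seed=0):
    """Illustrative two-level Thompson Sampling (assumed Bernoulli rewards).

    true_means: per-arm success probabilities, unknown to the learner and
                used only to simulate rewards.
    clusters:   list of lists of arm indices (a partition of the arms).
    Returns a dict mapping each arm to the number of times it was pulled.
    """
    rng = random.Random(seed)
    # Assumed Beta(1, 1) priors, stored as [alpha, beta].
    cluster_post = [[1, 1] for _ in clusters]
    arm_post = {arm: [1, 1] for cluster in clusters for arm in cluster}
    pulls = {arm: 0 for arm in arm_post}

    for _ in range(horizon):
        # Level 1: draw one posterior sample per cluster, keep the largest.
        c = max(range(len(clusters)),
                key=lambda i: rng.betavariate(*cluster_post[i]))
        # Level 2: draw one posterior sample per arm inside that cluster.
        a = max(clusters[c],
                key=lambda arm: rng.betavariate(*arm_post[arm]))
        # Simulated Bernoulli reward for the chosen arm.
        reward = 1 if rng.random() < true_means[a] else 0
        # Update both the cluster-level and the arm-level posterior.
        for post in (cluster_post[c], arm_post[a]):
            post[0] += reward
            post[1] += 1 - reward
        pulls[a] += 1
    return pulls


if __name__ == "__main__":
    # Four arms in two clusters; arm 2 has the highest success probability.
    counts = hts_bernoulli([0.1, 0.1, 0.9, 0.1], [[0, 1], [2, 3]], horizon=2000)
    print(counts)
```

In each round only one sample per cluster plus one sample per arm in the chosen cluster is drawn, rather than one sample per arm overall, which is the structural difference from flat TS that the sketch is meant to show.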