In this article, we consider a network with a hybrid access point (HAP) and radio frequency (RF)-energy harvesting wireless devices. The HAP is responsible for charging these devices and receiving their data. Our problem is to select a set of devices to transmit in each time slot so as to maximize a given reward over a planning horizon. In contrast to prior works, we consider the challenging case whereby the HAP has only imperfect channel state information (CSI) and no information about the battery state of devices. We also consider nonlinear RF-energy conversion rates and battery leakage. We propose a cross-entropy (CE) approach to identify the best set of devices to select in each time slot over random channel gains. In addition, we propose a fast Gibbs sampling approach, called Gibbs+, that incorporates a novel step to evict noncompetitive devices. We compare our solutions against random pick, round robin, original Gibbs sampling, and perfect information selection (PIS). Our results show that CE