Bandit learning with implicit feedback