is not readily available, but has to be acquired for a cost, and there is a per-sample budget.
Inspired by real-world use-cases, we analyze average and hard variations of a directly
specified budget. We postulate the problem in its explicit formulation and then convert it into
an equivalent MDP, that can be solved with deep reinforcement learning. Also, we evaluate
a real-world inspired setting with sparse training datasets with missing features. The …