In this paper, we propose a sampling-based policy iteration for optimal planning under temporal logic constraints. The method integrates approximate optimal control, importance sampling, and formal methods. For a subclass of linear temporal logic, the planning problem is transformed to an optimal control problem for a hybrid system where discrete transitions are triggered by linear time events in temporal logic. Instead of solving the Hamilton-Jacobi-Bellman equation, we use policy function approximation to reduce the problem into a search of an optimal weight vector that parametrizes the near-optimal policy for given bases. Then, we incorporate Model Reference Adaptive Search - an importance sampling-based optimization algorithm to perform a sample-efficient search within the parameter space of policy function approximations. Facing the discontinuity in cost function introduced by temporal logic constraints and system dynamics, we introduce (1) a rank function in formal logic specifications to enable sample-efficient search; (2) specification-guided basis selection. Under mild technical assumptions, the proposed algorithm converges, with probability one, to a global approximate optimal policy that ensures the satisfaction of temporal logic constraints. The correctness and efficiency of the method are demonstrated through numerical experiments including temporal logic planning for a linear system and a nonlinear mobile robot.