Enhanced Mobile Broadband (eMBB) and Ultra-Reliable Low-Latency Communications (URLLC) are two main service scenarios of 5G networks, and their service requirements differ markedly. When such heterogeneous services coexist in one network, it is important to find effective resource scheduling and allocation strategies. In this paper, we study the Quality of Service (QoS) optimization problem in the eMBB and URLLC coexistence scenario. To account for the heterogeneity and dynamics of the services, we first introduce a flexible numerology structure, which defines a set of flexible transmission time intervals (TTIs) to satisfy the different QoS requirements of heterogeneous services. We then formulate a Markov decision process (MDP)-based dynamic resource optimization problem with flexibility in both the time and frequency domains. Next, we prove that this optimization problem is NP-hard and propose DRSA, a joint scheduling strategy based on flexible numerology and deep reinforcement learning, to allocate resources dynamically. Experiments show that flexible numerology significantly outperforms non-flexible numerology. In comparison experiments, the average throughput of DRSA is 7.1%, 14.8% and 23.9% higher than that of the Sequence, Greedy and Random strategies, respectively, and DRSA reduces the loss rate of URLLC services by 43.7%, 28.6% and 53.8%, respectively.
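As an illustration of what a set of flexible TTIs can look like, the sketch below uses the standard 5G NR numerology scaling, in which numerology index mu gives a subcarrier spacing of 15 kHz x 2^mu and a slot duration of 1 ms / 2^mu. It assumes, purely for illustration, that one TTI equals one slot; the names `Numerology` and `candidate_numerologies`, and the latency-budget filter, are hypothetical and are not taken from the paper, whose TTI set may be defined differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Numerology:
    """One 5G NR numerology, identified by its index mu (hypothetical helper)."""
    mu: int

    @property
    def subcarrier_spacing_khz(self) -> int:
        # NR scaling rule: 15 kHz * 2^mu
        return 15 * 2 ** self.mu

    @property
    def slot_duration_ms(self) -> float:
        # A 1 ms subframe holds 2^mu slots, so each slot lasts 1 / 2^mu ms.
        # Assumption: one TTI is one slot.
        return 1.0 / 2 ** self.mu

    @property
    def resource_block_bandwidth_khz(self) -> int:
        # A resource block spans 12 subcarriers in frequency.
        return 12 * self.subcarrier_spacing_khz


def candidate_numerologies(latency_budget_ms, mus=(0, 1, 2, 3)):
    """Return the numerologies whose slot (TTI) fits within a latency budget."""
    return [Numerology(mu) for mu in mus
            if Numerology(mu).slot_duration_ms <= latency_budget_ms]


if __name__ == "__main__":
    for n in candidate_numerologies(latency_budget_ms=1.0):
        print(n.mu, n.subcarrier_spacing_khz, n.slot_duration_ms,
              n.resource_block_bandwidth_khz)
```

The trade-off is visible in the two derived quantities: a larger mu shortens the TTI, which suits latency-critical URLLC traffic, but widens each resource block in frequency, so fewer blocks fit in a fixed bandwidth. A scheduler that mixes numerologies therefore has to allocate jointly over time and frequency.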