Gait recognition is a biometric technology that identifies people at a distance from their walking posture and body shape. In information forensics and security, it is used for crime prevention, forensic identification, and social security. However, existing gait recognition methods usually treat appearance, posture, and temporal information separately. We therefore use a learned temporal attention mechanism to fuse these features in both a global and a local manner. In this article, we propose a novel gait recognition framework, Temporal Attention and Keypoint-guided Embedding (GaitTAKE), which effectively fuses temporal-attention-based global and local appearance features with temporally aggregated human pose features. Experimental results show that the proposed method achieves superior performance in gait recognition, with rank-1 accuracy of 98.0% (normal), 97.5% (bag), and 92.2% (coat) on the CASIA-B gait dataset, and accuracies of 90.4% on the OU-MVLP gait dataset, 51.34% on the GREW dataset, and 53.1% on the Gait3D gait dataset.
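For readers unfamiliar with the rank-1 metric quoted above, the following is a minimal sketch of how rank-1 accuracy is commonly computed in retrieval-style evaluation: each probe sequence is matched to its nearest gallery feature, and the match counts as correct when the identities agree. The function name `rank1_accuracy`, the Euclidean distance, and the toy two-dimensional features are assumptions made for illustration only; they are not the GaitTAKE embedding or the evaluation protocol of any of the datasets named here.

```python
import math


def rank1_accuracy(probe_feats, probe_ids, gallery_feats, gallery_ids):
    """Fraction of probes whose nearest gallery feature shares their identity.

    Hypothetical helper for illustration; Euclidean distance is an assumption.
    """
    if not probe_feats:
        raise ValueError("probe set is empty")
    hits = 0
    for feat, pid in zip(probe_feats, probe_ids):
        # Index of the gallery feature closest to this probe.
        nearest = min(
            range(len(gallery_feats)),
            key=lambda j: math.dist(feat, gallery_feats[j]),
        )
        # A probe is a rank-1 hit when the top match has the same identity.
        hits += gallery_ids[nearest] == pid
    return hits / len(probe_feats)


# Toy data (made up): three gallery subjects, four probe sequences.
gallery_feats = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]
gallery_ids = ["A", "B", "C"]
probe_feats = [[0.1, -0.1], [0.9, 1.2], [2.2, 0.4], [3.8, 0.1]]
probe_ids = ["A", "B", "C", "C"]

accuracy = rank1_accuracy(probe_feats, probe_ids, gallery_feats, gallery_ids)
print(f"rank-1 accuracy: {accuracy:.2f}")
```

In this toy example the third probe lies closer to subject B than to its true identity C, so three of the four probes are matched correctly.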