Embodied cognition theory holds that students' thinking in a learning environment is embodied in physical activity. In line with this, recent research has shown that signal-level handwriting dynamics can distinguish levels of learning performance. Although machine learning has been used to detect how multiple modalities correlate with specific learning processes, deep learning has received little attention. We therefore build a Group Work Performance Prediction system that analyses 3D handwriting signals (2D writing coordinates plus stroke frequency) of students in a smart English classroom, using regression models based on deep convolutional neural networks (CNNs). The students worked together in groups, and their spoken language performance is used to label their proficiency level. A 3D handwriting dataset (3D-Writing-DB) was collected through a collaboration platform known as ‘creative digital space’: we extracted the 3D handwriting signal from a table tablet during English discussion sessions, and professional English teachers afterwards annotated the English speech with scores from 0 to 5. Our experimental results indicate that group work performance can be predicted from physical handwriting features using deep learning; our best model achieves a root mean square error (RMSE) of 0.32.
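
To make the RMSE figure reported above easier to interpret on the 0 to 5 annotation scale, the following minimal sketch shows how RMSE is computed between teacher-assigned scores and model predictions. The function name `rmse` and the score values are hypothetical and chosen for illustration only; they are not taken from 3D-Writing-DB or from the authors' implementation.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length score sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    # Square each prediction error, average the squares, then take the root.
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical values for illustration only (scale 0 to 5).
teacher_scores = [3.0, 4.0, 2.0, 5.0]
predicted_scores = [3.5, 3.5, 2.5, 4.5]

print(rmse(teacher_scores, predicted_scores))  # 0.5
```

In this toy case every prediction is off by half a point, so the RMSE is 0.5. Read the same way, an RMSE of 0.32 corresponds to a typical prediction error of roughly a third of a point on the six-level scale.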