This technical report describes our method submitted to Task 4 of the DCASE 2018 challenge, which evaluates systems for the detection of sound events in domestic environments using large-scale weakly labeled data. In particular, an architecture based on the convolutional recurrent neural network (CRNN) framework is used to detect the timestamps of all events in given audio clips, where the training audio files carry only clip-level labels. To take advantage of the large-scale unlabeled in-domain training data, a deep residual network model (ResNeXt) is first employed to predict weak labels for the unlabeled data. In addition, the mixup technique is applied during model training, which is expected to provide data augmentation and improve the model's generalization capability. Finally, the system achieves a class-wise average F1 score of 22.05% for sound event detection on the provided test set.
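For reference, mixup forms each training example as a convex combination of two samples and their (clip-level) labels, with the mixing weight drawn from a Beta distribution. The following is a minimal sketch of the general technique, not the report's exact training code; the alpha value and array shapes are illustrative assumptions.

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2):
    """Blend two training examples and their label vectors with a Beta-sampled weight.

    alpha=0.2 is a commonly used default, not the report's setting (assumption).
    """
    lam = np.random.beta(alpha, alpha)          # mixing coefficient in (0, 1)
    x = lam * x1 + (1.0 - lam) * x2             # mixed input features
    y = lam * y1 + (1.0 - lam) * y2             # mixed (soft) clip-level labels
    return x, y

# Illustrative usage with toy spectrogram patches and one-hot weak labels
x1, x2 = np.zeros((4, 3)), np.ones((4, 3))
y1, y2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
x_mix, y_mix = mixup(x1, y1, x2, y2)
```

Because the mixed labels are soft, the model is trained on interpolated targets rather than hard one-hot vectors, which tends to smooth the decision boundary.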