3D-LiDAR-based cooperative perception has attracted significant interest for its ability to address occlusion, sparse point clouds, and out-of-range objects, all of which limit single-vehicle perception. Its performance can nevertheless still suffer from these same issues when Connected Automated Vehicles (CAVs) operate at the edges of their sensing range. We propose HYDRO-3D, an approach that improves object detection performance by explicitly incorporating historical object tracking information. Specifically, HYDRO-3D combines object detection features from a state-of-the-art object detection algorithm (V2X-ViT) with historical information from an object tracking algorithm to infer objects. A novel spatial-temporal 3D neural network, which performs global and local manipulations of the historical object-tracking data, is then applied to generate a feature map that enhances object detection. HYDRO-3D is comprehensively evaluated on the state-of-the-art V2XSet dataset. Qualitative and quantitative experimental results demonstrate that HYDRO-3D effectively utilizes object tracking information and achieves robust object detection performance. It outperforms the state-of-the-art V2X-ViT by 3.7% in AP@0.7 for CAV object detection and also generalizes to single-vehicle object detection, with a 4.5% improvement in AP@0.7.