Predicting risk in software development is essential: risks must be recognized, categorized, and prioritized early for a project to succeed. Requirement gathering is the most important and challenging stage of the Software Development Life Cycle (SDLC). Risks should be addressed at this stage and recorded for use in future projects. Software requirement risks can be predicted at the requirement gathering stage using data-mining classification techniques. Predicting risks in new software requirements requires a dataset containing the attributes of software requirements and their associated risks. In this paper, a risk dataset is proposed that contains requirements from the Software Requirement Specifications (SRS) of different open source projects, with risk attributes drawn from the literature and IT experts. The research comprises three main phases: risk-oriented data collection, dataset validation by IT experts, and dataset validation and filtration