In recent years, video has become a dominant medium of communication and an integral part of everyday life, ranging from unstructured handheld-device recordings to highly structured broadcast news. Formalizing video semantics is therefore necessary so that users can filter for relevant information and retrieve what they need. This work proposes and designs a framework for dynamic intra-video indexing of human-focused video. The term "human-focused" reflects the view that the human is the most interesting and important feature in a video. The main aim of the study is accordingly to identify features that are relevant to humans and to index video by human gender and age. Specifically, the work covers the identification of human faces, the classification of videos with respect to age and gender, and the presentation of results from the experiments completed so far. An appropriate set of features should make it possible to identify similar videos given a description. The work also considers human and object interaction, developing algorithms that explore these interactions using the available information and that index the visual scene as a whole, while keeping the primary focus on human gender and age. Evaluation will be carried out by comparing machine-generated indexes with indexes produced by humans. The proposed research aims to contribute to the growing area of work that uses multiple modalities to extract key features within a video, and to propose a novel system for video indexing.