In this paper, we propose a new cascade structure framework for age estimation that reduces the influence of intrinsic factors (i.e., demographic information) on the estimate. As shown in Figure 1, facial appearance varies across different populations. Therefore, we explore guidance information to guide the learning of the proposed cascade frameworks. Here, we examine five cascade structure frameworks: Gender2AgeNet, Race2AgeNet, Age2AgeNet, GenderRace2AgeNet, and RaceGender2AgeNet, as shown in Figure 2. To evaluate the proposed frameworks, we use two networks: 1) a popular deep network (VGG-16) and 2) a shallow network (see Figure 3) proposed in this paper.
The main contribution of this article is summarized below.
The rest of this article is organized as follows. Related works are discussed in Section II. The proposed algorithm is introduced in Section III. Extensive experiments are then presented in Section IV to evaluate our method and compare it with state-of-the-art methods. Section V presents a discussion of the proposed method. Finally, conclusions are presented in Section VI.
Estimating human age from facial images has been studied for over 20 years. In general, these works comprise two steps: 1) feature extraction (extracting features from facial images) and 2) regression or classification (predicting age from the extracted features).
To extract features, some works used geometric features to classify age into three groups (i.e., child, young, or adult). Popular geometric features included chin drop, skin wrinkles, nose drop, and mustache.
Although geometric features can distinguish children from adults, they cannot separate adults from the elderly. Later, the most representative feature, the biologically inspired feature (BIF), was proposed by Guo et al. and has been widely used in many works to estimate age. They used smaller Gabor filters and suggested that the number of bands and orientations be determined in a problem-specific way.
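To make the notion of a Gabor filter bank with several bands and orientations concrete, the following is a minimal sketch using only the Python standard library. It is not the BIF implementation of Guo et al.: the function names `gabor_kernel` and `gabor_bank`, the aspect ratio `gamma=0.3`, and the rules tying wavelength and envelope width to the filter size are illustrative assumptions.

```python
import math


def gabor_kernel(size, wavelength, theta, sigma, gamma=0.3):
    """Real-valued Gabor kernel as a size x size list of rows (size must be odd)."""
    if size % 2 == 0:
        raise ValueError("size must be odd")
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the coordinates by the filter orientation theta.
            x_r = x * math.cos(theta) + y * math.sin(theta)
            y_r = -x * math.sin(theta) + y * math.cos(theta)
            # Gaussian envelope multiplied by a cosine carrier.
            envelope = math.exp(-(x_r ** 2 + (gamma * y_r) ** 2) / (2 * sigma ** 2))
            row.append(envelope * math.cos(2 * math.pi * x_r / wavelength))
        kernel.append(row)
    return kernel


def gabor_bank(sizes, n_orientations):
    """One kernel per (filter size, orientation index) pair.

    Assumption: wavelength = size / 2 and sigma = 0.4 * size are placeholder
    choices for illustration, not values taken from the paper.
    """
    bank = {}
    for size in sizes:
        for k in range(n_orientations):
            theta = k * math.pi / n_orientations  # orientations evenly spaced in [0, pi)
            bank[(size, k)] = gabor_kernel(
                size, wavelength=size / 2, theta=theta, sigma=0.4 * size
            )
    return bank
```

Under these assumptions, `gabor_bank((5, 7), 4)` yields eight kernels; choosing the sizes and the number of orientations per problem corresponds to the tuning suggested above.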
However, whereas the BIF is carefully designed by hand, we explore integrating automatic feature extraction with regression (or classification) in the proposed cascade structure frameworks.
The next step is to estimate the age. Age estimation is usually treated as a classification or regression problem. Kwon and Lobo classified facial images into age groups. However, their experimental dataset contained only 47 images, and the accuracy for the child group was below 68%. Another work uses five classifiers to predict the age group by majority vote, reaching a final accuracy of 74% over three groups: 1) 0-15, 2) 15-30, and 3) above 30.
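The majority-vote rule over several age-group classifiers can be sketched as follows. This is only an illustration of the voting step: the helper name `majority_vote`, the label strings, and the tie-breaking behaviour are assumptions, and the five classifiers themselves are not modelled.

```python
from collections import Counter

# Labels for the three age groups named in the text (string form is an assumption).
AGE_GROUPS = ("0-15", "15-30", "above 30")


def majority_vote(predictions):
    """Return the label predicted most often.

    Ties are resolved in favour of the label that appears first in
    `predictions`; this tie-breaking rule is an assumption of the sketch.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    unknown = set(predictions) - set(AGE_GROUPS)
    if unknown:
        raise ValueError(f"unknown age group labels: {sorted(unknown)}")
    # most_common keeps first-encountered order among equal counts.
    return Counter(predictions).most_common(1)[0][0]
```

For example, if five hypothetical classifiers output `["0-15", "15-30", "15-30", "above 30", "15-30"]`, the combined decision is `"15-30"`.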