走进Kaggle的未知领域：性别和年龄推断算法解析

1、环境设置：

此环节将加载实现笔记本无缝功能的基本模块，包括NumPy、Pandas和TensorFlow等库。此外，它还建立了关键的环境常数，如图像尺寸和学习率，这对后续分析和模型训练至关重要。



# General

import os

import keras

import numpy as np

import pandas as pd

import tensorflow as tf

# Data

import plotly.express as px

import matplotlib.pyplot as plt

# Data Preprocessing

import tensorflow.data as tfds

from sklearn.model_selection import train_test_split

# Model

from keras.applications import VGG16

from keras.applications import Xception, InceptionV3

from keras.applications import ResNet50V2, ResNet152V2

from keras.applications import MobileNetV3Small, MobileNetV3Large

# Model training

from keras import Sequential

from keras.layers import Dense

from keras.layers import Dropout

from keras.layers import Flatten, GlobalAveragePooling2D

from keras.layers import InputLayer

# Model Callbacks

from keras.callbacks import EarlyStopping

from keras.callbacks import ModelCheckpoint

# Setting constants for reproducibility

np.random.seed(42)

tf.random.set_seed(42)

# Constants

BATCH_SIZE = 32

IMAGE_SIZE = 224

2、加载和处理UTKFace数据集

在Kaggle上发现的UTKFace数据集是一个全面的面部图像集合，专门用于年龄和性别识别任务。它包括2万多张人脸图像，并附有年龄、性别和种族的注释。这些图像是多样化的，代表了不同的种族、年龄和性别，使其成为机器学习和计算机视觉研究的广泛而有价值的资源。

UTKFace数据集中的每张图像都标有人的年龄，范围从0到116岁，以及他们的性别，分类为男性或女性。

此外，数据集还包括有关个人种族的信息，允许进行更细致的分析和应用。

研究人员和开发人员经常利用该数据集来训练和测试面部识别技术中年龄估计、性别分类和其他相关任务的算法。它的大尺寸和多样化的面部特征表示使其成为探索和开发计算机视觉领域模型的热门选择。

# Initialize the directory path

dir_path = "/kaggle/input/utkface-new/UTKFace/"

image_paths = os.listdir(dir_path)

# Initialize a Gender Mapping

gender_mapping = ["Male", "Female"]

# Choose and load an image randomly

rand_image_path = np.random.choice(image_paths)

rand_image = plt.imread(dir_path + rand_image_path)/255.

sample_age, sample_gender, *_ = rand_image_path.split("_")

print(f"Total number of images  : {len(image_paths)}")

print(f"Sample Image path       : {rand_image_path}")

print(f"Sample Age              : {sample_age}")

print(f"Sample Gender           : {gender_mapping[int(sample_gender)]}\n")

# Show the image

plt.figure(figsize = (5,5))

plt.title("Sample Image")

plt.imshow(rand_image)

plt.axis("off")

plt.show()

结果:

Total number of images : 23708

Sample Image path : 1_0_3_20161220222642427.jpg.chip.jpg

Sample Age : 1

Sample Gender : Male

处理23,708张图像需要使用诸如批处理或使用图像生成器之类的内存高效策略来避免压倒性的内存限制。此外，描绘年龄和性别的图像路径结构(第一部分表示年龄，第二部分表示性别(0表示男性，1表示女性))为后续分析和分类提供了重要信息。

仔细管理这些路径将有助于根据年龄和性别属性进行有针对性的数据处理。

# Initialize a male counter variable

male_count = 0

# Initialize variable to store all the ages.

ages = []

# Loop over the paths and check for male images.

for path in image_paths:

    path_split = path.split("_")

    if "0" == path_split[1]:

        male_count += 1

    ages.append(int(path_split[0]))

# Computee total female counts

female_count = len(image_paths) - male_count

# Visualizing The Class Imbalance

pie_chart = px.pie(

    names = gender_mapping,

    values = [male_count, female_count],

    hole = 0.4,

    title = "Gender Distribution (Donut Chart)",

    height = 500

)

pie_chart.show()

bar_graph = px.bar(

    y = gender_mapping,

    x = [male_count, female_count],

    title = "Gender Distribution (Bar Graph)",

    color = gender_mapping,

    height = 500

)

bar_graph.update_layout(

    yaxis_title = "Gender",

    xaxis_title = "Frequency Count"

)

bar_graph.show()

我们的数据集似乎显示出轻微的类别不平衡，男性图像的数量高于女性图像——大约52%的男性和48%的女性代表。虽然这种不平衡不足以显著影响模型的准确性，但值得注意的是，要全面了解数据集对男性图像的偏见。管理这种轻微的偏差可以提高模型的稳健性和预测的公平性，尽管它目前对准确性的影响可能有限。

# Histogram

fig = px.histogram(sorted(ages), title = "Age Distribution")

fig.update_layout(

    xaxis_title = "Age",

    yaxis_title = "Value Counts"

)

fig.show()

# Violin Plot

fig = px.violin(x = sorted(ages), title = "Age Distribution")

fig.update_layout(

    xaxis_title = "Age",

    yaxis_title = "Distribution"

)

fig.show()

# Box Plot

fig = px.box(x = sorted(ages), notched=True, title = "Age Distribution")

fig.update_layout(

    xaxis_title = "Age",

)

fig.show()

直方图分析在我们的数据集分布中显示了两个突出的峰值。第一个簇包含年龄从0到12岁左右的人，而第二个簇包含16到45岁左右的人。

有趣的是，这些峰值表现出不同的强度，后一个峰值明显更强。具体来说，我们在1岁左右观察到大约1123张图像的峰值，而在26岁时观察到大约2197张图像的峰值。

值得注意的是，虽然我们的数据集确实包含了一些80岁以上的老年人的图像，但这些事件构成了较小的峰值。总的来说，这种分布表明了对年轻群体的潜在偏见，强调了数据集强调的是描绘早期和成年期个体的图像，而不是老年。

从小提琴图和箱形图中获得的见解证实了我们从直方图中得出的初步观察结果。数据明显分为两个子集:一个是非常小的孩子的图像，证实了直方图在1岁左右的峰值，另一个是26到45岁之间的成年人的大量集中。小提琴的情节特别强调这种双峰分布。

在仔细检查箱形图后，向左侧的明显挤压与我们在直方图中观察到的相呼应。这种压缩源于较年长年龄范围内的数值分布拉长和相对稀疏。它重申了我们的理解，即数据集显示出对年轻人的轻微倾斜。

这种固有的偏见可能确实会影响我们模型的准确性，因为它倾向于更有效地从大量的年轻人口统计数据中学习和概括，而可能忽略了老年群体中普遍存在的模式或细微差别。解决这一问题对于确保我们的模式在不同年龄段的公平和全面的学习至关重要。

# SHuffling the Images

np.random.shuffle(image_paths)

# Split data into training, testing and validation set

train_images, test_images = train_test_split(

    image_paths,

    train_size = 0.9,

    test_size = 0.1

)

train_images, valid_images = train_test_split(

    image_paths,

    train_size = 0.9,

    test_size = 0.1

)

print(f"Training Size : {len(train_images)}")

print(f"Testing Size  : {len(test_images)}")

# Extract age and gender

train_ages = [int(path.split("_")[0]) for path in train_images]

train_genders = [int(path.split("_")[1]) for path in train_images]

valid_ages = [int(path.split("_")[0]) for path in valid_images]

valid_genders = [int(path.split("_")[1]) for path in valid_images]

test_ages = [int(path.split("_")[0]) for path in test_images]

test_genders = [int(path.split("_")[1]) for path in test_images]

Training Size : 21337

Testing Size  : 2371

由于我们有大量的图像，因此使用Tensorflow数据集进行高效处理会更好。

def show_image(image, show=False):

    """

    Displays the provided image without axis.

    Args:

    - image (array-like): The image data to be displayed.

    - show (bool): If True, displays the image immediately. Defaults to False.

    Returns:

    - None

    """

    plt.imshow(image)

    plt.axis("off")

    if show:

        plt.show()

def preprocess_age_data(image_path, age, gender, dir_path=dir_path, IMAGE_SIZE = IMAGE_SIZE):

    """

    Preprocesses an image for analysis by extracting age and gender from the image path,

    loading and decoding the image, resizing it to (IMAGE_SIZE,IMAGE_SIZE), normalizing pixel values,

    and returning the preprocessed image along with age and gender labels.

    Args:

    - image_path (str): The path to the image file.

    - dir_path (str): The directory path where the image is located. Defaults to `dir_path`.

    Returns:

    - tuple: A tuple containing the preprocessed image as a TensorFlow tensor,

             the age (int), and the gender (int) extracted from the image path.

    """

    # Load the Image

    image = tf.io.read_file(dir_path + image_path)

    image = tf.io.decode_jpeg(image)

    # Resize and Normalize the Image

    image = tf.image.resize(image, (IMAGE_SIZE, IMAGE_SIZE))

    image = image / 255.

    image = tf.cast(image, tf.float32)

    return image, age

def preprocess_gender_data(image_path, age, gender, dir_path=dir_path, IMAGE_SIZE = IMAGE_SIZE):

    """

    Preprocesses an image for analysis by extracting age and gender from the image path,

    loading and decoding the image, resizing it to (IMAGE_SIZE,IMAGE_SIZE), normalizing pixel values,

    and returning the preprocessed image along with age and gender labels.

    Args:

    - image_path (str): The path to the image file.

    - dir_path (str): The directory path where the image is located. Defaults to `dir_path`.

    Returns:

    - tuple: A tuple containing the preprocessed image as a TensorFlow tensor,

             the age (int), and the gender (int) extracted from the image path.

    """

    # Load the Image

    image = tf.io.read_file(dir_path + image_path)

    image = tf.io.decode_jpeg(image)

    # Resize and Normalize the Image

    image = tf.image.resize(image, (IMAGE_SIZE, IMAGE_SIZE))

    image = image / 255.

    image = tf.cast(image, tf.float32)

    return image, gender

# Obtain training, testing and validation datasets

train_ds = tfds.Dataset.from_tensor_slices((train_images, train_ages, train_genders)).shuffle(2000)

train_age_ds = train_ds.map(preprocess_age_data, num_parallel_calls=BATCH_SIZE).batch(BATCH_SIZE).prefetch(tfds.AUTOTUNE)

train_gender_ds = train_ds.map(preprocess_gender_data, num_parallel_calls=BATCH_SIZE).batch(BATCH_SIZE).prefetch(tfds.AUTOTUNE)

valid_ds = tfds.Dataset.from_tensor_slices((valid_images, valid_ages, valid_genders)).shuffle(2000)

valid_age_ds = valid_ds.map(preprocess_age_data, num_parallel_calls=BATCH_SIZE).batch(BATCH_SIZE).prefetch(tfds.AUTOTUNE)

valid_gender_ds = valid_ds.map(preprocess_gender_data, num_parallel_calls=BATCH_SIZE).batch(BATCH_SIZE).prefetch(tfds.AUTOTUNE)

test_ds  = tfds.Dataset.from_tensor_slices((test_images, test_ages, test_genders)).shuffle(500)

test_age_ds = test_ds.map(preprocess_age_data, num_parallel_calls=BATCH_SIZE).batch(BATCH_SIZE).prefetch(tfds.AUTOTUNE)

test_gender_ds = test_ds.map(preprocess_gender_data, num_parallel_calls=BATCH_SIZE).batch(BATCH_SIZE).prefetch(tfds.AUTOTUNE)

plt.figure(figsize=(15, 10))

for images, ages, genders in train_ds.batch(BATCH_SIZE).take(1):

    for index in range(len(images)):

        image = tf.io.read_file(dir_path + images[index])

        image = tf.io.decode_jpeg(image)

        plt.subplot(4, 8, index + 1)

        plt.imshow(image)

        plt.title(f"Age: {ages[index]}\nGender: {gender_mapping[genders[index]]}")

        plt.axis("off")

plt.tight_layout()

plt.show()

3、骨干架构比较

当我们的目标是为最终模型建立一个最优的架构时，我们正在探索各种候选模型来作为基础架构。这个过程包括评估多个主干，以全面衡量它们的性能。

通过进行比较分析，我们试图辨别每个骨干架构的优点和缺点。我们的目标是确定最合适的骨干体系结构，它将作为我们最终模型的健壮基础。随后，将采用微调对选定的骨干进行细化和定制，确保开发出优化有效的模型。

# Initializing all the Backbones

backbones = [

    (

        "VGG16",

        VGG16(

            input_shape = (IMAGE_SIZE, IMAGE_SIZE, 3),

            weights = "imagenet",

            include_top = False

        )

    ),

    (

        "ResNet50V2",

        ResNet50V2(

            input_shape = (IMAGE_SIZE, IMAGE_SIZE, 3),

            weights = "imagenet",

            include_top = False

        )

    ),

    (

        "ResNet152V2",

        ResNet152V2(

            input_shape = (IMAGE_SIZE, IMAGE_SIZE, 3),

            weights = "imagenet",

            include_top = False

        )

    ),

    (

        "Xception",

        Xception(

            input_shape = (IMAGE_SIZE, IMAGE_SIZE, 3),

            weights = "imagenet",

            include_top = False

        )

    ),

    (

        "InceptionV3",

        InceptionV3(

            input_shape = (IMAGE_SIZE, IMAGE_SIZE, 3),

            weights = "imagenet",

            include_top = False

        )

    ),

    (

        "MobileNetV3Small",

        MobileNetV3Small(

            input_shape = (IMAGE_SIZE, IMAGE_SIZE, 3),

            weights = "imagenet",

            include_top = False

        )

    ),

    (

        "MobileNetV3Large",

        MobileNetV3Large(

            input_shape = (IMAGE_SIZE, IMAGE_SIZE, 3),

            weights = "imagenet",

            include_top = False

        )

    ),

为了找到最好的骨干架构，我们首先需要训练一个相当小的网络进行几次迭代，然后我们将看到哪个骨干架构将更快更好地带来鲁棒性能。

请注意

在进行了几次实验之后，很明显，为年龄和性别预测建立一个单一的模型并不能产生最佳结果。尽管在预测年龄方面取得了显著的成绩，但性别预测的准确性并不理想。因此，更有效的方法是采用两种不同的模型来预测年龄和性别。

在这个修改后的策略中，模型的主干保持不变。然而，重点是在年龄预测的基础上确定表现最好的骨干。由于准确的年龄预测涉及到回归，这是一个更复杂的任务，因此预期在年龄预测中表现出色的骨干在性别预测中也会表现良好。

通过采用不同的年龄和性别预测模型，我们可以根据每个任务提出的独特挑战来定制架构和训练策略。这种专门的方法允许对每个模型进行独立的优化，从而提高整体性能。这种任务的解耦承认了年龄和性别预测的微妙本质，并允许对模型进行微调，以解决与每种预测类型相关的具体挑战。

BACKBONE_HISTORIES = {}

for (name, backbone) in backbones:

    print(f"Testing : {name}")

    # Freeze the Model weights

    backbone.trainable = False

    # Creating a base model

    model = keras.Sequential([

        InputLayer((IMAGE_SIZE, IMAGE_SIZE, 3), name = "InputLayer"),

        backbone,

        Dropout(0.2, name = "SlightDropout"),

        Flatten(name = "FlattenEmbeddings"),

        Dense(1, name = "Age")

    ])

    # Train the model for few iterations

    model.compile(

        loss = ["mae"],

        optimizer = "adam",

        weighted_metrics=[]

    )

    history = model.fit(

        train_age_ds,

        validation_data = valid_age_ds,

        epochs = 5,

        batch_size = BATCH_SIZE

    )

    BACKBONE_HISTORIES[name] = pd.DataFrame(history.history)

    cls()

    print("\n")

metric = "loss"

plt.figure(figsize=(15, 5))

for i, sub in enumerate(['Train', 'Val']):

    plt.subplot(1, 2, i+1)

    plt.title(f"{sub} {metric} Plot")

    for name, history in BACKBONE_HISTORIES.items():

        plt.plot(history[metric] if sub=="Train" else history[f"val_{metric}"], label = name)

        plt.xlabel("Epochs")

        plt.ylabel(metric.title())

        plt.legend()

        plt.grid()

plt.tight_layout()

plt.show()

在仔细检查年龄模型学习曲线后，可以明确排除某些骨干，特别是与MobileNet相关的骨干。他们一贯的糟糕表现使他们不适合做候选人。

VGG16是最有希望的主干，在训练和验证损失方面都表现出稳定的下降，收敛在平均绝对误差8.0左右。这种一致性表明了鲁棒性，使VGG16成为一个有利的选择。

虽然其他一些模型，如Inception和exception，产生较低的训练损失，但它们在训练和验证曲线之间的相当大的差距表明潜在的不稳定性。考虑到速度和健壮性之间的权衡，盗梦空间作为一个值得称赞的选择脱颖而出。

其平滑的增长轨迹和相对的可预测性使其成为一个合适的折衷方案，特别是与异常的更不稳定的性能相比时。这种细致入微的评估指导了选择过程，确保选择的主干与所需的模型特征保持一致。

BACKBONE_HISTORIES = {}

for (name, backbone) in backbones:

    print(f"Testing : {name}")

    # Freeze the Model weights

    backbone.trainable = False

    # Creating a base model

    model = keras.Sequential([

        InputLayer((IMAGE_SIZE, IMAGE_SIZE, 3), name = "InputLayer"),

        backbone,

        Dropout(0.2, name = "SlightDropout"),

        Flatten(name = "FlattenEmbeddings"),

        Dense(1, activation="sigmoid", name = "Age")

    ])

    # Train the model for few iterations

    model.compile(

        loss = ["binary_crossentropy"],

        optimizer = "adam",

        metrics = ['accuracy'],

        weighted_metrics=[]

    )

    history = model.fit(

        train_gender_ds,

        validation_data = valid_gender_ds,

        epochs = 5,

        batch_size = BATCH_SIZE

    )

    BACKBONE_HISTORIES[name] = pd.DataFrame(history.history)

    cls()

    print("\n")

metric = "accuracy"

plt.figure(figsize=(15, 5))

for i, sub in enumerate(['Train', 'Val']):

    plt.subplot(1, 2, i+1)

    plt.title(f"{sub} {metric} Plot")

    for name, history in BACKBONE_HISTORIES.items():

        plt.plot(history[metric] if sub=="Train" else history[f"val_{metric}"], label = name)

        plt.xlabel("Epochs")

        plt.ylabel(metric.title())

        plt.legend()

        plt.grid()

plt.tight_layout()

plt.show()

在评估性别预测准确性的模型时，一个值得注意的消除包括MobileNet模型，它一直表现出较差的性能。

深入研究训练和验证精度曲线揭示了一个杰出的表演者:ResNet152V2模型，由独特的绿色曲线表示。该模型达到了最高的训练精度和验证精度，将其定位为一个引人注目的选择。

然而，必须解决所有模型的训练和验证精度之间的巨大差异。这强调了在最终模型创建期间需要更复杂的基线模型，合并多个层以更好地处理这些差异。

尽管考虑到这一点，ResNet152V2仍然保持着突出的地位，拥有出色的训练和验证性能。值得注意的是，ResNet50V2在训练性能上与ResNet152V2非常接近，尽管在验证精度上有所欠缺。

4、VGG16 Age Network

经过深入分析，VGG16网络因其在最小化训练和验证损失方面的一致性能而成为最佳骨干网。这导致决定使用VGG16作为年龄模型的共享主干。

年龄模型侧重于准确预测年龄，性别模型侧重于精确预测性别。结合这两个模型，我们创建了一个统一的网络，用于同时预测年龄和性别，利用VGG16的优势进行有效的面部属性检测。

# Loading backbone

vgg_16 = VGG16(input_shape = (IMAGE_SIZE, IMAGE_SIZE, 3), include_top = False, weights = "imagenet")

# Freezing the backbone weights

vgg_16.trainable = False

# Creating a Age Network

age_net = Sequential([

    InputLayer((IMAGE_SIZE, IMAGE_SIZE, 3), name="ImageInput"),

    vgg_16,

    Dropout(0.4, name = "SlightDroput"),

    Flatten(name="FlattenEmbeddings"),

    Dense(256, activation="relu", kernel_initializer="he_normal"),

    Dense(1, name="AgeOutput")

], name="AgeNet")

# Compiling Model

age_net.compile(

    loss = "mae",

    optimizer = 'adam',

    weighted_metrics=[]

)

# Trining the Age Model

age_history = age_net.fit(

    train_age_ds,

    validation_data = valid_age_ds,

    epochs = 20,

    batch_size = BATCH_SIZE,

    callbacks = [

        EarlyStopping(

            patience = 5,

            monitor = "val_loss",

            restore_best_weights = True

        ),

        ModelCheckpoint(

            "Age-VGG16.keras",

            save_best_only=True

        )

    ]

)

# Converting history into data frame.

age_his_df = pd.DataFrame(age_history.history)

# Visualization

age_his_df.plot()

plt.title("Age Network Learning Curve")

plt.xlabel("Epochs")

plt.ylabel("Mean Abs. Error")

plt.grid()

plt.show()

值得注意的是，我们的年龄预测模型的性能超过了基线模型。该模型的平均绝对误差，最初徘徊在8左右，已经成功地减少到大约6.7，这表明了实质性的进展。

虽然可以通过参数调优和细致的架构调整来实现进一步的改进，但在我们的方法范围内，当前的损失代表了值得赞扬的结果。

此外，早期停止的结合证明有利于防止过拟合。在第12和第13个时代左右，训练损失和验证损失之间出现分歧的迹象，表明可能存在过拟合。

早期停止机制，加上模型检查点，通过恢复到性能最好的模型版本，有效地减轻了这种担忧，这明显发生在具有最小差值的第7 epoch附近。

5、ResNet152V2 性别网络

# Initializing the backbone layer

resnet = ResNet152V2(input_shape = (IMAGE_SIZE, IMAGE_SIZE, 3), include_top = False, weights="imagenet")

# Freeze the backbone weights

resnet.trainable = False

# Creating a gender model

gender_net = Sequential([

    InputLayer((IMAGE_SIZE, IMAGE_SIZE, 3), name="ImageInput"),

    resnet,

    Dropout(0.2, name = "SlightDroput"),

    GlobalAveragePooling2D(name="GlobalAvgPooling"),

    Dense(1, activation="sigmoid", name="gender")

], name="GenderNet")

gender_net.compile(

    loss = "binary_crossentropy",

    optimizer = 'adam',

    metrics = ['accuracy'],

    weighted_metrics=[]

)

# Training the Gender Model

gender_history = gender_net.fit(

    train_gender_ds,

    validation_data = valid_gender_ds,

    epochs = 20,

    batch_size = BATCH_SIZE,

    callbacks = [

        EarlyStopping(

            patience = 5,

            monitor = "val_accuracy",

            restore_best_weights = True

        ),

        ModelCheckpoint(

            "Gender-ResNet152.keras",

            save_best_only=True

        )

    ]

)

# Converting history into data frame.

gender_his_df = pd.DataFrame(gender_history.history)

# Visualization

plt.figure(figsize=(10, 5 ))

plt.subplot(1,2,1)

plt.suptitle("Gender Network Learning Curve")

plt.plot(gender_his_df['loss'], label="Loss")

plt.plot(gender_his_df['val_loss'], label="Val. Loss")

plt.xlabel("Epochs")

plt.ylabel("$Loss$")

plt.grid()

plt.legend()

plt.subplot(1,2,2)

plt.plot(gender_his_df['accuracy'], label="Accuracy")

plt.plot(gender_his_df['val_accuracy'], label="Val. Accuracy")

plt.xlabel("Epochs")

plt.ylabel("$Accuracy$")

plt.grid()

plt.legend()

plt.show()

已对性别网络的结构作出重大调整，以增强其稳健性。通过去除flatten层和Dense层，加入GlobalAveragePooling层，解决了模型的过拟合倾向。

初始配置显示超过90%的训练精度，但验证精度明显下降到85%左右，表明易受过拟合的影响。

修正后的模型虽然精度略低，但成功地保持了高度的鲁棒性。损失差异很小，验证损失约为0.30，训练损失约为0.26。这种边际差异反映在准确性上，其中训练准确性为88%，验证准确性为87%，仅产生1%的差异-就模型稳定性而言，这是一个值得称赞的成就。

6、模型评价

在这个关键时刻，我们完成了核心部分--模型训练和数据处理，达到了一个关键的里程碑。下一个关键步骤是评估模型在测试数据集上的性能。通过评估，我们可以深入了解模型在现实世界中的预期表现。测试阶段的结果是评估模型在未见数据上的泛化和预测能力的重要基准。

# Loading saved models

age_net = tf.keras.models.load_model("/kaggle/input/age-and-gender-prediction-vgg16-resnet152v2/Weights/Age-VGG16.keras", compile=True)

gender_net = tf.keras.models.load_model("/kaggle/input/age-and-gender-prediction-vgg16-resnet152v2/Weights/Gender-ResNet152.keras", compile=True)

# Evaluating test performance

age_net.evaluate(test_age_ds)

gender_net.evaluate(test_gender_ds)

在测试数据上对我们的最终模型进行的结论性评估显示，与我们只关注主干比较的初步版本相比，我们的性能有了显著提高。这表明我们在改进和优化模型方面的成功努力，导致整体性能的大幅提高。精细调整和参数调整的反复过程显然取得了成效，表明该模型在预测年龄和性别方面的有效性和可靠性有所提高。

7、模型预测

可视化模型在测试样本中的表现为我们的评估增添了一个引人入胜、富有洞察力的维度。通过图形表示，我们可以更直观地了解模型在不同实例中的表现。通过这种可视化的探索，我们可以发现一些模式、趋势或潜在的改进领域，而这些在数字指标中可能并不明显。让我们深入研究可视化表示，揭示模型在测试数据上表现的细微差别。

plt.figure(figsize=(15, 10))

for images, ages, genders in test_ds.batch(BATCH_SIZE).take(1):

    for index in range(len(images)):

        # Image Processing

        image = tf.io.read_file(dir_path + images[index])

        image = tf.io.decode_jpeg(image)

        image = tf.cast(image, tf.float32)

        image = image/255.

        image = tf.image.resize(image, [224,224])

        image = tf.reshape(image, [-1,224,224,3])

        # Predictions

        pred_age = tf.round(tf.squeeze(age_net.predict(image, verbose=0)), 2)

        pred_gender = tf.round(tf.squeeze(gender_net.predict(image, verbose=0)))

        # Visualization

        plt.subplot(4, 8, index + 1)

        plt.imshow(image[0])

        plt.title(f"Age: {int(pred_age)}\nGender: {gender_mapping[int(pred_gender)]}")

        plt.axis("off")

plt.tight_layout()

plt.show()

观察模型的预测，我对它的整体表现有很高的信心。不考虑婴儿图像，该模型在准确识别性别和为大多数个体提供精确的年龄估计方面始终表现出色。这一成就值得注意，反映了模型对年龄和性别预测相关任务的熟练程度。

然而，数据集中婴儿图像的存在带来了一定程度的怀疑和不安。对我来说，预测婴儿的年龄和性别的概念有点异想天开，类似于小说中的概念。虽然该模型在这一领域取得了一定程度的成功，但我对这种预测的合理性仍持谨慎态度。

走进Kaggle的未知领域：性别和年龄推断算法解析的更多相关文章

Excel通过身份证获取出生年月，性别，年龄，生肖，星座，省份等信息总结归纳
Excel通过身份证获取出生年月,性别,年龄,生肖,星座,省份等信息总结归纳早期的身份证号码为15位数字,现在使用的身份证号码为18位数字,它们的含义如下:(1)15位:1-6位为地区代码,7-8位 ...
js脚本根据身份证号获取性别、年龄、家庭地址、生日
做项目测试时需要根据身份证号获取其信息,也不想调接口,就自己在本地通过收集资料整合了一个
java 根据身份证号码获取出生日期、性别、年龄
1.情景展示如何根据身份证号,计算出出生日期.性别.年龄? 2.解决方案从网上找的别人的,因为并没有实际用到,所以并未对其优化! /** * 通过身份证号码获取出生日期.性别.年龄 * @pa ...
Java 身份证判断性别获取年龄
import com.alibaba.fastjson.JSON; import org.junit.Test; import java.text.SimpleDateFormat; import j ...
CardUtil算出当前身份证持有者的性别和年龄
import java.text.SimpleDateFormat; import java.util.Date; import java.util.HashMap; import java.util ...
JS控制语句编程练习学生数据，分别是姓名、性别、年龄和年级，接下来呢，我们要利用JavaScript的知识挑出其中所有是大一的女生的的名字哦。
编程练习在一个大学的编程选修课班里,我们得到了一组参加该班级的学生数据,分别是姓名.性别.年龄和年级,接下来呢,我们要利用JavaScript的知识挑出其中所有是大一的女生的的名字哦. 学生信息如下 ...
JS-身份证号获取出生日期、性别、年龄
var cardId=$("#cardId").val();//先获取身份证号(据自己实际写法获取) 1.获取出生日期: function getBirth(cardId){ va ...
js 根据身份证号获取性别，年龄，等
$(function(){ $("#corpOwnerIdno").blur(function(){ //获取输入身份证号码 ...
ADO.NET 【属性扩展】性别、年龄、编号
1.性别属性扩展 private bool _Sex; //成员变量 public bool Sex // 属性 { get { return _Sex; } set { _Sex = value; ...
通过身份证分析出生年月日、性别、年龄的SQL语句
),) ) ),)<>'X' ) ) )<>'X' ),)),)),)) ),)),)),)) ) as int)) where [出生日期]<>'' #字符串格式 ...

随机推荐

[转帖]--build=arm-linux
今天在arm上用configure生成makefile时报错:configure: error: cannot guess build type; you must specify one 问题: 不 ...
[转帖]nacos discovery和config
微服务和nacos版本都在2.x及之后. 1.discovery用于服务注册,将想要注册的服务注册到nacos中,被naocs发现. pom引入的依赖是: yml配置文件中: 2.config用于获取 ...
ubuntu18.04 安装wine以及添加mono和gecko打开简单.net应用的方法
1. 今天突然想试试能不能用ubuntu跑一下公司的.net的智能客户端(SmartClient). 想到的办法就是安装wine 但是过程略坑..这里简单说一下总结之后的过程. 2. 第一步安装wi ...
银河麒麟安装LLDB的方法以及调试 dump 文件 (未完成)
今天同事要进行 lldb进行调试dotnet的bug 本来在x86 上面进行相应的处理但是发现报错. 没办法正好有一台借来的arm服务器就搞了一下. 简单记录一下安装方法 1. 安装 apt的so ...
Grafana 监控 PG数据库的操作过程
Grafana 监控 PG数据库的操作过程容器化运行 postgres-exporter 进行处理 1. 镜像运行 exporter docker run -p 9187:9187 -e DATA_ ...
Hutool中那些常用的工具类和实用方法
背景灵魂拷问1:还在为新项目工具类搬迁而烦恼? 灵魂拷问2:还在为项目中工具类维护而烦恼? 简述 **Hutool**它是一个Java工具集类库,包含了很多静态方法的封装:流处理.时间日期处理.正则 ...
【杂题，树】【Uoj】Uoj618 【JOISC2021】聚会 2
2023.7.3 Problem Link 给定一棵 $n$ 个点的树,对于一个点集 $S$,定义 $f(u,S)$ 为 \(\min_u \sum_{v\in S} \mathrm{di ...
【JS 逆向百例】网洛者反爬练习平台第六题：JS 加密，环境模拟检测
关注微信公众号:K哥爬虫,持续分享爬虫进阶.JS/安卓逆向等技术干货! 声明本文章中所有内容仅供学习交流,抓包内容.敏感网址.数据接口均已做脱敏处理,严禁用于商业用途和非法用途,否则由此产生的一切后 ...
Python 实现 WebSocket 通信
WebSocket 协议主要用于解决Web前端与后台数据交互问题,在WebSocket技术没有被定义之前,前台与后端通信需要使用轮询的方式实现,WebSocket则是通过握手机制让客户端与服务端建立全 ...
模式识别实验：基于主成分分析（PCA）的人脸识别
前言本文使用Python实现了PCA算法,并使用ORL人脸数据集进行了测试并输出特征脸,简单实现了人脸识别的功能. 1. 准备 ORL人脸数据集共包含40个不同人的400张图像,是在1992年4月至 ...

走进Kaggle的未知领域：性别和年龄推断算法解析

走进Kaggle的未知领域：性别和年龄推断算法解析的更多相关文章

随机推荐

热门专题