点云处理：实现PointNet点云分类

作者：Zhihao Cao
日期：2022.5
摘要：本示例在于演示如何基于 PaddlePaddle 2.3.0 实现PointNet在ShapeNet数据集上进行点云分类处理。

一、环境设置

本教程基于PaddlePaddle 2.3.0 编写，如果你的环境不是本版本，请先参考官网安装。

 import os import numpy as np import random import h5py import paddle import paddle.nn as nn import paddle.nn.functional as F print(paddle.__version__)  

2.3.0

二、数据集

2.1 数据介绍

ShapeNet数据集是一个注释丰富且规模较大的 3D 形状数据集，由斯坦福大学、普林斯顿大学和芝加哥丰田技术学院于 2015 年联合发布。
ShapeNet数据集官方链接：https://vision.princeton.edu/projects/2014/3DShapeNets/
AIStudio链接：sharpnet数据集(经过整理)
ShapeNet数据集的储存格式是h5文件，该文件中key值分别为：

1、data：这一份数据中所有点的xyz坐标，
2、label：这一份数据所属类别，如airplane等，
3、pid：这一份数据中所有点所属的类型，如这一份数据属airplane类，则它包含的所有点的类型有机翼、机身等类型。

2.2 解压数据集

!unzip data/data70460/shapenet_part_seg_hdf5_data.zip !mv hdf5_data dataset

2.3 数据列表

ShapeNet数据集所有的数据文件。

 train_list = [ "ply_data_train0.h5", "ply_data_train1.h5", "ply_data_train2.h5", "ply_data_train3.h5", "ply_data_train4.h5", "ply_data_train5.h5", ] test_list = ["ply_data_test0.h5", "ply_data_test1.h5"] val_list = ["ply_data_val0.h5"]  

2.4 搭建数据生成器

说明：将ShapeNet数据集全部读入。

 def make_data(mode="train", path="./dataset/", num_point=2048): datas = [] labels = [] if mode == "train": for file_list in train_list: f = h5py.File(os.path.join(path, file_list), "r") datas.extend(f["data"][:, :num_point, :]) labels.extend(f["label"]) f.close() elif mode == "test": for file_list in test_list: f = h5py.File(os.path.join(path, file_list), "r") datas.extend(f["data"][:, :num_point, :]) labels.extend(f["label"]) f.close() else: for file_list in val_list: f = h5py.File(os.path.join(path, file_list), "r") datas.extend(f["data"][:, :num_point, :]) labels.extend(f["label"]) f.close() return datas, labels  

说明：通过继承paddle.io.Dataset来完成数据集的构造。

 class PointDataset(paddle.io.Dataset): def __init__(self, datas, labels): super().__init__() self.datas = datas self.labels = labels def __getitem__(self, index): data = paddle.to_tensor(self.datas[index].T.astype("float32")) label = paddle.to_tensor(self.labels[index].astype("int64")) return data, label def __len__(self): return len(self.datas)  

说明：使用飞桨框架提供的API：paddle.io.DataLoader完成数据的加载，使得按照Batchsize生成Mini-batch的数据。

 # 数据导入 datas, labels = make_data(mode="train", num_point=2048) train_dataset = PointDataset(datas, labels) datas, labels = make_data(mode="val", num_point=2048) val_dataset = PointDataset(datas, labels) datas, labels = make_data(mode="test", num_point=2048) test_dataset = PointDataset(datas, labels) # 实例化数据读取器 train_loader = paddle.io.DataLoader( train_dataset, batch_size=128, shuffle=True, drop_last=False ) val_loader = paddle.io.DataLoader( val_dataset, batch_size=32, shuffle=False, drop_last=False ) test_loader = paddle.io.DataLoader( test_dataset, batch_size=128, shuffle=False, drop_last=False )  

三、定义网络

PointNet是斯坦福大学研究人员提出的一个点云处理网络，在这篇论文中，它提出了空间变换网络（T-Net）解决点云的旋转问题（注：因为考虑到某一物体的点云旋转后还是该物体，所以需要有一个网络结构去学习并解决这个旋转问题），并且提出了采取MaxPooling的方法极大程度上地提取点云全局特征。

3.1 定义网络结构

 class PointNet(nn.Layer): def __init__(self, name_scope="PointNet_", num_classes=16, num_point=2048): super().__init__() self.input_transform_net = nn.Sequential( nn.Conv1D(3, 64, 1), nn.BatchNorm(64), nn.ReLU(), nn.Conv1D(64, 128, 1), nn.BatchNorm(128), nn.ReLU(), nn.Conv1D(128, 1024, 1), nn.BatchNorm(1024), nn.ReLU(), nn.MaxPool1D(num_point), ) self.input_fc = nn.Sequential( nn.Linear(1024, 512), nn.ReLU(), nn.Linear(512, 256), nn.ReLU(), nn.Linear( 256, 9, weight_attr=paddle.ParamAttr( initializer=paddle.nn.initializer.Assign( paddle.zeros((256, 9)) ) ), bias_attr=paddle.ParamAttr( initializer=paddle.nn.initializer.Assign( paddle.reshape(paddle.eye(3), [-1]) ) ), ), ) self.mlp_1 = nn.Sequential( nn.Conv1D(3, 64, 1), nn.BatchNorm(64), nn.ReLU(), nn.Conv1D(64, 64, 1), nn.BatchNorm(64), nn.ReLU(), ) self.feature_transform_net = nn.Sequential( nn.Conv1D(64, 64, 1), nn.BatchNorm(64), nn.ReLU(), nn.Conv1D(64, 128, 1), nn.BatchNorm(128), nn.ReLU(), nn.Conv1D(128, 1024, 1), nn.BatchNorm(1024), nn.ReLU(), nn.MaxPool1D(num_point), ) self.feature_fc = nn.Sequential( nn.Linear(1024, 512), nn.ReLU(), nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 64 * 64), ) self.mlp_2 = nn.Sequential( nn.Conv1D(64, 64, 1), nn.BatchNorm(64), nn.ReLU(), nn.Conv1D(64, 128, 1), nn.BatchNorm(128), nn.ReLU(), nn.Conv1D(128, 1024, 1), nn.BatchNorm(1024), nn.ReLU(), ) self.fc = nn.Sequential( nn.Linear(1024, 512), nn.ReLU(), nn.Linear(512, 256), nn.ReLU(), nn.Dropout(p=0.7), nn.Linear(256, num_classes), nn.LogSoftmax(axis=-1), ) def forward(self, inputs): batchsize = inputs.shape[0] t_net = self.input_transform_net(inputs) t_net = paddle.squeeze(t_net, axis=-1) t_net = self.input_fc(t_net) t_net = paddle.reshape(t_net, [batchsize, 3, 3]) x = paddle.transpose(inputs, (0, 2, 1)) x = paddle.matmul(x, t_net) x = paddle.transpose(x, (0, 2, 1)) x = self.mlp_1(x) t_net = self.feature_transform_net(x) t_net = paddle.squeeze(t_net, axis=-1) t_net = self.feature_fc(t_net) t_net = paddle.reshape(t_net, [batchsize, 64, 64]) x = paddle.squeeze(x, axis=-1) x = paddle.transpose(x, (0, 2, 1)) x = paddle.matmul(x, t_net) x = paddle.transpose(x, (0, 2, 1)) x = self.mlp_2(x) x = paddle.max(x, axis=-1) x = paddle.squeeze(x, axis=-1) x = self.fc(x) return x  

3.2 网络结构可视化

说明：使用飞桨API：paddle.summary完成模型结构可视化

 pointnet = PointNet() paddle.summary(pointnet, (64, 3, 2048))  

 W0509 16:16:31.949033 135 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 10.1, Runtime API Version: 10.1 W0509 16:16:31.957976 135 device_context.cc:465] device: 0, cuDNN Version: 7.6. --------------------------------------------------------------------------- Layer (type) Input Shape Output Shape Param #  =========================================================================== Conv1D-1 [[64, 3, 2048]] [64, 64, 2048] 256 BatchNorm-1 [[64, 64, 2048]] [64, 64, 2048] 256 ReLU-1 [[64, 64, 2048]] [64, 64, 2048] 0 Conv1D-2 [[64, 64, 2048]] [64, 128, 2048] 8,320 BatchNorm-2 [[64, 128, 2048]] [64, 128, 2048] 512 ReLU-2 [[64, 128, 2048]] [64, 128, 2048] 0 Conv1D-3 [[64, 128, 2048]] [64, 1024, 2048] 132,096 BatchNorm-3 [[64, 1024, 2048]] [64, 1024, 2048] 4,096 ReLU-3 [[64, 1024, 2048]] [64, 1024, 2048] 0 MaxPool1D-1 [[64, 1024, 2048]] [64, 1024, 1] 0 Linear-1 [[64, 1024]] [64, 512] 524,800 ReLU-4 [[64, 512]] [64, 512] 0 Linear-2 [[64, 512]] [64, 256] 131,328 ReLU-5 [[64, 256]] [64, 256] 0 Linear-3 [[64, 256]] [64, 9] 2,313 Conv1D-4 [[64, 3, 2048]] [64, 64, 2048] 256 BatchNorm-4 [[64, 64, 2048]] [64, 64, 2048] 256 ReLU-6 [[64, 64, 2048]] [64, 64, 2048] 0 Conv1D-5 [[64, 64, 2048]] [64, 64, 2048] 4,160 BatchNorm-5 [[64, 64, 2048]] [64, 64, 2048] 256 ReLU-7 [[64, 64, 2048]] [64, 64, 2048] 0 Conv1D-6 [[64, 64, 2048]] [64, 64, 2048] 4,160 BatchNorm-6 [[64, 64, 2048]] [64, 64, 2048] 256 ReLU-8 [[64, 64, 2048]] [64, 64, 2048] 0 Conv1D-7 [[64, 64, 2048]] [64, 128, 2048] 8,320 BatchNorm-7 [[64, 128, 2048]] [64, 128, 2048] 512 ReLU-9 [[64, 128, 2048]] [64, 128, 2048] 0 Conv1D-8 [[64, 128, 2048]] [64, 1024, 2048] 132,096 BatchNorm-8 [[64, 1024, 2048]] [64, 1024, 2048] 4,096 ReLU-10 [[64, 1024, 2048]] [64, 1024, 2048] 0 MaxPool1D-2 [[64, 1024, 2048]] [64, 1024, 1] 0 Linear-4 [[64, 1024]] [64, 512] 524,800 ReLU-11 [[64, 512]] [64, 512] 0 Linear-5 [[64, 512]] [64, 256] 131,328 ReLU-12 [[64, 256]] [64, 256] 0 Linear-6 [[64, 256]] [64, 4096] 1,052,672 Conv1D-9 [[64, 64, 2048]] [64, 64, 2048] 4,160 BatchNorm-9 [[64, 64, 2048]] [64, 64, 2048] 256 ReLU-13 [[64, 64, 2048]] [64, 64, 2048] 0 Conv1D-10 [[64, 64, 2048]] [64, 128, 2048] 8,320 BatchNorm-10 [[64, 128, 2048]] [64, 128, 2048] 512 ReLU-14 [[64, 128, 2048]] [64, 128, 2048] 0 Conv1D-11 [[64, 128, 2048]] [64, 1024, 2048] 132,096 BatchNorm-11 [[64, 1024, 2048]] [64, 1024, 2048] 4,096 ReLU-15 [[64, 1024, 2048]] [64, 1024, 2048] 0 Linear-7 [[64, 1024]] [64, 512] 524,800 ReLU-16 [[64, 512]] [64, 512] 0 Linear-8 [[64, 512]] [64, 256] 131,328 ReLU-17 [[64, 256]] [64, 256] 0 Dropout-1 [[64, 256]] [64, 256] 0 Linear-9 [[64, 256]] [64, 16] 4,112 LogSoftmax-1 [[64, 16]] [64, 16] 0 =========================================================================== Total params: 3,476,825 Trainable params: 3,461,721 Non-trainable params: 15,104 --------------------------------------------------------------------------- Input size (MB): 1.50 Forward/backward pass size (MB): 11333.40 Params size (MB): 13.26 Estimated Total Size (MB): 11348.16 --------------------------------------------------------------------------- {'total_params': 3476825, 'trainable_params': 3461721}  

四、训练

说明：模型训练的时候，将会使用paddle.optimizer.Adam优化器来进行优化。使用F.nll_loss来计算损失值。

 def train(): model = PointNet(num_classes=16, num_point=2048) model.train() optim = paddle.optimizer.Adam( parameters=model.parameters(), weight_decay=0.001 ) epoch_num = 10 for epoch in range(epoch_num): # train print( "===================================train===========================================" ) for batch_id, data in enumerate(train_loader()): inputs, labels = data predicts = model(inputs) loss = F.nll_loss(predicts, labels) acc = paddle.metric.accuracy(predicts, labels) if batch_id % 20 == 0: print( "train: epoch: {}, batch_id: {}, loss is: {}, accuracy is: {}".format( epoch, batch_id, loss.numpy(), acc.numpy() ) ) loss.backward() optim.step() optim.clear_grad() if epoch % 2 == 0: paddle.save(model.state_dict(), "./model/PointNet.pdparams") paddle.save(optim.state_dict(), "./model/PointNet.pdopt") # validation print( "===================================val===========================================" ) model.eval() accuracies = [] losses = [] for batch_id, data in enumerate(val_loader()): inputs, labels = data predicts = model(inputs) loss = F.nll_loss(predicts, labels) acc = paddle.metric.accuracy(predicts, labels) losses.append(loss.numpy()) accuracies.append(acc.numpy()) avg_acc, avg_loss = np.mean(accuracies), np.mean(losses) print( "validation: loss is: {}, accuracy is: {}".format(avg_loss, avg_acc) ) model.train() if __name__ == "__main__": train()  

 ===================================train=========================================== train: epoch: 0, batch_id: 0, loss is: [8.135595], accuracy is: [0.046875] train: epoch: 0, batch_id: 20, loss is: [0.96110815], accuracy is: [0.7265625] train: epoch: 0, batch_id: 40, loss is: [0.77762437], accuracy is: [0.8046875] train: epoch: 0, batch_id: 60, loss is: [0.575164], accuracy is: [0.84375] train: epoch: 0, batch_id: 80, loss is: [0.60243726], accuracy is: [0.8359375] ===================================val=========================================== validation: loss is: 0.5027859807014465, accuracy is: 0.848895251750946 ===================================train=========================================== train: epoch: 1, batch_id: 0, loss is: [0.5886416], accuracy is: [0.8359375] train: epoch: 1, batch_id: 20, loss is: [0.59509534], accuracy is: [0.8515625] train: epoch: 1, batch_id: 40, loss is: [0.43501458], accuracy is: [0.875] train: epoch: 1, batch_id: 60, loss is: [0.5497817], accuracy is: [0.8515625] train: epoch: 1, batch_id: 80, loss is: [0.2889481], accuracy is: [0.8984375] ===================================val=========================================== validation: loss is: 0.2470872551202774, accuracy is: 0.9263771176338196 ===================================train=========================================== train: epoch: 2, batch_id: 0, loss is: [0.43095332], accuracy is: [0.8984375] train: epoch: 2, batch_id: 20, loss is: [0.42620662], accuracy is: [0.8984375] train: epoch: 2, batch_id: 40, loss is: [0.31073096], accuracy is: [0.8984375] train: epoch: 2, batch_id: 60, loss is: [0.21410619], accuracy is: [0.9375] train: epoch: 2, batch_id: 80, loss is: [0.23696409], accuracy is: [0.9296875] ===================================val=========================================== validation: loss is: 0.24663102626800537, accuracy is: 0.9278147220611572 ===================================train=========================================== train: epoch: 3, batch_id: 0, loss is: [0.1000444], accuracy is: [0.96875] train: epoch: 3, batch_id: 20, loss is: [0.2845613], accuracy is: [0.9296875] train: epoch: 3, batch_id: 40, loss is: [0.46592], accuracy is: [0.859375] train: epoch: 3, batch_id: 60, loss is: [0.3819336], accuracy is: [0.9140625] train: epoch: 3, batch_id: 80, loss is: [0.08518291], accuracy is: [0.9765625] ===================================val=========================================== validation: loss is: 0.17066480219364166, accuracy is: 0.9491525292396545 ===================================train=========================================== train: epoch: 4, batch_id: 0, loss is: [0.11713062], accuracy is: [0.9609375] train: epoch: 4, batch_id: 20, loss is: [0.1716559], accuracy is: [0.953125] train: epoch: 4, batch_id: 40, loss is: [0.15082854], accuracy is: [0.96875] train: epoch: 4, batch_id: 60, loss is: [0.2787561], accuracy is: [0.96875] train: epoch: 4, batch_id: 80, loss is: [0.11986132], accuracy is: [0.9609375] ===================================val=========================================== validation: loss is: 0.1389710158109665, accuracy is: 0.9608050584793091 ===================================train=========================================== train: epoch: 5, batch_id: 0, loss is: [0.17427993], accuracy is: [0.9453125] train: epoch: 5, batch_id: 20, loss is: [0.25355965], accuracy is: [0.9609375] train: epoch: 5, batch_id: 40, loss is: [0.18881711], accuracy is: [0.9609375] train: epoch: 5, batch_id: 60, loss is: [0.14433464], accuracy is: [0.953125] train: epoch: 5, batch_id: 80, loss is: [0.13028377], accuracy is: [0.96875] ===================================val=========================================== validation: loss is: 0.09753856807947159, accuracy is: 0.9671609997749329 ===================================train=========================================== train: epoch: 6, batch_id: 0, loss is: [0.12662013], accuracy is: [0.9765625] train: epoch: 6, batch_id: 20, loss is: [0.1309431], accuracy is: [0.9609375] train: epoch: 6, batch_id: 40, loss is: [0.29988244], accuracy is: [0.9453125] train: epoch: 6, batch_id: 60, loss is: [0.114668], accuracy is: [0.9609375] train: epoch: 6, batch_id: 80, loss is: [0.48784435], accuracy is: [0.9296875] ===================================val=========================================== validation: loss is: 0.16411711275577545, accuracy is: 0.9576271176338196 ===================================train=========================================== train: epoch: 7, batch_id: 0, loss is: [0.12558301], accuracy is: [0.9609375] train: epoch: 7, batch_id: 20, loss is: [0.1776012], accuracy is: [0.953125] train: epoch: 7, batch_id: 40, loss is: [0.12831621], accuracy is: [0.9609375] train: epoch: 7, batch_id: 60, loss is: [0.15245995], accuracy is: [0.953125] train: epoch: 7, batch_id: 80, loss is: [0.08825297], accuracy is: [0.9609375] ===================================val=========================================== validation: loss is: 0.06742173433303833, accuracy is: 0.9809321761131287 ===================================train=========================================== train: epoch: 8, batch_id: 0, loss is: [0.07868354], accuracy is: [0.96875] train: epoch: 8, batch_id: 20, loss is: [0.1875119], accuracy is: [0.96875] train: epoch: 8, batch_id: 40, loss is: [0.04444], accuracy is: [0.9921875] train: epoch: 8, batch_id: 60, loss is: [0.08977574], accuracy is: [0.9765625] train: epoch: 8, batch_id: 80, loss is: [0.13062863], accuracy is: [0.9765625] ===================================val=========================================== validation: loss is: 0.13399624824523926, accuracy is: 0.9661017060279846 ===================================train=========================================== train: epoch: 9, batch_id: 0, loss is: [0.14676869], accuracy is: [0.953125] train: epoch: 9, batch_id: 20, loss is: [0.16409941], accuracy is: [0.9609375] train: epoch: 9, batch_id: 40, loss is: [0.08795467], accuracy is: [0.96875] train: epoch: 9, batch_id: 60, loss is: [0.05970801], accuracy is: [0.984375] train: epoch: 9, batch_id: 80, loss is: [0.2631768], accuracy is: [0.9296875] ===================================val=========================================== validation: loss is: 0.11335306614637375, accuracy is: 0.9682203531265259  

五、评估与测试

说明：通过model.load_dict的方式加载训练好的模型对测试集上的数据进行评估与测试。

 def evaluation(): model = PointNet() model_state_dict = paddle.load("./model/PointNet.pdparams") model.load_dict(model_state_dict) model.eval() accuracies = [] losses = [] for batch_id, data in enumerate(test_loader()): inputs, labels = data predicts = model(inputs) loss = F.nll_loss(predicts, labels) acc = paddle.metric.accuracy(predicts, labels) losses.append(loss.numpy()) accuracies.append(acc.numpy()) avg_acc, avg_loss = np.mean(accuracies), np.mean(losses) print("validation: loss is: {}, accuracy is: {}".format(avg_loss, avg_acc)) if __name__ == "__main__": evaluation()