TensorFlow on Android (4): Input Data Preprocessing and Inference - 白红宇



Published: 2019-05-25


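Graph, Op, Tensor

Before we start feeding in data, let's briefly go over a few TensorFlow concepts.

A TensorFlow computation is called a Graph. A Graph is made up of many nodes (Ops). An Op receives its input through Tensors and, once it has finished computing, passes its output through Tensors to the next node.

A Tensor is generally an array (one-dimensional or multi-dimensional). We use the Feed operation to pass a Tensor's data into an Op, and the Fetch operation to extract an Op's output into a Tensor.

Back to our project: to recognize objects, all we need to do is feed image data into the appropriate Op and then fetch the detection results from the appropriate Ops.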

We can look up the rough architecture of the model we are using.

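In other words, we feed image data into the Op named "image_tensor" and then fetch the detection results from several Ops. There are 4 such Ops:

"detection_boxes": outputs the bounding rectangles of the detected objects

"detection_scores": outputs the confidence of each detected object, which measures how reliable the detection is

"detection_classes": the classes of the detected objects

"num_detections": the number of detected objects

Now that we know what needs to be done, let's start writing code!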

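Designing the UI

First we need to design a UI with a Button that opens the photo gallery and an ImageView that displays the picture and the detection results.

Then we need some code that picks an image from the gallery and displays it in the ImageView.

This code is simple and there is plenty of existing code to refer to, so we will not go into detail here.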

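Data preprocessing

Before feeding the image data to the model, we need to convert it into a form the model can accept. We start from a Bitmap as the raw input and process it in the following steps:

The first step is resizing. Input images come in different sizes, but we want every input to have the same dimensions, for example 300x300, so we need some image-scaling code. If you would rather not write it yourself, plenty of ready-made code is available online. The transformation function below is adapted from open-source code: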

    import android.graphics.Matrix;

    public class Utils {
        public static Matrix getImageTransformationMatrix(
                final int srcWidth,
                final int srcHeight,
                final int dstWidth,
                final int dstHeight,
                final int applyRotation,
                final boolean maintainAspectRatio) {
            final Matrix matrix = new Matrix();
            if (applyRotation != 0) {
                // Move the image center to the origin, then rotate around the origin.
                matrix.postTranslate(-srcWidth / 2.0f, -srcHeight / 2.0f);
                matrix.postRotate(applyRotation);
            }
            // A rotation of 90 or 270 degrees swaps width and height.
            final boolean transpose = (Math.abs(applyRotation) + 90) % 180 == 0;
            final int inWidth = transpose ? srcHeight : srcWidth;
            final int inHeight = transpose ? srcWidth : srcHeight;
            // Scale only when the sizes differ.
            if (inWidth != dstWidth || inHeight != dstHeight) {
                final float scaleFactorX = dstWidth / (float) inWidth;
                final float scaleFactorY = dstHeight / (float) inHeight;
                if (maintainAspectRatio) {
                    // Use the larger factor on both axes: the destination is filled
                    // completely, but part of the image may fall off the edge.
                    final float scaleFactor = Math.max(scaleFactorX, scaleFactorY);
                    matrix.postScale(scaleFactor, scaleFactor);
                } else {
                    // Stretch the source to fill the destination exactly.
                    matrix.postScale(scaleFactorX, scaleFactorY);
                }
            }
            if (applyRotation != 0) {
                // Move from the origin-centered frame back to the destination frame.
                matrix.postTranslate(dstWidth / 2.0f, dstHeight / 2.0f);
            }
            return matrix;
        }
    }

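This function returns the Matrix object needed for the size conversion. The Matrix is worth keeping, because we will use it again later when visualizing the detection results. We then perform the conversion with the following code: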

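    // originImage is the Bitmap picked from the gallery.
    Bitmap bitmapInput = Bitmap.createBitmap(300, 300, Bitmap.Config.ARGB_8888);
    // No rotation, and the aspect ratio is not preserved: the image is stretched to 300x300.
    final Matrix originToInput = Utils.getImageTransformationMatrix(
            originImage.getWidth(), originImage.getHeight(), 300, 300,
            0, false);
    // Draw the original image into the 300x300 bitmap through the matrix.
    final Canvas canvas = new Canvas(bitmapInput);
    canvas.drawBitmap(originImage, originToInput, null);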
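To make the scaling concrete, here is a minimal sketch in plain Java (no Android classes) of the scale factors that the function computes when applyRotation is 0. The class name ScaleSketch, the helper mapPoint, and the 1200x600 photo size are hypothetical and only serve the illustration.

```java
// Sketch only: reproduces the scale-factor arithmetic for applyRotation == 0.
final class ScaleSketch {
    /** Returns {scaleX, scaleY} for mapping a src-sized image onto a dst-sized one. */
    static float[] scaleFactors(int srcWidth, int srcHeight,
                                int dstWidth, int dstHeight,
                                boolean maintainAspectRatio) {
        float scaleX = dstWidth / (float) srcWidth;
        float scaleY = dstHeight / (float) srcHeight;
        if (maintainAspectRatio) {
            // Larger factor on both axes: the destination is fully covered,
            // and part of the image may be cropped.
            float s = Math.max(scaleX, scaleY);
            return new float[] {s, s};
        }
        return new float[] {scaleX, scaleY};
    }

    /** Maps a source pixel coordinate into the destination frame. */
    static float[] mapPoint(float x, float y, float[] scale) {
        return new float[] {x * scale[0], y * scale[1]};
    }

    public static void main(String[] args) {
        // Hypothetical 1200x600 photo stretched into the 300x300 model input.
        float[] scale = scaleFactors(1200, 600, 300, 300, false);
        float[] center = mapPoint(600f, 300f, scale);
        System.out.println("scale = " + scale[0] + ", " + scale[1]);
        System.out.println("center = " + center[0] + ", " + center[1]);
    }
}
```

With maintainAspectRatio set to false, the two axes are scaled independently (0.25 and 0.5 here), so the photo's center (600, 300) lands on the input's center (150, 150), and the image is distorted rather than cropped.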

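The second step is converting the two-dimensional bitmap data into a one-dimensional array. The model takes a one-dimensional array of per-pixel RGB values as input. For example, given two pixels written as (R, G, B), namely (1, 2, 3) and (4, 5, 6), the correct input array is [1, 2, 3, 4, 5, 6]. The following code does this: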

    // Read all pixels as packed ARGB ints, row by row.
    int[] pixels = new int[300 * 300];
    bitmapInput.getPixels(pixels, 0, bitmapInput.getWidth(), 0, 0,
            bitmapInput.getWidth(), bitmapInput.getHeight());
    // Unpack each pixel into three bytes in R, G, B order (alpha is dropped).
    byte[] byteInput = new byte[pixels.length * 3];
    for (int i = 0; i < pixels.length; ++i) {
        byteInput[i * 3 + 2] = (byte) (pixels[i] & 0xFF);          // blue
        byteInput[i * 3 + 1] = (byte) ((pixels[i] >> 8) & 0xFF);   // green
        byteInput[i * 3 + 0] = (byte) ((pixels[i] >> 16) & 0xFF);  // red
    }

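We first call getPixels to get a one-dimensional array of all the pixels in the bitmap, then use bit operations to extract each pixel's R, G, and B values and store them at the corresponding positions in byteInput. The byteInput array is now the processed data, ready for inference.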
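The unpacking can be checked without an Android device. The sketch below applies the same bit operations to the two example pixels from the text, (1, 2, 3) and (4, 5, 6). The class name PixelPacking and the method toRgbBytes are hypothetical names chosen for this illustration.

```java
// Sketch only: same bit operations as above, on plain int arrays.
final class PixelPacking {
    /** Converts packed ARGB pixels into a flat R,G,B byte array (alpha is dropped). */
    static byte[] toRgbBytes(int[] argbPixels) {
        byte[] out = new byte[argbPixels.length * 3];
        for (int i = 0; i < argbPixels.length; ++i) {
            int p = argbPixels[i];
            out[i * 3]     = (byte) ((p >> 16) & 0xFF); // red
            out[i * 3 + 1] = (byte) ((p >> 8) & 0xFF);  // green
            out[i * 3 + 2] = (byte) (p & 0xFF);         // blue
        }
        return out;
    }

    public static void main(String[] args) {
        // The two pixels from the text, stored as opaque ARGB values.
        int[] pixels = {0xFF010203, 0xFF040506};
        System.out.println(java.util.Arrays.toString(toRgbBytes(pixels)));
        // prints [1, 2, 3, 4, 5, 6]

        // Java bytes are signed, so a channel value above 127 is stored as a
        // negative number; mask with 0xFF to read it back as 0..255.
        byte red = toRgbBytes(new int[] {0xFFFF0000})[0];
        System.out.println(red + " -> " + (red & 0xFF));
        // prints -1 -> 255
    }
}
```

The negative values are only a matter of how Java prints a byte. The bit pattern handed to the model is unchanged.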

Inference

We feed data into the model by calling the feed method of TensorFlowInferenceInterface:

    inferenceInterface.feed("image_tensor", byteInput, 1, 300, 300, 3);

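This call feeds data to the Op named "image_tensor". The arguments are: the image data (byteInput); the batch size, which is 1 because we pass in a single image; the image height and width, both 300; and the number of channels, which is 3 because we use RGB.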
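The four dimensions also describe how the flat byteInput array is laid out. As a sketch, assuming the usual row-major interpretation of a [batch, height, width, channels] shape, the position of any channel value can be computed as follows. InputLayout and offset are hypothetical names.

```java
// Sketch only: index arithmetic for a flat array shaped [batch, height, width, channels].
final class InputLayout {
    /** Offset of (batch, y, x, channel) in the flat array, assuming row-major order. */
    static int offset(int batch, int y, int x, int channel,
                      int height, int width, int channels) {
        return ((batch * height + y) * width + x) * channels + channel;
    }

    public static void main(String[] args) {
        int height = 300, width = 300, channels = 3;
        // Number of bytes expected for one 300x300 RGB image.
        System.out.println(1 * height * width * channels);
        // Green channel (index 1) of the pixel at row 0, column 1.
        System.out.println(offset(0, 0, 1, 1, height, width, channels));
    }
}
```

The product of the dimensions, 1 * 300 * 300 * 3 = 270000, matches the length of byteInput built in the preprocessing step.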

Next we allocate some arrays (Tensors) to receive the inference results. We only take the object locations, scores, and classes, and we keep at most the top 100 detections:

    final int MAX_RESULTS = 100;  // keep at most the top 100 detections
    float[] boxes = new float[MAX_RESULTS * 4];
    float[] scores = new float[MAX_RESULTS];
    float[] classes = new float[MAX_RESULTS];

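Because each bounding box is represented by a 4-tuple (top, left, bottom, right), the boxes array needs 100 x 4 elements. With the arrays allocated, we can run inference and fetch the detection results from the corresponding Ops: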

    // Run the graph up to the three output nodes; false disables run statistics.
    inferenceInterface.run(
            new String[] {"detection_boxes", "detection_scores", "detection_classes"}, false);
    // Copy each output into the arrays allocated earlier.
    inferenceInterface.fetch("detection_boxes", boxes);
    inferenceInterface.fetch("detection_scores", scores);
    inferenceInterface.fetch("detection_classes", classes);

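We use the run method of TensorFlowInferenceInterface to start inference from the input node registered earlier with feed (image_tensor) to the output nodes named in the argument (detection_boxes, detection_scores, detection_classes). For us, this means going from the input image data to the locations, classes, and scores of the detected objects. We then use the fetch method to extract the corresponding output data.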
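The three flat arrays are easier to work with once they are grouped per detection. The sketch below shows one way to do that in plain Java. It assumes that box coordinates are normalized to the range 0..1 (so they are multiplied by the input size to get pixels) and that low-score entries should be dropped. The names DetectionDecoder, Detection, decode, and the 0.5 threshold are hypothetical choices, not part of the TensorFlow API.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: groups the flat output arrays into per-object results.
final class DetectionDecoder {
    /** One decoded result; coordinates are in input-image pixels. */
    static final class Detection {
        final float top, left, bottom, right, score;
        final int classId;

        Detection(float top, float left, float bottom, float right,
                  float score, int classId) {
            this.top = top;
            this.left = left;
            this.bottom = bottom;
            this.right = right;
            this.score = score;
            this.classId = classId;
        }
    }

    /** Keeps detections with score >= minScore; boxes are assumed normalized to 0..1. */
    static List<Detection> decode(float[] boxes, float[] scores, float[] classes,
                                  int inputSize, float minScore) {
        List<Detection> results = new ArrayList<>();
        for (int i = 0; i < scores.length; ++i) {
            if (scores[i] < minScore) {
                continue;
            }
            // Box i occupies four consecutive floats: top, left, bottom, right.
            results.add(new Detection(
                    boxes[4 * i] * inputSize,
                    boxes[4 * i + 1] * inputSize,
                    boxes[4 * i + 2] * inputSize,
                    boxes[4 * i + 3] * inputSize,
                    scores[i],
                    (int) classes[i]));
        }
        return results;
    }

    public static void main(String[] args) {
        // Made-up values standing in for real model output.
        float[] boxes = {0.25f, 0.5f, 0.75f, 1.0f, 0f, 0f, 1f, 1f};
        float[] scores = {0.9f, 0.3f};
        float[] classes = {1f, 18f};
        for (Detection d : decode(boxes, scores, classes, 300, 0.5f)) {
            System.out.println("class " + d.classId + " score " + d.score
                    + " box (" + d.top + ", " + d.left + ", " + d.bottom + ", " + d.right + ")");
        }
    }
}
```

In the made-up data only the first entry passes the 0.5 threshold, and its box scales to (75, 150, 225, 300) in the 300x300 input frame.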

We now have the detection results. Next, let's get ready to visualize them!

Reposted from: http://dsiti.baihongyu.com/
