How to deal with MNIST image data in Tensorflow.js

Programming Diary 2024-08-11 09:50:00

by Kevin Scott

There’s the joke that 80 percent of data science is cleaning the data and 20 percent is complaining about cleaning the data … data cleaning is a much higher proportion of data science than an outsider would expect. Actually training models is typically a relatively small proportion (less than 10 percent) of what a machine learner or data scientist does.
— Anthony Goldbloom, CEO of Kaggle

Manipulating data is a crucial step for any machine learning problem. This article takes the MNIST example for Tensorflow.js (0.11.1) and walks through the data-loading code line by line.

MNIST example

18 import * as tf from '@tensorflow/tfjs';
19
20 const IMAGE_SIZE = 784;
21 const NUM_CLASSES = 10;
22 const NUM_DATASET_ELEMENTS = 65000;
23
24 const NUM_TRAIN_ELEMENTS = 55000;
25 const NUM_TEST_ELEMENTS = NUM_DATASET_ELEMENTS - NUM_TRAIN_ELEMENTS;
26
27 const MNIST_IMAGES_SPRITE_PATH =
28     'https://storage.googleapis.com/learnjs-data/model-builder/mnist_images.png';
29 const MNIST_LABELS_PATH =
30     'https://storage.googleapis.com/learnjs-data/model-builder/mnist_labels_uint8';

First, the code imports Tensorflow (make sure you’re transpiling your code!), and establishes some constants, including:

IMAGE_SIZE – the number of pixels in an image (28 x 28 = 784)
NUM_CLASSES – number of label categories (a digit can be 0-9, so there are 10 classes)
NUM_DATASET_ELEMENTS – number of images total (65,000)
NUM_TRAIN_ELEMENTS – number of training images (55,000)
NUM_TEST_ELEMENTS – number of test images (10,000, aka the remainder)
MNIST_IMAGES_SPRITE_PATH & MNIST_LABELS_PATH – paths to the images and the labels
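
To make the role of these constants concrete, here is a minimal sketch of how a flat pixel array could be split into training and test portions. It uses tiny stand-in values so it runs instantly, and the helper imageAt is a hypothetical name introduced for illustration; the article's walkthrough itself does not cover the split.

```javascript
// Tiny stand-in values; the real example uses 784, 65000 and 55000.
const IMAGE_SIZE = 4;
const NUM_DATASET_ELEMENTS = 5;
const NUM_TRAIN_ELEMENTS = 3;
const NUM_TEST_ELEMENTS = NUM_DATASET_ELEMENTS - NUM_TRAIN_ELEMENTS;

// One flat array holding every pixel of every image, filled with 0, 1, 2, ...
const allImages = Float32Array.from(
    {length: NUM_DATASET_ELEMENTS * IMAGE_SIZE}, (_, i) => i);

// subarray() returns views on the same memory, so nothing is copied.
const trainImages = allImages.subarray(0, IMAGE_SIZE * NUM_TRAIN_ELEMENTS);
const testImages = allImages.subarray(IMAGE_SIZE * NUM_TRAIN_ELEMENTS);

// Hypothetical helper: the pixels of the image at a given index.
function imageAt(images, index) {
  return images.subarray(index * IMAGE_SIZE, (index + 1) * IMAGE_SIZE);
}

console.log(trainImages.length / IMAGE_SIZE, testImages.length / IMAGE_SIZE);
console.log(Array.from(imageAt(testImages, 0)));
```

Because image i always starts at offset i * IMAGE_SIZE, a single constant is enough to address any image in the flat array.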

The images are concatenated into one huge sprite image. Judging by the loading code below, each row of the sprite holds the 784 pixels of a single digit.

`MNISTData`

Next up, starting on line 38, is MnistData, a class that exposes the following functions:

load – responsible for asynchronously loading the image and label data
nextTrainBatch – load the next training batch
nextTestBatch – load the next test batch
nextBatch – a generic function to return the next batch, depending on whether it is in the training set or test set

For the purposes of getting started, this article will only go through the load function.

`load`

44 async load() {
45   // Make a request for the MNIST sprited image.
46   const img = new Image();
47   const canvas = document.createElement('canvas');
48   const ctx = canvas.getContext('2d');

async is a relatively new language feature in Javascript; to support older browsers you will need a transpiler.

The Image object is a native DOM constructor that represents an image in memory. It provides callbacks for when the image is loaded, along with access to the image attributes. canvas is another DOM element that provides easy access to pixel arrays and processing by way of its context.

Since both of these are DOM elements, if you’re working in Node.js (or a Web Worker) you won’t have access to these elements. For an alternative approach, see below.
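
As a rough sketch of how code might pick between the two approaches at runtime, the check below tests for the DOM globals the canvas route relies on. The function name canUseDomImageLoading and the strategy labels are hypothetical, introduced only for this illustration.

```javascript
// Hypothetical helper: true only where Image and canvas creation are available.
function canUseDomImageLoading() {
  return typeof document !== 'undefined' &&
      typeof document.createElement === 'function' &&
      typeof Image !== 'undefined';
}

// Illustrative labels: 'canvas' for the DOM route, 'fetch' for the alternative.
const strategy = canUseDomImageLoading() ? 'canvas' : 'fetch';
console.log(`Loading the MNIST sprite via: ${strategy}`);
```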

`imgRequest`

49 const imgRequest = new Promise((resolve, reject) => {
50   img.crossOrigin = '';
51   img.onload = () => {
52     img.width = img.naturalWidth;
53     img.height = img.naturalHeight;

The code initializes a new promise that will be resolved once the image is loaded successfully. This example does not explicitly handle the error state.
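
If you did want to handle the error state, one possible shape is sketched below. The helper whenLoaded is a hypothetical name, and a plain object stands in for the Image so that the sketch runs outside a browser; it is not how the example itself is written.

```javascript
// Hypothetical helper: wraps any object that exposes onload/onerror callbacks.
function whenLoaded(target) {
  return new Promise((resolve, reject) => {
    target.onload = () => resolve(target);
    target.onerror = () => reject(new Error('Failed to load resource'));
  });
}

// A plain object stands in for `new Image()` here.
const fakeImage = {};
const pending = whenLoaded(fakeImage);
pending.then(() => console.log('loaded'));

// Simulate the browser firing the load event.
fakeImage.onload();
```

In a browser you would pass a real Image and set its src after attaching the callbacks, in the same order as the original code does.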

crossOrigin is an img attribute that allows images to be loaded across domains. Setting it to an empty string (equivalent to 'anonymous') requests the image using CORS (cross-origin resource sharing) without credentials, so the canvas is not tainted and its pixels can still be read. naturalWidth and naturalHeight refer to the original dimensions of the loaded image, and ensure that the image's size is correct when performing calculations.

55     const datasetBytesBuffer =
56       new ArrayBuffer(NUM_DATASET_ELEMENTS * IMAGE_SIZE * 4);
57
58     const chunkSize = 5000;
59     canvas.width = img.width;
60     canvas.height = chunkSize;

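The code initializes a new buffer large enough to hold every pixel of every image. It multiplies the total number of images by the size of each image by 4, the number of bytes in each 32-bit float; the views created in the loop below store a single Float32 value per pixel.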
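
The sizing arithmetic can be checked with a small sketch. It assumes the factor of 4 corresponds to Float32Array.BYTES_PER_ELEMENT, and it uses a reduced image count so that only a small buffer is allocated.

```javascript
const IMAGE_SIZE = 784;
// Reduced from 65000 purely to keep this sketch lightweight.
const NUM_DATASET_ELEMENTS = 100;

// An ArrayBuffer is sized in bytes, and each Float32 value occupies 4 bytes.
const buffer = new ArrayBuffer(
    NUM_DATASET_ELEMENTS * IMAGE_SIZE * Float32Array.BYTES_PER_ELEMENT);
const pixels = new Float32Array(buffer);

console.log(buffer.byteLength);  // 313600 bytes
console.log(pixels.length);      // 78400 floats, one per pixel
```

With the full 65,000 images the same calculation gives 50,960,000 floats, or 203,840,000 bytes.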

I believe that chunkSize is used to prevent the UI from loading too much data into memory at once, though I'm not 100% sure.

62     for (let i = 0; i < NUM_DATASET_ELEMENTS / chunkSize; i++) {
63       const datasetBytesView = new Float32Array(
64         datasetBytesBuffer, i * IMAGE_SIZE * chunkSize * 4,
65         IMAGE_SIZE * chunkSize);
66       ctx.drawImage(
67         img, 0, i * chunkSize, img.width, chunkSize, 0, 0, img.width,
68         chunkSize);
69
70       const imageData = ctx.getImageData(0, 0, canvas.width, canvas.height);

This code loops through the sprite in chunks of chunkSize rows and initializes a new TypedArray view for each iteration. The context then draws that chunk of the sprite onto the canvas. Finally, the drawn chunk is turned into image data using the context's getImageData function, which returns an object representing the underlying pixel data.
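
The offset arithmetic in the Float32Array constructor is easier to follow without the canvas involved. The sketch below uses tiny stand-in sizes and fills each chunk with a marker value instead of real pixel data, so those values are purely illustrative.

```javascript
// Tiny stand-in sizes; the example uses 784, 65000 and 5000.
const IMAGE_SIZE = 4;
const NUM_DATASET_ELEMENTS = 6;
const chunkSize = 2;
const BYTES_PER_FLOAT = Float32Array.BYTES_PER_ELEMENT;

const datasetBytesBuffer =
    new ArrayBuffer(NUM_DATASET_ELEMENTS * IMAGE_SIZE * BYTES_PER_FLOAT);

for (let i = 0; i < NUM_DATASET_ELEMENTS / chunkSize; i++) {
  // The second argument is a byte offset; the third is a length in elements.
  const view = new Float32Array(
      datasetBytesBuffer, i * IMAGE_SIZE * chunkSize * BYTES_PER_FLOAT,
      IMAGE_SIZE * chunkSize);
  // Stand-in for the pixel copy: mark every value with its chunk number.
  view.fill(i + 1);
}

// Eight 1s, then eight 2s, then eight 3s: the chunks tile the buffer exactly.
console.log(Array.from(new Float32Array(datasetBytesBuffer)));
```

Note the mixed units: the offset is measured in bytes, which is why it carries the extra factor of 4, while the length is measured in floats.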

72       for (let j = 0; j < imageData.data.length / 4; j++) {
73         // All channels hold an equal value since the image is grayscale, so
74         // just read the red channel.
75         datasetBytesView[j] = imageData.data[j * 4] / 255;
76       }
77     }

We loop through the pixels and divide each value by 255 (the maximum possible value of a channel) to scale the values into the range 0 to 1. Only the red channel is necessary, since it's a grayscale image.
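
Here is the same conversion as a standalone sketch. A hand-built Uint8ClampedArray stands in for imageData.data, which stores pixels as consecutive red, green, blue and alpha bytes; the function name redChannelToFloats is hypothetical.

```javascript
// Hypothetical helper: one normalized float per RGBA pixel, read from red.
function redChannelToFloats(rgba) {
  const out = new Float32Array(rgba.length / 4);
  for (let j = 0; j < out.length; j++) {
    // Every fourth byte, starting at index 0, is a red value.
    out[j] = rgba[j * 4] / 255;
  }
  return out;
}

// Three grayscale pixels: black, white and a dark gray.
const rgba = new Uint8ClampedArray([
  0, 0, 0, 255,
  255, 255, 255, 255,
  51, 51, 51, 255,
]);

const floats = redChannelToFloats(rgba);
console.log(floats.length);         // 3
console.log(floats[0], floats[1]);  // 0 1
```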

78     this.datasetImages = new Float32Array(datasetBytesBuffer);
79
80     resolve();
81   };
82   img.src = MNIST_IMAGES_SPRITE_PATH;
83 });

This line takes the buffer, recasts it into a new TypedArray that holds our pixel data, and then resolves the Promise. The last line (setting the src) is what actually begins loading the image; once the download completes, the onload callback above runs.

One thing that confused me at first was the behavior of TypedArray in relation to its underlying data buffer. You might notice that datasetBytesView is set within the loop, but is never returned.

Under the hood, datasetBytesView is referencing the buffer datasetBytesBuffer (with which it is initialized). When the code updates the pixel data, it is indirectly editing the values of the buffer itself, which in turn is recast into a new Float32Array on line 78.
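
A short sketch makes this aliasing behavior visible. The variable names are illustrative; the contrast at the end shows that constructing a typed array from another typed array copies the data, whereas constructing one from an ArrayBuffer does not.

```javascript
const buffer = new ArrayBuffer(4 * Float32Array.BYTES_PER_ELEMENT);

// Two views over different halves of the same buffer.
const firstHalf = new Float32Array(buffer, 0, 2);
const secondHalf =
    new Float32Array(buffer, 2 * Float32Array.BYTES_PER_ELEMENT, 2);

// Writing through a view writes into the buffer itself.
firstHalf[0] = 0.5;
secondHalf[1] = 1;

// A view over the whole buffer sees both writes.
const whole = new Float32Array(buffer);
console.log(Array.from(whole));  // [0.5, 0, 0, 1]

// By contrast, building from a typed array makes an independent copy.
const copy = new Float32Array(whole);
whole[0] = 0.25;
console.log(copy[0]);  // 0.5
```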

Fetching image data outside of the DOM

If you’re in the DOM, you should use the DOM. The browser (through canvas) takes care of figuring out the format of images and translating buffer data into pixels. But if you're working outside the DOM (say, in Node.js, or a Web Worker), you'll need an alternative approach.

fetch provides a mechanism, response.arrayBuffer, which gives you access to a file's underlying buffer. We can use this to read the bytes manually, avoiding the DOM entirely. Here's an alternative approach to writing the above code (this code requires fetch, which can be polyfilled in Node with something like isomorphic-fetch):

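const imgRequest = fetch(MNIST_IMAGES_SPRITE_PATH)
  .then(resp => resp.arrayBuffer())
  .then(buffer => {
    return new Promise((resolve, reject) => {
      // PNGReader comes from the PNG parsing library discussed below.
      const reader = new PNGReader(buffer);
      return reader.parse((err, png) => {
        // Surface parse failures instead of silently ignoring them.
        if (err) {
          reject(err);
          return;
        }
        // Scale each pixel value into the range 0 to 1.
        const pixels = Float32Array.from(png.pixels).map(pixel => {
          return pixel / 255;
        });
        this.datasetImages = pixels;
        resolve();
      });
    });
  });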

This returns an array buffer for the particular image. When writing this, I first attempted to parse the incoming buffer myself, which I wouldn’t recommend. (If you are interested in doing that, here’s some information on how to read an array buffer for a png.) Instead, I elected to use pngjs, which handles the png parsing for you. When dealing with other image formats, you'll have to figure out the parsing functions yourself.
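
To give a flavor of what manual parsing involves, here is a minimal sketch that only validates the PNG signature and reads the width and height from the IHDR chunk. It does not decompress any pixel data, which is the hard part a library handles. The function name readPngDimensions is hypothetical, and the header bytes are hand-built with made-up dimensions rather than taken from the real sprite.

```javascript
// The fixed 8-byte signature that starts every PNG file.
const PNG_SIGNATURE = [137, 80, 78, 71, 13, 10, 26, 10];

// Hypothetical helper: reads the image dimensions from a PNG's first chunk.
function readPngDimensions(buffer) {
  const bytes = new Uint8Array(buffer);
  const isPng = bytes.length >= 24 &&
      PNG_SIGNATURE.every((value, i) => bytes[i] === value);
  if (!isPng) {
    throw new Error('Not a PNG file');
  }
  // IHDR follows the signature: a 4-byte length, the 4-byte type 'IHDR',
  // then the width and height as big-endian unsigned 32-bit integers.
  const view = new DataView(buffer);
  return {width: view.getUint32(16), height: view.getUint32(20)};
}

// Hand-built header for illustration only.
const header = new Uint8Array(24);
header.set(PNG_SIGNATURE, 0);
header.set([0, 0, 0, 13], 8);       // IHDR data length
header.set([73, 72, 68, 82], 12);   // ASCII codes for 'IHDR'
const headerView = new DataView(header.buffer);
headerView.setUint32(16, 784);      // made-up width
headerView.setUint32(20, 65000);    // made-up height

console.log(readPngDimensions(header.buffer));
```

DataView reads big-endian by default, which matches the byte order PNG uses for its integers.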

Just scratching the surface

Understanding data manipulation is a crucial component of machine learning in JavaScript. By understanding our use cases and requirements, we can use a few key functions to elegantly format our data correctly for our needs.

The Tensorflow.js team is continuously changing the underlying data API in Tensorflow.js. This can help accommodate more of our needs as the API evolves. This also means that it’s worth staying abreast of developments to the API as Tensorflow.js continues to grow and be improved.

Originally published at thekevinscott.com

Special thanks to Ari Zilnik.

Translated from: https://www.freecodecamp.org/news/how-to-deal-with-mnist-image-data-in-tensorflow-js-169a2d6941dd/
