

Getting Started with Deep Learning: How to Use the MNIST Data Format

Published: 2025/3/15

I downloaded the MNIST dataset directly from the web.

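After extracting the archives, I found that each one contains a single idx-ubyte file and no image files. IDX is a file format for storing vectors and multidimensional matrices.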
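To make the layout concrete: an IDX file starts with a 4-byte magic number (two zero bytes, one byte for the data type, one byte for the number of dimensions), followed by one 4-byte big-endian integer per dimension, and then the raw data. The following is a minimal sketch of reading that header. The name `parse_idx_header` and the synthetic `sample` bytes are assumptions made for illustration, not part of the original program or of any library.

```python
import struct

# Type codes from the IDX format, mapped to the matching struct format character.
IDX_TYPES = {
    0x08: "B",  # unsigned byte
    0x09: "b",  # signed byte
    0x0B: "h",  # short (2 bytes)
    0x0C: "i",  # int (4 bytes)
    0x0D: "f",  # float (4 bytes)
    0x0E: "d",  # double (8 bytes)
}


def parse_idx_header(data):
    """Return (type_code, dims, data_offset) for a bytes object in IDX format.

    parse_idx_header is a hypothetical helper name used only in this sketch.
    """
    # Magic number: two zero bytes, one type-code byte, one dimension-count byte.
    zero1, zero2, type_code, ndim = struct.unpack_from(">BBBB", data, 0)
    if zero1 != 0 or zero2 != 0:
        raise ValueError("not an IDX file: the first two bytes must be 0")
    if type_code not in IDX_TYPES:
        raise ValueError("unknown IDX type code: 0x%02X" % type_code)
    # Each dimension size is a 4-byte big-endian unsigned integer.
    dims = struct.unpack_from(">" + "I" * ndim, data, 4)
    return type_code, dims, 4 + 4 * ndim


# Synthetic header shaped like an MNIST image file: 2 images of 28x28 unsigned bytes.
sample = struct.pack(">BBBBIII", 0, 0, 0x08, 3, 2, 28, 28) + bytes(2 * 28 * 28)
type_code, dims, offset = parse_idx_header(sample)
print(hex(type_code), dims, offset)  # 0x8 (2, 28, 28) 16
```

For the real MNIST image files the same call should report three dimensions (image count, rows, columns), and for the label files a single dimension.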

Program

Reposted from: https://blog.csdn.net/Barry_J/article/details/78749620

# encoding: utf-8
"""
@author: monitor1379
@contact: yy4f5da2@hotmail.com
@site: www.monitor1379.com
@version: 1.0
@license: Apache Licence
@file: mnist_decoder.py
@time: 2016/8/16 20:03

Converts the MNIST handwritten-digit data files to the bmp image format.
The dataset can be downloaded from http://yann.lecun.com/exdb/mnist.
For details of the format conversion, see the official site and the code comments.

========================
Parsing rules for the IDX file format:
========================
THE IDX FILE FORMAT

the IDX file format is a simple format for vectors and multidimensional matrices of various numerical types.
The basic format is

magic number
size in dimension 0
size in dimension 1
size in dimension 2
.....
size in dimension N
data

The magic number is an integer (MSB first). The first 2 bytes are always 0.

The third byte codes the type of the data:
0x08: unsigned byte
0x09: signed byte
0x0B: short (2 bytes)
0x0C: int (4 bytes)
0x0D: float (4 bytes)
0x0E: double (8 bytes)

The 4-th byte codes the number of dimensions of the vector/matrix: 1 for vectors, 2 for matrices....

The sizes in each dimension are 4-byte integers (MSB first, high endian, like in most non-Intel processors).

The data is stored like in a C array, i.e. the index in the last dimension changes the fastest.
"""
import struct

import matplotlib.pyplot as plt
import numpy as np

# Training set image file
train_images_idx3_ubyte_file = '../../data/mnist/bin/train-images.idx3-ubyte'
# Training set label file
train_labels_idx1_ubyte_file = '../../data/mnist/bin/train-labels.idx1-ubyte'

# Test set image file
test_images_idx3_ubyte_file = '../../data/mnist/bin/t10k-images.idx3-ubyte'
# Test set label file
test_labels_idx1_ubyte_file = '../../data/mnist/bin/t10k-labels.idx1-ubyte'


def decode_idx3_ubyte(idx3_ubyte_file):
    """
    Generic function for parsing an idx3 file
    :param idx3_ubyte_file: path to the idx3 file
    :return: the data set
    """
    # Read the binary data
    with open(idx3_ubyte_file, 'rb') as f:
        bin_data = f.read()

    # Parse the file header: magic number, number of images, image height, image width
    offset = 0
    fmt_header = '>iiii'
    magic_number, num_images, num_rows, num_cols = struct.unpack_from(fmt_header, bin_data, offset)
    print('magic number: %d, number of images: %d, image size: %d*%d' % (magic_number, num_images, num_rows, num_cols))

    # Parse the data set
    image_size = num_rows * num_cols
    offset += struct.calcsize(fmt_header)
    fmt_image = '>' + str(image_size) + 'B'
    images = np.empty((num_images, num_rows, num_cols))
    for i in range(num_images):
        if (i + 1) % 10000 == 0:
            print('parsed %d images' % (i + 1))
        images[i] = np.array(struct.unpack_from(fmt_image, bin_data, offset)).reshape((num_rows, num_cols))
        offset += struct.calcsize(fmt_image)
    return images


def decode_idx1_ubyte(idx1_ubyte_file):
    """
    Generic function for parsing an idx1 file
    :param idx1_ubyte_file: path to the idx1 file
    :return: the data set
    """
    # Read the binary data
    with open(idx1_ubyte_file, 'rb') as f:
        bin_data = f.read()

    # Parse the file header: magic number and number of labels
    offset = 0
    fmt_header = '>ii'
    magic_number, num_images = struct.unpack_from(fmt_header, bin_data, offset)
    print('magic number: %d, number of labels: %d' % (magic_number, num_images))

    # Parse the data set
    offset += struct.calcsize(fmt_header)
    fmt_image = '>B'
    labels = np.empty(num_images)
    for i in range(num_images):
        if (i + 1) % 10000 == 0:
            print('parsed %d labels' % (i + 1))
        labels[i] = struct.unpack_from(fmt_image, bin_data, offset)[0]
        offset += struct.calcsize(fmt_image)
    return labels


def load_train_images(idx_ubyte_file=train_images_idx3_ubyte_file):
    """
    TRAINING SET IMAGE FILE (train-images-idx3-ubyte):
    [offset] [type]          [value]          [description]
    0000     32 bit integer  0x00000803(2051) magic number
    0004     32 bit integer  60000            number of images
    0008     32 bit integer  28               number of rows
    0012     32 bit integer  28               number of columns
    0016     unsigned byte   ??               pixel
    0017     unsigned byte   ??               pixel
    ........
    xxxx     unsigned byte   ??               pixel
    Pixels are organized row-wise. Pixel values are 0 to 255. 0 means background (white), 255 means foreground (black).

    :param idx_ubyte_file: path to the idx file
    :return: np.array of shape n*row*col, where n is the number of images
    """
    return decode_idx3_ubyte(idx_ubyte_file)


def load_train_labels(idx_ubyte_file=train_labels_idx1_ubyte_file):
    """
    TRAINING SET LABEL FILE (train-labels-idx1-ubyte):
    [offset] [type]          [value]          [description]
    0000     32 bit integer  0x00000801(2049) magic number (MSB first)
    0004     32 bit integer  60000            number of items
    0008     unsigned byte   ??               label
    0009     unsigned byte   ??               label
    ........
    xxxx     unsigned byte   ??               label
    The labels values are 0 to 9.

    :param idx_ubyte_file: path to the idx file
    :return: np.array of length n, where n is the number of labels
    """
    return decode_idx1_ubyte(idx_ubyte_file)


def load_test_images(idx_ubyte_file=test_images_idx3_ubyte_file):
    """
    TEST SET IMAGE FILE (t10k-images-idx3-ubyte):
    [offset] [type]          [value]          [description]
    0000     32 bit integer  0x00000803(2051) magic number
    0004     32 bit integer  10000            number of images
    0008     32 bit integer  28               number of rows
    0012     32 bit integer  28               number of columns
    0016     unsigned byte   ??               pixel
    0017     unsigned byte   ??               pixel
    ........
    xxxx     unsigned byte   ??               pixel
    Pixels are organized row-wise. Pixel values are 0 to 255. 0 means background (white), 255 means foreground (black).

    :param idx_ubyte_file: path to the idx file
    :return: np.array of shape n*row*col, where n is the number of images
    """
    return decode_idx3_ubyte(idx_ubyte_file)


def load_test_labels(idx_ubyte_file=test_labels_idx1_ubyte_file):
    """
    TEST SET LABEL FILE (t10k-labels-idx1-ubyte):
    [offset] [type]          [value]          [description]
    0000     32 bit integer  0x00000801(2049) magic number (MSB first)
    0004     32 bit integer  10000            number of items
    0008     unsigned byte   ??               label
    0009     unsigned byte   ??               label
    ........
    xxxx     unsigned byte   ??               label
    The labels values are 0 to 9.

    :param idx_ubyte_file: path to the idx file
    :return: np.array of length n, where n is the number of labels
    """
    return decode_idx1_ubyte(idx_ubyte_file)


def run():
    train_images = load_train_images()
    train_labels = load_train_labels()
    # test_images = load_test_images()
    # test_labels = load_test_labels()

    # Show the first ten samples and their labels to check that they were read correctly
    for i in range(10):
        print(train_labels[i])
        plt.imshow(train_images[i], cmap='gray')
        plt.show()
    print('done')


if __name__ == '__main__':
    run()
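The decoder above unpacks one image at a time in a Python loop, which is easy to follow but slow for 60000 images. As a possible alternative, and not the original author's method, the same bytes can be reinterpreted in one step with `np.frombuffer`. The sketch below assumes the data type is unsigned byte (type code 0x08), which is what the MNIST files use. `load_idx_ubyte` is a hypothetical name, and the demo runs on synthetic bytes instead of the real files.

```python
import struct

import numpy as np


def load_idx_ubyte(data):
    """Decode an unsigned-byte IDX payload (a bytes object) into a NumPy array.

    load_idx_ubyte is a hypothetical name; it only handles type code 0x08.
    """
    # Magic number: 2 zero bytes (read as one 16-bit value), type code, dimension count.
    zero, type_code, ndim = struct.unpack_from(">HBB", data, 0)
    if zero != 0 or type_code != 0x08:
        raise ValueError("expected an unsigned-byte IDX file")
    # One 4-byte big-endian size per dimension follows the magic number.
    dims = struct.unpack_from(">" + "I" * ndim, data, 4)
    offset = 4 + 4 * ndim
    count = int(np.prod(dims))
    # frombuffer reinterprets the raw bytes directly, with no per-image loop.
    return np.frombuffer(data, dtype=np.uint8, count=count, offset=offset).reshape(dims)


# Synthetic stand-in for an image file: 2 "images" of 2 rows x 3 columns.
pixels = bytes(range(12))
fake_images = struct.pack(">HBBIII", 0, 0x08, 3, 2, 2, 3) + pixels
images = load_idx_ubyte(fake_images)
print(images.shape, images.dtype)  # (2, 2, 3) uint8
```

With the real files, the bytes would come from something like `open(train_images_idx3_ubyte_file, 'rb').read()`. Two differences from the loop version are worth noting: the result is a `uint8` array where the original returns `float64`, and an array created by `np.frombuffer` from a bytes object is read-only, so call `.copy()` before modifying it.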

Fighting!!
