
Playing with Data in Python 2 - Creating NumPy ndarray Arrays


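Introduction

This article covers how to create NumPy multidimensional arrays. For more articles in the advanced Python series, see the Python Advanced Learning: Playing with Data series.

Outline:

np.empty(): create an empty, uninitialized array
Array storage order
np.zeros(), np.ones(), np.full(), np.eye(), np.arange(), np.linspace(): create arrays filled with a given value (0, 1, or another specified value), with ones on a diagonal, or with values from a linear sequence
np.array(python_obj): create an array from a Python object
Type coercion (upcasting)
np.random.random(), np.random.normal(), np.random.randint(): create arrays of random floats or integers drawn from a uniform or normal distribution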

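Creating an Empty Array

The numpy.empty function creates an array with the given shape and data type (dtype) without initializing its elements.

empty(shape, dtype=float, order='C')

Parameters:
● shape: shape of the array
● dtype: data type, optional
● order: 'C' or 'F', meaning row-major or column-major order of the elements in memory

Example: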

import numpy as np
# np.empty only allocates memory; the contents are arbitrary until assigned
array_default = np.empty(shape=(2, 3))
array_default_int = np.empty(shape=(2, 3), dtype=int)
array_float_16 = np.empty(shape=(2, 3), dtype=np.float16, order='F')
array_float = np.empty(shape=(2, 3), dtype=float, order='F')
print("array_default:{}\n".format(array_default))
print("array_default_int:{}\n".format(array_default_int))
print("array_float_16:{}\n".format(array_float_16))
print("array_float:{}\n".format(array_float))

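Output:

Note: the elements are not initialized, so they hold whatever values happened to be in memory and will differ from run to run. You must assign values yourself before using the array.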

array_default:[[9.34609111e-307 3.56043054e-307 1.11261027e-306]
[2.33646845e-307 3.44898841e-307 3.22646744e-307]]
array_default_int:[[ 924184752 32765 924189296]
[ 32765 537536802 1042292768]]
array_float_16:[[0.000e+00 1.132e-06 3.731e-05]
[1.229e+04 0.000e+00 0.000e+00]]
array_float:[[1.24610723e-306 1.29060871e-306 9.34604358e-307]
[1.37962320e-306 1.11258446e-306 1.44635488e-307]]
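
Because np.empty leaves the buffer untouched, a common pattern is to allocate first and then overwrite every element before reading anything back. The following is a minimal sketch of that pattern; the variable name buf and the fill values are illustrative and do not come from the original article.

```python
import numpy as np

# Allocate without initializing; the contents are arbitrary at this point.
buf = np.empty(shape=(2, 3), dtype=float)

# Overwrite every element before reading anything back.
buf[:] = 0.5           # broadcast one scalar into all positions
buf[1, :] = [1, 2, 3]  # then replace the second row

print(buf)
print(buf.dtype, buf.shape)
```

If the array should start from a known value anyway, np.zeros or np.full expresses that intent more directly than np.empty followed by a fill.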

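Array Storage Order

A multidimensional array is stored in memory as a one-dimensional, contiguous sequence:
● Fortran-style (column-major): data is stored contiguously column by column, which is cache-friendly for column-wise access
● C-style (row-major): data is stored contiguously row by row, which is cache-friendly for row-wise access

Performance impact:
The storage order affects the performance of vectorized computations over the array's elements, because traversing memory contiguously makes better use of the cache.

For example, when summing the elements of each row,
row-major storage is generally faster than column-major storage.

For example, when summing the elements of each column,
column-major storage is generally faster than row-major storage.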
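
To make the two layouts concrete, the sketch below stores the same 2x3 array in both orders and inspects the flags, the strides, and the order in which elements sit in memory. It assumes an 8-byte integer dtype (np.int64) so that the stride values are predictable; the variable names are illustrative.

```python
import numpy as np

a_c = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int64, order='C')
a_f = np.asfortranarray(a_c)  # same values, column-major layout

# flags report which layout is contiguous in memory
print(a_c.flags['C_CONTIGUOUS'], a_c.flags['F_CONTIGUOUS'])
print(a_f.flags['C_CONTIGUOUS'], a_f.flags['F_CONTIGUOUS'])

# strides: bytes to skip to reach the next row and the next column
print(a_c.strides)  # (24, 8): a row step jumps over 3 elements
print(a_f.strides)  # (8, 16): a column step jumps over 2 elements

# order='K' reads the elements in the order they occur in memory
memory_c = a_c.ravel(order='K')
memory_f = a_f.ravel(order='K')
print(memory_c)
print(memory_f)

# Indexing and comparisons are unaffected by the layout
same_values = np.array_equal(a_c, a_f)
print(same_values)
```

Only the physical layout differs between a_c and a_f; which one is faster for a given reduction depends on the array size and the hardware, so it is worth timing on real data.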

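Creating Arrays Filled with Predefined Values

Functions:
● numpy.zeros(shape, dtype=float, order='C'): fill with 0
● numpy.ones(shape, dtype=None, order='C'): fill with 1
● np.full((2, 5), fill_value=3.14): fill with the value of fill_value
● np.eye(N, M=None, k=0, dtype=float, order='C'): ones on a diagonal and zeros elsewhere. N: number of rows; M: number of columns (defaults to N); k: index of the diagonal (0 is the main diagonal, positive values lie above it, negative values below it)
● np.arange(start, stop, step, dtype): create a range of values. start: first value, default 0; stop: end value (not included); step: step size, default 1; dtype: data type of the returned ndarray, inferred from the input arguments if not given
● np.linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None): create a one-dimensional array of evenly spaced values (an arithmetic sequence). start: first value of the sequence; stop: last value of the sequence; num: number of evenly spaced samples to generate, default 50; endpoint: if True, stop is included in the sequence, otherwise it is not, default True; retstep: if True, return a tuple (samples, step) that also contains the spacing, otherwise return only the samples; dtype: data type of the ndarray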

Example:

import numpy as np
array_zeros = np.zeros(shape=(2, 5), dtype=int, order='F')
array_ones = np.ones(shape=(2, 5), dtype=float, order='C')
array_full = np.full(shape=(3, 5), fill_value=3.14, dtype=float, order='F')
array_eye_1 = np.eye(3)
array_eye_2 = np.eye(N=4, M=5, k=1, dtype=int, order='C')
array_arange_1 = np.arange(5)
array_arange_2 = np.arange(1, 10, 2)
array_lines_1 = np.linspace(2.0, 3.0, num=5)
# retstep=True returns a tuple: (samples, step)
array_lines_2 = np.linspace(2.0, 3.0, num=5, retstep=True)
print("array_zeros:{}\n".format(array_zeros))
print("array_ones:{}\n".format(array_ones))
print("array_full:{}\n".format(array_full))
print("array_eye_1:{}\n".format(array_eye_1))
print("array_eye_2:{}\n".format(array_eye_2))
print("array_arange_1:{}\n".format(array_arange_1))
print("array_arange_2:{}\n".format(array_arange_2))
print("array_lines_1:{}\n".format(array_lines_1))
print("array_lines_2:{}\n".format(array_lines_2))

Output:

array_zeros:[[0 0 0 0 0]
[0 0 0 0 0]]
array_ones:[[1. 1. 1. 1. 1.]
[1. 1. 1. 1. 1.]]
array_full:[[3.14 3.14 3.14 3.14 3.14]
[3.14 3.14 3.14 3.14 3.14]
[3.14 3.14 3.14 3.14 3.14]]
array_eye_1:[[1. 0. 0.]
[0. 1. 0.]
[0. 0. 1.]]
array_eye_2:[[0 1 0 0 0]
[0 0 1 0 0]
[0 0 0 1 0]
[0 0 0 0 1]]
array_arange_1:[0 1 2 3 4]
array_arange_2:[1 3 5 7 9]
array_lines_1:[2. 2.25 2.5 2.75 3. ]
array_lines_2:(array([2. , 2.25, 2.5 , 2.75, 3. ]), 0.25)
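
The difference between np.arange and np.linspace is easy to mix up: arange is driven by a step and never includes stop, while linspace is driven by a sample count and includes stop unless endpoint=False. The sketch below compares the two and also shows a negative k for np.eye; the variable names are illustrative.

```python
import numpy as np

# arange: step-driven, stop is excluded
by_step = np.arange(0, 10, 2)

# linspace: count-driven; endpoint=False excludes stop as well
by_count = np.linspace(0, 10, num=5, endpoint=False)

# linspace with the default endpoint=True includes stop
with_stop = np.linspace(0, 10, num=6)

# A negative k places the ones below the main diagonal
lower = np.eye(3, k=-1, dtype=int)

print(by_step)
print(by_count)
print(with_stop)
print(lower)
```

Note that by_step holds integers while by_count holds floats, since linspace produces floating-point values unless a dtype is passed.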

Creating Arrays from Python Objects

Calls:
● np.array(list_obj): create from a list
● np.array(tuple_obj): create from a tuple
● np.array(set_obj): create from a set
● np.array(dict_obj): create from a dict

Example:

import numpy as np
array_from_list_1 = np.array(['NumPy', 'supports', 'vectorized', 'operations'])
array_from_list_2 = np.array([[1, 2, 3], [4, 5, 6]])
array_from_list_3 = np.array([range(i, i + 3) for i in [2, 4, 6]])
array_from_tuple = np.array((1, 2, 3))
array_from_set = np.array({1, 2, 2, 3})
array_from_dict = np.array({'language': 'Python', 'Hobby': 'Reading'})
print("array_from_list_1:{}\n".format(array_from_list_1))
print("array_from_list_2:{}\n".format(array_from_list_2))
print("array_from_list_3:{}\n".format(array_from_list_3))
print("array_from_tuple:{}\n".format(array_from_tuple))
print("array_from_set:{}\n".format(array_from_set))
print("array_from_set attribute: shape:{}; ndim:{}; size:{}\n".format(array_from_set.shape, array_from_set.ndim, array_from_set.size))
print("array_from_dict:{}\n".format(array_from_dict))
print("array_from_dict attribute: shape:{}; ndim:{}; size:{}\n".format(array_from_dict.shape, array_from_dict.ndim, array_from_dict.size))

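Output:
Note: an array created from a set or a dict is a 0-dimensional array with size 1, because NumPy does not treat these objects as sequences and stores the whole object as a single element.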

array_from_list_1:['NumPy' 'supports' 'vectorized' 'operations']
array_from_list_2:[[1 2 3]
[4 5 6]]
array_from_list_3:[[2 3 4]
[4 5 6]
[6 7 8]]
array_from_tuple:[1 2 3]
array_from_set:{1, 2, 3}
array_from_set attribute: shape:(); ndim:0; size:1
array_from_dict:{'language': 'Python', 'Hobby': 'Reading'}
array_from_dict attribute: shape:(); ndim:0; size:1
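
When the goal is an array of the elements inside a set or a dict rather than a 0-dimensional wrapper, one option is to convert the object to a list first. This is a sketch of that workaround, not something the original article prescribes; the variable names are illustrative.

```python
import numpy as np

set_obj = {1, 2, 2, 3}
dict_obj = {'language': 'Python', 'Hobby': 'Reading'}

# Passing the set directly wraps the whole object in a 0-d array
wrapped = np.array(set_obj)
print(wrapped.dtype, wrapped.shape)

# Convert to a sequence first to get one array element per item
from_set = np.array(sorted(set_obj))            # sets are unordered, so sort
from_keys = np.array(list(dict_obj.keys()))
from_values = np.array(list(dict_obj.values()))
from_items = np.array(list(dict_obj.items()))   # one row per (key, value) pair

print(from_set)
print(from_keys)
print(from_values)
print(from_items.shape)
```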

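Type Coercion (Upcasting)

NumPy arrays are homogeneous: all elements must share one type. When the input mixes types, NumPy upcasts to the most general type that can hold all of them, unless a dtype is given explicitly.

For example:
np.array([3.14, 2, 3]) -> upcast to float: array([3.14, 2.0, 3.0])
np.array([2, 3.14, 3]) -> upcast to float even though the first element is an integer: array([2.0, 3.14, 3.0])
np.array(['1', 2, 3]) -> converted to strings: array(['1', '2', '3'], dtype='<U1') (newer NumPy versions may report a wider string dtype such as '<U21')
np.array(['1', 2, 3], dtype='int8') -> converted to the explicitly given type: array([1, 2, 3], dtype=int8)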
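
The resulting dtype can be checked directly instead of read off the printed array. The sketch below inspects the dtype for a few mixed inputs and uses np.result_type to ask which type NumPy would pick for a combination; the variable names are illustrative, and the string case checks only the dtype kind because the exact width can vary between NumPy versions.

```python
import numpy as np

mixed_num = np.array([2, 3.14, 3])           # int + float
mixed_str = np.array(['1', 2, 3])            # str + int
forced = np.array([1, 2, 3], dtype=np.int8)  # explicit dtype, no inference

print(mixed_num.dtype)       # float64
print(mixed_str.dtype.kind)  # 'U' means a Unicode string dtype
print(forced.dtype)          # int8

# result_type reports the common type for a combination of dtypes
common = np.result_type(np.int32, np.float64)
print(common)                # float64
```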

Creating Random Arrays

Functions:
● np.random.random(size=(3, 5)): random floats drawn from a uniform distribution over [0.0, 1.0)
● np.random.normal(loc=2, scale=1, size=(3, 5)): random floats drawn from a normal distribution. loc: mean; scale: standard deviation (spread or "width")
● np.random.randint(low=0, high=10, size=(2, 3)): random integers from the interval [low, high)

Example:

import numpy as np
array_random_random = np.random.random(size=(3, 5))
array_random_normal = np.random.normal(loc=2, scale=1, size=(3, 5))
array_random_int = np.random.randint(low=0, high=10, size=(2, 3))
print("array_random_random:{}\n".format(array_random_random))
print("array_random_normal:{}\n".format(array_random_normal))
print("array_random_int:{}\n".format(array_random_int))

Output (the values differ on every run):

array_random_random:[[0.02164541 0.72035649 0.69718817 0.91684897 0.36921595]
[0.32850542 0.47149381 0.04184023 0.39958435 0.62943443]
[0.62590949 0.42814604 0.11767369 0.71687164 0.51742262]]
array_random_normal:[[1.886419 2.32612002 2.31179881 1.24693448 1.37626366]
[1.54387266 2.0334366 3.68800144 1.45346514 3.36354338]
[3.08620338 1.0974769 3.18664 3.64974797 2.27638157]]
array_random_int:[[9 7 0]
[7 7 7]]
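
The values above change on every run. When reproducible results are needed, for instance in tests, one option is a seeded generator. The sketch below uses np.random.default_rng, NumPy's newer Generator interface, rather than the np.random.* functions used in this article; the seed value 42 and the variable names are arbitrary choices for illustration.

```python
import numpy as np

# A seeded generator yields the same sequence on every run
rng = np.random.default_rng(seed=42)

uniform = rng.random(size=(3, 5))                     # floats in [0.0, 1.0)
normal = rng.normal(loc=2, scale=1, size=(3, 5))      # mean 2, std 1
integers = rng.integers(low=0, high=10, size=(2, 3))  # ints in [0, 10)

# A second generator with the same seed repeats the first draw exactly
rng_again = np.random.default_rng(seed=42)
repeat = rng_again.random(size=(3, 5))

print(np.array_equal(uniform, repeat))
print(integers)
```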