Custom NumPy dtypes
Published 2025-05-08 • Updated 2025-05-09
Use custom (structured) dtypes to work with tabular binary data without converting it to lists or Python objects:
import numpy as np

gene_dtype = np.dtype([('gene_id', 'U10'), ('expr', 'f4'), ('conf', 'f4')])
data = np.array([('a', 5.34, 0.97), ('b', 1.2, 0.96), ('c', 5.1, 0.95), ('d', 10.2, 0.91)], dtype=gene_dtype)

You still get zero-copy slicing and can use np.memmap for large files. Generally, use a plain ndarray for homogeneous data; use custom dtypes for heterogeneous data that needs to map cleanly to binary files or C structs; use pandas for higher-level analytics.
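The zero-copy and np.memmap points can be checked with a short sketch. It reuses the gene_dtype records from above. The file name genes.bin, the temporary directory, and the 0.93 confidence threshold are assumptions made for illustration, not part of the original example.

```python
import os
import tempfile

import numpy as np

gene_dtype = np.dtype([("gene_id", "U10"), ("expr", "f4"), ("conf", "f4")])
data = np.array(
    [("a", 5.34, 0.97), ("b", 1.2, 0.96), ("c", 5.1, 0.95), ("d", 10.2, 0.91)],
    dtype=gene_dtype,
)

# Field access and basic slicing return views onto the same buffer.
expr = data["expr"]
first_two = data[:2]
shares_memory = np.shares_memory(expr, data) and np.shares_memory(first_two, data)

# Each record is 10 UCS-4 characters (40 bytes) plus two float32 values.
record_size = gene_dtype.itemsize  # 48

# Dump the raw records to a file (hypothetical path), then map them back
# without loading the whole file into memory.
path = os.path.join(tempfile.mkdtemp(), "genes.bin")
data.tofile(path)
mapped = np.memmap(path, dtype=gene_dtype, mode="r")

# Boolean-mask selection copies only the matching records (assumed threshold).
confident = mapped[mapped["conf"] > 0.93]
confident_ids = confident["gene_id"].tolist()
print(shares_memory, record_size, mapped.shape, confident_ids)
```

Note that only basic slicing and field access are zero-copy; boolean masks and fancy indexing, as in the last step, always produce a copy.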