Loading

Utility function to load mid-infrared (MIRS) spectra, measured exchangeable potassium, and auxiliary data such as depth and Soil Taxonomy order.


load_kssl

 load_kssl (src_dir:str, fnames:List[str]=['spectra-features.npy',
            'spectra-wavenumbers.npy', 'depth-order.npy', 'target.npy',
            'tax-order-lu.pkl', 'spectra-id.npy'],
            loaders_lut:dict={'.npy': <function load at 0x7f2f96962550>,
            '.pkl': <built-in function load>})

Loads the USDA KSSL dataset, restricted here to exchangeable potassium (analyte_id=725).

Returns: A tuple (X, X_names, depth_order, y, tax_lookup, X_id) with:

- X: spectra (numpy.ndarray)
- X_names: spectra wavenumbers (numpy.ndarray)
- depth_order: depth and order of samples (numpy.ndarray)
- y: exchangeable potassium content (numpy.ndarray)
- tax_lookup: lookup table order_name -> order_id (dictionary)
- X_id: unique id of spectra

| | Type | Default | Details |
|---|---|---|---|
| src_dir | str | | folder path containing data |
| fnames | typing.List[str] | ['spectra-features.npy', 'spectra-wavenumbers.npy', 'depth-order.npy', 'target.npy', 'tax-order-lu.pkl', 'spectra-id.npy'] | filenames to open (in order) |
| loaders_lut | dict | {'.npy': <function load at 0x7f2f96962550>, '.pkl': <built-in function load>} | loaders lookup table |
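The `loaders_lut` argument pairs each file extension with a function able to read that kind of file. How `load_kssl` uses it internally is not shown on this page, so the following is only a minimal sketch of extension-based dispatch. It assumes that the two defaults correspond to `np.load` and `pickle.load`; the names `LOADERS` and `load_files`, and the tiny synthetic files, are hypothetical and exist only for illustration.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np

# Hypothetical lookup table: file extension -> function reading an open binary file.
# Assumption: the defaults of `loaders_lut` are np.load and pickle.load.
LOADERS = {'.npy': np.load, '.pkl': pickle.load}


def load_files(src_dir, fnames, loaders=LOADERS):
    """Load each file with the loader registered for its extension (illustrative only)."""
    loaded = []
    for fname in fnames:
        path = Path(src_dir) / fname
        loader = loaders[path.suffix]  # raises KeyError for an unregistered extension
        with open(path, 'rb') as f:
            loaded.append(loader(f))
    return tuple(loaded)  # same order as `fnames`


# Tiny synthetic files written to a temporary folder (not KSSL data)
with tempfile.TemporaryDirectory() as tmp:
    np.save(Path(tmp) / 'features.npy', np.arange(6).reshape(2, 3))
    with open(Path(tmp) / 'lookup.pkl', 'wb') as f:
        pickle.dump({'alfisols': 0}, f)
    X_demo, lut_demo = load_files(tmp, ['features.npy', 'lookup.pkl'])

print(X_demo.shape, lut_demo)
```

Under this design, supporting another file format would only require registering one more extension in the lookup table; the loading loop itself stays unchanged.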

A single call loads all required data: the mid-infrared spectra (the features), the associated exchangeable potassium wet chemistry measurements (the target), and auxiliary data such as wavenumbers, soil depth, Soil Taxonomy order and spectra ids.

For instance, to open a subsample of the dataset (see setup to download the full dataset):

src_dir = 'test'
fnames = ['spectra-features-smp.npy', 'spectra-wavenumbers-smp.npy', 
          'depth-order-smp.npy', 'target-smp.npy', 
          'tax-order-lu-smp.pkl', 'spectra-id-smp.npy']

X, X_names, depth_order, y, tax_lookup, X_id = load_kssl(src_dir, fnames=fnames)
print(f'X shape: {X.shape}')
print(f'y shape: {y.shape}')
print(f'Wavenumbers:\n {X_names}')
print(f'depth_order (first 3 rows):\n {depth_order[:3, :]}')
print(f'Taxonomic order lookup:\n {tax_lookup}')
X shape: (100, 1764)
y shape: (100,)
Wavenumbers:
 [3999 3997 3995 ...  603  601  599]
depth_order (first 3 rows):
 [[ 0.  1.]
 [19.  4.]
 [43. 12.]]
Taxonomic order lookup:
 {'alfisols': 0, 'mollisols': 1, 'inceptisols': 2, 'entisols': 3, 'spodosols': 4, 'undefined': 5, 'ultisols': 6, 'andisols': 7, 'histosols': 8, 'oxisols': 9, 'vertisols': 10, 'aridisols': 11, 'gelisols': 12}
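
# Sanity checks on the loaded subsample (test_eq is provided by fastcore)
from fastcore.test import test_eq

test_eq(X.shape, (100, 1764))
test_eq(y.shape, (100,))
test_eq(len(X_names), 1764)
test_eq(depth_order.shape, (100,2))
test_eq(len(tax_lookup), 13)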
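
The printed lookup maps order names to ids, so turning the order ids stored in `depth_order` back into names requires inverting it. The sketch below assumes that the first column of `depth_order` is the depth and the second the Soil Taxonomy order id, which is consistent with the printed rows but not stated explicitly. The values are copied from the output above, and the names `id_to_name`, `rows` and `decoded` are illustrative.

```python
# Lookup copied from the printed output above (order name -> order id)
tax_lookup = {'alfisols': 0, 'mollisols': 1, 'inceptisols': 2, 'entisols': 3,
              'spodosols': 4, 'undefined': 5, 'ultisols': 6, 'andisols': 7,
              'histosols': 8, 'oxisols': 9, 'vertisols': 10, 'aridisols': 11,
              'gelisols': 12}

# First three rows of depth_order as printed above.
# Assumed column layout: [depth, order id]
rows = [[0.0, 1.0], [19.0, 4.0], [43.0, 12.0]]

# Invert the lookup: order id -> order name
id_to_name = {order_id: name for name, order_id in tax_lookup.items()}

# Ids are stored as floats in the array, so cast to int before the lookup
decoded = [(depth, id_to_name[int(order_id)]) for depth, order_id in rows]
print(decoded)
# [(0.0, 'mollisols'), (19.0, 'spodosols'), (43.0, 'gelisols')]
```

With the actual array returned by `load_kssl`, the same comprehension could iterate over `depth_order` directly instead of the hard-coded `rows`.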