src_dir = 'test'
fnames = ['spectra-features-smp.npy', 'spectra-wavenumbers-smp.npy',
          'depth-order-smp.npy', 'target-smp.npy',
          'tax-order-lu-smp.pkl', 'spectra-id-smp.npy']
X, X_names, depth_order, y, tax_lookup, X_id = load_kssl(src_dir, fnames=fnames)
Loading
Utility function to load MIRS spectra, measured exchangeable potassium and auxiliary data such as depth and Soil Taxonomy order
load_kssl
load_kssl (src_dir:str, fnames:List[str]=['spectra-features.npy', 'spectra-wavenumbers.npy', 'depth-order.npy', 'target.npy', 'tax-order-lu.pkl', 'spectra-id.npy'], loaders_lut:dict={'.npy': <function load at 0x7f2f96962550>, '.pkl': <built-in function load>})
Function loading USDA KSSL dataset focusing here on Exchangeable Potassium (analyte_id=725).
Returns: A tuple (X, X_names, depth_order, y, tax_lookup, X_id) with:
- X: spectra (numpy.ndarray)
- X_names: spectra wavenumbers (numpy.ndarray)
- depth_order: depth and order of samples (numpy.ndarray)
- y: exchangeable potassium content (numpy.ndarray)
- tax_lookup: lookup table order_name -> order_id (dictionary)
- X_id: unique id of spectra
| | Type | Default | Details |
|---|---|---|---|
| src_dir | str | | folder path containing data |
| fnames | typing.List[str] | ['spectra-features.npy', 'spectra-wavenumbers.npy', 'depth-order.npy', 'target.npy', 'tax-order-lu.pkl', 'spectra-id.npy'] | filenames to open (in order) |
| loaders_lut | dict | {'.npy': <function load at 0x7f2f96962550>, '.pkl': <built-in function load>} | loaders lookup table |
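To illustrate the idea behind a loaders lookup table, here is a minimal sketch of extension-based dispatch. It is not the actual `load_kssl` implementation: the names `LOADERS_LUT` and `load_files`, and the tiny demo files, are hypothetical and exist only for this example. It assumes the `.npy` entry is `numpy.load` and the `.pkl` entry is `pickle.load`, as the signature above suggests.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np

# Hypothetical stand-in for a loaders lookup table: file extension -> loader.
LOADERS_LUT = {'.npy': np.load, '.pkl': pickle.load}


def load_files(src_dir, fnames, loaders_lut=LOADERS_LUT):
    """Load each file in `fnames` (in order) with the loader matching its extension."""
    loaded = []
    for fname in fnames:
        path = Path(src_dir) / fname
        loader = loaders_lut[path.suffix]
        # Both np.load and pickle.load accept an open binary file object.
        with open(path, 'rb') as f:
            loaded.append(loader(f))
    return tuple(loaded)


# Demo on throwaway files (made-up values, not KSSL data).
with tempfile.TemporaryDirectory() as tmp:
    np.save(Path(tmp) / 'target.npy', np.array([0.1, 0.2, 0.3]))
    with open(Path(tmp) / 'tax-order-lu.pkl', 'wb') as f:
        pickle.dump({'alfisols': 0, 'mollisols': 1}, f)
    y_demo, tax_demo = load_files(tmp, ['target.npy', 'tax-order-lu.pkl'])

print(y_demo.shape, tax_demo)
```

With this kind of design, supporting another file format would only require adding an entry to the dictionary, and the order of the returned tuple follows the order of the filenames.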
Loads all required data in one call: the mid-infrared spectra (the features), the associated exchangeable potassium wet chemistry measurements (the target), and additional data such as wavenumbers, soil depth and Soil Taxonomy order.
For instance, to open a subsample of the dataset (see setup to download the full dataset):
print(f'X shape: {X.shape}')
print(f'y shape: {y.shape}')
print(f'Wavenumbers:\n {X_names}')
print(f'depth_order (first 3 rows):\n {depth_order[:3, :]}')
print(f'Taxonomic order lookup:\n {tax_lookup}')
X shape: (100, 1764)
y shape: (100,)
Wavenumbers:
[3999 3997 3995 ... 603 601 599]
depth_order (first 3 rows):
[[ 0. 1.]
[19. 4.]
[43. 12.]]
Taxonomic order lookup:
{'alfisols': 0, 'mollisols': 1, 'inceptisols': 2, 'entisols': 3, 'spodosols': 4, 'undefined': 5, 'ultisols': 6, 'andisols': 7, 'histosols': 8, 'oxisols': 9, 'vertisols': 10, 'aridisols': 11, 'gelisols': 12}
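The lookup printed above maps order names to integer ids, so decoding `depth_order` requires inverting it first. The sketch below does that for the three rows shown in the output. It assumes that the first column of `depth_order` holds the depth and the second the order id, which the output suggests but does not confirm. The variable names `depth_order_head`, `id_to_order` and `decoded` are introduced here for illustration only.

```python
# Lookup copied from the example output (order name -> integer id).
tax_lookup = {'alfisols': 0, 'mollisols': 1, 'inceptisols': 2, 'entisols': 3,
              'spodosols': 4, 'undefined': 5, 'ultisols': 6, 'andisols': 7,
              'histosols': 8, 'oxisols': 9, 'vertisols': 10, 'aridisols': 11,
              'gelisols': 12}

# First three rows of depth_order, copied from the example output.
depth_order_head = [[0., 1.], [19., 4.], [43., 12.]]

# Invert the lookup: integer id -> order name.
id_to_order = {order_id: name for name, order_id in tax_lookup.items()}

# Assumption: column 0 is the depth, column 1 is the order id.
decoded = [(depth, id_to_order[int(order_id)])
           for depth, order_id in depth_order_head]
print(decoded)
# → [(0.0, 'mollisols'), (19.0, 'spodosols'), (43.0, 'gelisols')]
```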
test_eq(X.shape, (100, 1764))
test_eq(y.shape, (100,))
test_eq(len(X_names), 1764)
test_eq(depth_order.shape, (100, 2))
test_eq(len(tax_lookup), 13)