ken.sto {soil.spec}    R Documentation

Sample selection based on the Kennard-Stone algorithm

Description

The function selects the most representative samples of a sample set based on a Euclidean distance measure. One can (i) select a number or a percentage of a sample set or (ii) divide a sample set into a calibration set and a representative validation set.

Usage

ken.sto(inp, per = "T", per.n = 0.3, num, va = "F", sav = "T", path = "", out = "Sel")

Arguments

inp

a numerical matrix or data.frame containing the input spectra

per

a logical flag (given as "T" or "F") indicating whether the selected samples should be a percentage of inp (given in per.n) or a set number (given in num). The default "T" takes a percentage.

per.n

a numerical value between 0 and 1 giving the proportion of inp to select; used when per is "T".

num

a numerical value between 1 and the sample number minus 1 giving the number of samples to select; used when per is "F".

va

a logical flag (given as "T" or "F") indicating whether to select samples out of inp ("F", the default) or to divide them into a calibration and a validation set ("T").

sav

a logical value indicating whether the function output shall be saved.

path

a character string giving the path where the function output shall be saved.

out

a character string giving the name of the function output; used when sav is "T".

Details

Sample selection follows a procedure adapted from Kennard & Stone (1969). It is a stepwise procedure that, at each step, maximizes the Euclidean distance to the objects already chosen; distances are computed in the space of the important principal components. The number of important principal components is selected so that the increase in cumulative explained variance within the next three components is lower than 4 percent. The starting samples are the two extreme samples (the most negative and the most positive ones) of the important principal components.
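The procedure described above can be sketched in base R. This is a minimal illustration, not the code used by ken.sto: the helper names n_important_pcs and kennard_stone are hypothetical, the stopping rule is one possible reading of the 4 percent criterion, and the selection uses the classic Kennard-Stone start (the two most distant samples) and maximin step, which may differ from the package's adapted choice of starting samples.

```r
# Hypothetical helper: one reading of the rule for the number of important PCs.
# Returns the smallest k for which PCs k+1 .. k+window together add less than
# `threshold` to the cumulative explained variance.
n_important_pcs <- function(x, window = 3, threshold = 0.04) {
  pca <- prcomp(x, center = TRUE, scale. = FALSE)
  cum_var <- cumsum(pca$sdev^2) / sum(pca$sdev^2)
  k_max <- length(cum_var) - window
  if (k_max < 1) return(length(cum_var))
  for (k in seq_len(k_max)) {
    if (cum_var[k + window] - cum_var[k] < threshold) return(k)
  }
  length(cum_var)  # criterion never met: keep all components
}

# Hypothetical helper: classic Kennard-Stone selection on a score matrix.
# Assumes at least two distinct samples.
kennard_stone <- function(scores, n_select) {
  stopifnot(n_select >= 2, n_select <= nrow(scores))
  d <- as.matrix(dist(scores))  # pairwise Euclidean distances
  # Classic start: the two samples that are farthest apart.
  chosen <- as.vector(which(d == max(d), arr.ind = TRUE)[1, ])
  while (length(chosen) < n_select) {
    remaining <- setdiff(seq_len(nrow(scores)), chosen)
    # Distance of each remaining sample to its nearest chosen sample.
    min_d <- apply(d[remaining, chosen, drop = FALSE], 1, min)
    # Add the sample whose nearest chosen neighbour is farthest away.
    chosen <- c(chosen, remaining[which.max(min_d)])
  }
  chosen
}

set.seed(1)
spectra <- matrix(rnorm(40 * 10), nrow = 40)  # simulated stand-in for inp
k <- n_important_pcs(spectra)
scores <- prcomp(spectra)$x[, seq_len(k), drop = FALSE]
sel <- kennard_stone(scores, n_select = 12)
```

Here sel holds the row numbers of the selected samples, in the order in which they were chosen.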

With per.n equal to 0.4 and va equal to "F", 40 percent of the sample set is chosen. With va equal to "T", the validation set comprises 40 percent of the sample set.
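The resulting set sizes can be illustrated with a short sketch. The helper set_sizes is hypothetical and not part of soil.spec, and the rounding rule is an assumption, since the documentation does not state how fractional sample counts are handled.

```r
# Hypothetical helper: set sizes implied by per.n and va.
# Rounding to the nearest integer is an assumption.
set_sizes <- function(n_samples, per.n, va = "F") {
  n_sel <- round(n_samples * per.n)
  if (va == "T") {
    # The selected share forms the validation set, the rest the calibration set.
    c(calibration = n_samples - n_sel, validation = n_sel)
  } else {
    c(selected = n_sel)
  }
}

sizes_f <- set_sizes(200, per.n = 0.4, va = "F")  # 80 selected samples
sizes_t <- set_sizes(200, per.n = 0.4, va = "T")  # 120 calibration, 80 validation
```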

A plot is produced showing the selected samples in the principal component space (only the important PCs). This is the same plot generated by plot.ken.sto.

Value

ken.sto returns a list with class "ken.sto" containing the following components:

Calibration and validation set

the logical object va.

Number important PC

integer giving the number of important principal components, which determines the choice of the starting samples.

PC space important PC

score value matrix of important principal components.

Chosen samples names

chosen sample names when va is equal to "F".

Chosen row number

chosen row numbers when va is equal to "F".

Chosen calibration sample names

chosen calibration sample names when va is equal to "T".

Chosen calibration row number

chosen calibration row numbers when va is equal to "T".

Chosen validation sample names

chosen validation sample names when va is equal to "T".

Chosen validation row number

chosen validation row numbers when va is equal to "T".
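Because the component names contain spaces, they are most easily accessed with [[ ]]. The sketch below assumes that the list names are exactly the component titles listed above; it uses a mock object with made-up values in place of a real ken.sto result, and spectra is a simulated stand-in for inp.

```r
# Mock object standing in for a ken.sto() result with va = "T".
# The values are made up; the names are assumed to match the titles above.
res <- structure(
  list(
    "Calibration and validation set" = "T",
    "Chosen calibration row number" = c(1L, 3L, 4L),
    "Chosen validation row number" = c(2L, 5L)
  ),
  class = "ken.sto"
)

spectra <- matrix(seq_len(5 * 4), nrow = 5)  # simulated stand-in for inp

# Subset the input by the stored row numbers.
cal <- spectra[res[["Chosen calibration row number"]], , drop = FALSE]
val <- spectra[res[["Chosen validation row number"]], , drop = FALSE]
```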

Author(s)

Thomas Terhoeven-Urselmans

References

Kennard, R. W. and Stone, L. A. (1969) Computer aided design of experiments. Technometrics 11(1), 137-148.

Examples

## Not run: sel <- ken.sto(inp, per = "T", per.n = 0.3, num, va = "F", sav = "T", path = "", out = "Sel")
## Not run: plot(sel)  # dispatches to plot.ken.sto
 

[Package soil.spec version 0.2.0 Index]