pycvi.datasets.benchmark

A few samples from benchmarking datasets.

Datasets aggregated by Thomas Barton in his GitHub repository [Bart]:

“target”
“zelnik1”
“long1”

Datasets from the UCR Time Series Classification Archive [UCR]:

“Trace”
“SmallKitchenAppliances”

[UCR]

H. A. Dau, E. Keogh, K. Kamgar, C.-C. M. Yeh, Y. Zhu, S. Gharghabi, C. A. Ratanamahatana, Yanping, B. Hu, N. Begum, A. Bagnall, A. Mueen, G. Batista, and Hexagon-ML, “The UCR time series classification archive,” October 2018. https://www.cs.ucr.edu/~eamonn/time_series_data_2018/

[Bart]

T. Barton, “Clustering benchmarks,” https://github.com/deric/clusteringbenchmark, 2015. [Online; accessed 06-December-2023].

Functions

load_data([fname, data_source, verbose])

Get dataset and labels.

pycvi.datasets.benchmark.load_data(fname: str = 'target', data_source: str = 'barton', verbose: bool = False) → Tuple[numpy.ndarray, numpy.ndarray]

Get dataset and labels.

Parameters:

fname (str, optional) – Filename of the dataset, by default “target”.
data_source (str, optional) – Source of the dataset, by default “barton”.
verbose (bool, optional) – Verbosity, by default False.

Returns:

The dataset and labels

Return type:

Tuple[np.ndarray, np.ndarray]
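
The following is a minimal usage sketch, not part of the library. It assumes PyCVI and NumPy are installed and that the "target" dataset is available from the "barton" source, as the defaults in the signature above suggest. The helpers `describe` and `demo` are hypothetical names introduced only for this illustration, and the sketch assumes each label identifies one cluster.

```python
import numpy as np


def describe(X, y):
    """Summarize a dataset and its labels (hypothetical helper, not in PyCVI)."""
    X = np.asarray(X)
    y = np.asarray(y)
    # Count how many samples carry each distinct label.
    labels, counts = np.unique(y, return_counts=True)
    return {
        "data_shape": X.shape,
        "n_clusters": int(labels.size),
        "cluster_sizes": dict(zip(labels.tolist(), counts.tolist())),
    }


def demo(fname="target", data_source="barton"):
    """Load one benchmark dataset and summarize it (hypothetical helper)."""
    # Imported here so that describe() stays usable without PyCVI installed.
    from pycvi.datasets.benchmark import load_data

    # load_data returns a tuple: (dataset, labels), both NumPy arrays.
    X, y = load_data(fname=fname, data_source=data_source, verbose=False)
    return describe(X, y)
```

Calling `demo()` should return the shape of the data array together with the number of distinct labels and their sizes. The exact values depend on the dataset files shipped with the installed version, so none are shown here.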