copt.datasets.load_rcv1

copt.datasets.load_rcv1(md5_check=True, subset='full')

Download and return the RCV1 dataset.

This is the binary classification version of the dataset, as distributed by the LIBSVM dataset project.

Parameters
  • md5_check – bool. Whether to verify the MD5 checksum of the downloaded files.

  • subset – string. One of 'full' (the full dataset), 'train' (only the training set), or 'test' (only the test set).
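
As a rough illustration of what an md5 check on a downloaded file involves, the following sketch hashes a file in chunks with Python's standard hashlib module and compares the result to an expected digest. It is not copt's actual implementation: the helper names md5_of_file and md5_matches, and the chunk size, are assumptions made for this example only.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at `path`.

    Hypothetical helper for illustration; reads the file in chunks so that
    large downloads do not need to fit in memory.
    """
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter(callable, sentinel) keeps reading until f.read returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def md5_matches(path, expected_hex):
    """Return True if the file's MD5 digest equals `expected_hex`."""
    return md5_of_file(path) == expected_hex.lower()
```

A check of this kind guards against truncated or corrupted downloads; passing md5_check=False presumably skips the verification step.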

Returns
  • X – scipy.sparse CSR matrix

  • y – numpy array. Labels, only takes values 0 or 1.
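
A possible usage sketch follows. It assumes that the two documented return values come back as a tuple (X, y), and that copt is installed with network access available on first use. The helpers describe_dataset and load_and_describe are hypothetical names introduced here, not part of copt.

```python
def describe_dataset(X, y):
    """Summarize a feature matrix X (any object with .shape) and 0/1 labels y."""
    n_samples, n_features = X.shape
    n_positive = sum(1 for label in y if label == 1)
    return {
        "n_samples": n_samples,
        "n_features": n_features,
        "n_positive": n_positive,
        "n_negative": len(y) - n_positive,
    }


def load_and_describe(subset="train"):
    """Load one RCV1 subset through copt and summarize it.

    Assumption: load_rcv1 returns the pair (X, y) described under Returns.
    The import is done lazily so that defining this function does not
    itself require copt to be installed.
    """
    from copt.datasets import load_rcv1

    X, y = load_rcv1(md5_check=True, subset=subset)
    return describe_dataset(X, y)
```

Calling load_and_describe('test') would then download the data if needed and report the matrix shape together with the count of each label.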