CMU Machine Learning Project

Links to several data sets including:

The StarPlus fMRI page contains data, software, and documentation for the StarPlus fMRI data set.

The Berkeley Segmentation Dataset and Benchmark contains 12,000 hand-labeled segmentations of 1,000 Corel dataset images, produced by 30 human subjects. Half of the segmentations were obtained by presenting the subject with a color image; the other half by presenting a grayscale image. The public benchmark based on this data consists of all of the grayscale and color segmentations for 300 images. The images are divided into a training set of 200 images and a test set of 100 images.

The Twenty Newsgroups text data set contains 1,000 text articles posted to each of 20 online newsgroups, for a total of 20,000 articles. This data is useful for a variety of text classification and clustering projects.
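
As a rough illustration of the kind of text classification project this data supports, the sketch below trains a small multinomial Naive Bayes classifier with add-one smoothing. It does not load the newsgroup articles themselves: the four-sentence corpus, the label names "sports" and "computers", and all function names are hypothetical stand-ins chosen for the example, and the whitespace tokenizer is deliberately crude.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace; a deliberately crude tokenizer."""
    return text.lower().split()


def train_naive_bayes(labeled_docs):
    """Count documents per label and word occurrences per label."""
    doc_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in labeled_docs:
        doc_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocabulary.add(word)
    return doc_counts, word_counts, vocabulary


def classify(text, model):
    """Return the label with the highest log-posterior (add-one smoothing)."""
    doc_counts, word_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        # Log prior: fraction of training documents carrying this label.
        score = math.log(doc_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocabulary)
        for word in tokenize(text):
            if word in vocabulary:  # skip words never seen in training
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy corpus standing in for real newsgroup articles.
TOY_DOCS = [
    ("the pitcher threw a fast ball in the ninth inning", "sports"),
    ("the team won the game with a late home run", "sports"),
    ("the new graphics card renders the image faster", "computers"),
    ("install the driver before you plug in the card", "computers"),
]

model = train_naive_bayes(TOY_DOCS)
prediction = classify("which driver does this graphics card need", model)
print(prediction)
```

With the real data, each article would take the place of a toy sentence and its newsgroup name would serve as the label; a held-out portion of the articles would then be needed to estimate accuracy.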