Dataset added by Infochimps
“Features_and_friends.csv” contains 33 image features for 19,217 MySpace profile pictures. Also included is the number of friends for each user in the sample.
The columns are (roughly):
n – Number of brightness levels
pn – A measure of the distribution of these levels; a higher number means there are more regions with the same brightness
h,s,v – Average HSV color for the entire image
ch,cs,cv – Average HSV for the central portion of the image
r,g,b – Average RGB levels, as % of median luminosity
sr,sg,sba – Variance in RGB
h1..h4 – The quartiles for luminosity histogram
syx,syy – Measures of symmetry in X and Y axes
csy – Measure of symmetry in the central portion of the image
csyx – Location of vertical axis of greatest symmetry in image center
smx,smy – Measures of 1-pixel smoothness in X and Y directions
bytes – Size of the compressed file (in original file format)
is_gif – Was the original file a GIF
is_jpg – Was the original file a JPG
x,y – Dimensions of the image
ar – Aspect ratio = x/y
px – Number of pixels = x*y
ars – Landscape or Portrait
friends – Number of friends
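Since ar and px are described as derived from x and y, a quick sanity check on the file is to recompute them. The following is a minimal sketch only: it assumes the CSV has a header row using the column names listed above, which the description does not confirm, and the sample rows are made up for illustration rather than taken from the dataset. The function name is hypothetical.

```python
import csv
import io
import math


def check_derived_columns(csv_file, rel_tol=1e-3):
    """Return 1-based data row numbers where ar != x/y or px != x*y.

    Assumes a header row with columns named x, y, ar and px
    (an assumption; the real file layout may differ).
    """
    mismatches = []
    reader = csv.DictReader(csv_file)
    for row_number, row in enumerate(reader, start=1):
        x = float(row["x"])
        y = float(row["y"])
        ar = float(row["ar"])
        px = float(row["px"])
        # Compare with a tolerance, since ar may be stored rounded.
        ar_ok = math.isclose(ar, x / y, rel_tol=rel_tol)
        px_ok = math.isclose(px, x * y, rel_tol=rel_tol)
        if not (ar_ok and px_ok):
            mismatches.append(row_number)
    return mismatches


# Made-up sample rows, not real data from the file.
sample = io.StringIO(
    "x,y,ar,px,friends\n"
    "200,100,2.0,20000,150\n"
    "100,200,0.5,20000,42\n"
    "64,64,1.0,5000,7\n"
)
print(check_derived_columns(sample))  # prints [3]
```

To run it on the real file, pass an open file handle instead of the in-memory sample. Rows with y equal to zero would raise a division error, so filter those out first if they occur.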
The “Myspace_faces.tar” file contains 64×64 pixel profile pictures taken from 250,000 MySpace users. The data was collected in May 2009. The number of files is slightly smaller than the total number of users in the sample because some downloaded images were corrupt. The files are stored in 250 directories (000 to 249), each containing 1000 images. The images are renamed to have sequential numbers, with original file names scrubbed to remove anything which may directly identify any MySpace user account. A small percentage of the images include embedded comments within metadata, but nothing should be personally identifiable. If something is, oops.
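Because some images were corrupt, it can be useful to see how many files the archive actually holds per directory. Here is a rough sketch using Python's standard tarfile module. It assumes only that each image sits inside one of the numbered directories within the archive; the function names and the default of 1000 expected files per directory are illustrative choices, not something shipped with the dataset.

```python
import tarfile
from collections import Counter
from pathlib import PurePosixPath


def count_files_per_directory(tar_path):
    """Count regular files in the archive, keyed by parent directory name."""
    counts = Counter()
    with tarfile.open(tar_path) as archive:
        for member in archive:
            if member.isfile():
                # "000/123.gif" and "faces/000/123.gif" both map to "000".
                directory = PurePosixPath(member.name).parent.name
                counts[directory] += 1
    return counts


def short_directories(counts, expected=1000):
    """Return the directories holding fewer files than expected."""
    return {name: n for name, n in counts.items() if n < expected}
```

Calling count_files_per_directory("Myspace_faces.tar") and passing the result to short_directories would list the directories with fewer than 1000 files. The archive is read sequentially, so this makes one pass over the tar without extracting anything.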
Note that a significant number of the pictures are animated, and this animation has been retained in the process of converting to 64×64 GIFs. The aspect ratio of the original image has been lost in this process. Sorry. Apart from resizing, no other image processing has taken place. The file format conversion (from GIF and JPG) and resizing were performed by ImageMagick.
I can’t share how the users were chosen for inclusion in this sample. Even without those details, it is safe to treat the selection as random browsing through the friends lists of randomly chosen users.
This file is provided as a resource to researchers, and the original images remain the property of whoever currently owns them. All images were accessed via publicly available MySpace image servers and, as I see it, that makes them usable by the general public.