MovieLens have published samples of their data through GroupLens for building machine learning and analytics tools.

Brief
The "10M" dataset of 10 million movie ratings was chosen for this purpose. Key characteristics of users' preferences were analysed through a number of database queries in Hive, and a recommendation model was built using Mahout.

Progress
Two posts were published on liamgavinmurray.com. The first, available here, explains what the data shows about patterns of user preferences. The second, available here, gives a concrete example of how to build a recommender tool.
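
To give a flavour of the two steps described above, here is a minimal sketch in plain Python. It is not the Hive or Mahout code from the posts: it is a small stand-in that assumes the `UserID::MovieID::Rating::Timestamp` line format of the 10M ratings file, computes an average rating per movie (roughly what a grouped average query would return), and then scores unseen movies for one user with a simple user-based cosine similarity. The function names and the sample rows are made up for illustration.

```python
from collections import defaultdict
from math import sqrt


def parse_ratings(lines):
    """Parse 'UserID::MovieID::Rating::Timestamp' lines into {user: {movie: rating}}."""
    ratings = defaultdict(dict)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        user, movie, rating, _timestamp = line.split("::")
        ratings[int(user)][int(movie)] = float(rating)
    return dict(ratings)


def average_rating_per_movie(ratings):
    """Average rating per movie, similar in spirit to GROUP BY movie with AVG(rating)."""
    totals = defaultdict(lambda: [0.0, 0])  # movie -> [sum of ratings, count]
    for movies in ratings.values():
        for movie, rating in movies.items():
            totals[movie][0] += rating
            totals[movie][1] += 1
    return {movie: total / count for movie, (total, count) in totals.items()}


def cosine_similarity(a, b):
    """Cosine similarity between two users' rating dictionaries."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[m] * b[m] for m in common)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def recommend(ratings, user, top_n=3):
    """Rank movies the user has not rated by a similarity-weighted average rating."""
    scores = defaultdict(float)
    weights = defaultdict(float)
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        if sim <= 0:
            continue
        for movie, rating in other_ratings.items():
            if movie not in ratings[user]:
                scores[movie] += sim * rating
                weights[movie] += sim
    predicted = {movie: scores[movie] / weights[movie] for movie in scores}
    return sorted(predicted.items(), key=lambda kv: kv[1], reverse=True)[:top_n]


# Hypothetical sample rows in the assumed format (not real dataset values).
SAMPLE = [
    "1::10::5::838985046",
    "1::20::3::838983525",
    "2::10::4::838983392",
    "2::20::2::838983421",
    "2::30::5::838983653",
    "3::10::1::838984474",
    "3::30::2::838983707",
]

ratings = parse_ratings(SAMPLE)
averages = average_rating_per_movie(ratings)
recommendations = recommend(ratings, user=1)
print(averages)
print(recommendations)
```

On the full dataset this approach would be slow, since it compares the target user against every other user in memory; that is part of the reason tools such as Hive and Mahout are used for data of this size.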