GitHub page [Twitter Processing pipeline] [Twitter Collection utils] [Twitter Clustering]

Short Hadoop tutorial

Periodic kernel [GPML] [GPy] [GPy.README] [Paper]

Script for processing PubMed articles and zoning [Paper]


Political ideology user dataset [Readme]

Paraphrase choice based on user personality traits [Readme]

Twitter Dark Triad (narcissism, psychopathy, Machiavellianism) dataset [Readme]

Twitter Dark Triad (narcissism, psychopathy, Machiavellianism, combined score) prediction models [Readme]

Twitter profile images with gender, text-predicted age and Big Five personality [Readme]

Valence and arousal ratings for 2895 Facebook status updates

Paraphrase choice based on user attributes (age, gender and occupational class)

Facebook statuses annotated with valence and arousal

User occupation Twitter dataset [Readme]

Word clusters (NPMI, Word2Vec, GloVe) [Readme]

EU News summaries [Paper]

Twitter hashtag time series [Paper]

Foursquare frequent users [Paper] [Readme]

UK geolocated users [Paper]

UK cities with geolocation, regions, population and DBpedia links [Paper]

#MSM2013 Tweet entities
Dataset consisting of 4341 tweets annotated with 4 entity types (see Workshop on Making Sense of Microposts – #MSM2013)