Resource efficient algorithms for message sampling in online social networks

Published in 2020 Seventh International Conference on Social Networks Analysis, Management and Security (SNAMS), 2020

Sampling the network structure of online social networks is a widely discussed topic, as it enables a wide variety of research in computational social science and associated fields. However, sampling and analyzing the content of messages still lacks effective solutions. Previous work on retrieving messages from social networks either used endpoints that are not available to the general research community or analyzed a predefined stream of messages. We use features of the Twitter API to construct a data structure that optimizes the efficiency of requests sent to the social network. Moreover, we present a strategy for selecting users to sample, which improves the effectiveness of our query-optimizing data structure by leveraging existing models of user behavior. Combining our data structure with our proposed algorithm, we achieve a 92% sampling efficiency over long timeframes.
