Program Overview
09:00 - 09:20 Introduction
09:20 - 10:10 Marton Karsai - Socioeconomic correlations and stratification in social-communication networks
10:10 - 10:30 Kyriaki Kalimeri - Automatic Prediction of Youth Unemployment via Social Media Data
10:30 - 11:00 Coffee break
11:00 - 11:20 David Pastor Escuredo - Livelihoods characterization for food security monitoring using big and small data
11:20 - 12:10 José Ramasco - Immigrant community integration in world cities
12:10 - 12:30 Alfredo Morales - Mapping segregation in urban areas
12:30 - 12:50 Yerka Freire Vidal - Hated in the Nation? Quantifying Attitudes Toward Immigration Using Twitter
13:00 - 14:30 Lunch
14:30 - 14:50 Francesca Lipari - Complexity science and social change policies: a theoretical framework for diagnostic
14:50 - 15:10 Julian Vicens - Addressing social justice and unveiling vulnerabilities through game theory and collective experiments
15:10 - 16:00 Daniela Paolotti - It takes a village - how collaborations in data science for social good can make a difference
16:00 - 16:30 Coffee break
16:30 - 16:50 Emanuele Massaro - Resilience management during large-scale epidemic outbreaks
16:50 - 17:10 Martin Minnoni - Using Machine Learning to Uncover the Spread of Chagas Disease in Latin America
17:10 - 17:25 Closing remarks
Detailed Program
09:00 - 09:20 Introduction
09:20 - 10:10 Marton Karsai - Socioeconomic correlations and stratification in social-communication networks
Our understanding of the structure and dynamics of social systems has developed considerably in recent years, thanks to the availability of large digital datasets recording the interactions of millions of individuals. Research has focused in particular on mobile communication datasets, which give us the opportunity to explore the structure and evolution of social networks. However, although these studies consider the temporal and spatial character of human interactions, they miss one important dimension: the socioeconomic status of individuals, which may largely determine the social structure itself. The uneven distribution of wealth and individual economic capacities is among the main forces that shape modern societies and arguably bias the emerging social structures. In this talk, we will discuss a set of results aiming to close this gap through the analysis of coupled datasets recording the mobile phone communications and bank transaction history of one million anonymised individuals living in a Latin American country. We show that wealth and debt are unevenly distributed among people, in agreement with the Pareto principle; that the observed social structure is strongly stratified, with people being better connected to others of their own socioeconomic class than to others of different classes; that the social network exhibits assortative socioeconomic correlations and tightly connected "rich clubs"; and that egos from the same class live closer to each other but commute further if they are wealthier. In addition, we will discuss how typical consumption patterns are strongly correlated with the identified socioeconomic classes. These results are based on a representative, society-wide population, and empirically demonstrate some long-standing hypotheses on socioeconomic correlations that potentially lie behind social segregation and induce differences in human mobility.
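The stratification claim above (people being better connected within their own class) can be illustrated with a minimal sketch. It compares the observed share of same-class links with the share expected if link ends were paired at random. The function names, the two-class labelling and the toy network are hypothetical and are not taken from the talk; the study itself may use different measures.

```python
from collections import Counter

def same_class_edge_fraction(edges, node_class):
    """Fraction of edges whose two endpoints share a socioeconomic class."""
    same = sum(1 for u, v in edges if node_class[u] == node_class[v])
    return same / len(edges)

def expected_same_class_fraction(edges, node_class):
    """Same-class fraction expected if edge ends were paired at random.

    Uses the share of edge endpoints held by each class, so classes
    with many connections weigh more than classes with few.
    """
    stubs = Counter()
    for u, v in edges:
        stubs[node_class[u]] += 1
        stubs[node_class[v]] += 1
    total = sum(stubs.values())
    return sum((count / total) ** 2 for count in stubs.values())

# Toy data (hypothetical): six individuals in two classes.
node_class = {"a": "low", "b": "low", "c": "low",
              "d": "high", "e": "high", "f": "high"}
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("c", "d")]

observed = same_class_edge_fraction(edges, node_class)
expected = expected_same_class_fraction(edges, node_class)
print(f"observed={observed:.3f} expected={expected:.3f}")
```

An observed value well above the random expectation is one simple sign of assortative mixing by class.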
10:10 - 10:30 Kyriaki Kalimeri - Automatic Prediction of Youth Unemployment via Social Media Data
This study aims at improving the current understanding of the factors driving active job search, placing the focal point on the unemployed population, which is often at risk of social marginalisation [3]. The failure to tap into economic aspirations limits not only income and skill development but also the likelihood of later employability. The main aim of the study is twofold: (i) to automatically identify the unemployed population by inference from their online digital traces, and (ii) to uncover digital behaviours of the unemployed community that are easily accessible from online social platforms and that can indicate the most privileged communication channels for unemployment or educational advertising campaigns. Taking advantage of the popularity of social platforms, we created an ad-hoc Facebook-hosted application whose major function is to administer psychometric questionnaires and quizzes. Upon acquiring participants' informed consent, we gathered information regarding their public Facebook profile and their "Likes" on Facebook Pages. The application acts as an innovative data-collection tool, providing rich nonverbal cues for behavioural understanding and profiling along with validated information from self-reported psychological assessments. The information gathered included (i) demographic questions (i.e. gender, employment status, residence etc.), (ii) personality traits [1], (iii) moral foundations [2], (iv) questions about the participants' life satisfaction level, and (v) work-life balance status. The goal is to assess how much of the information gleaned by questionnaires is behaviourally observable, which in turn relates to the scalability of the approach, since behavioural observation at scale is usually simpler and more cost-effective than large-scale survey campaigns. We employed a classification design based on random forest models with 5-fold cross-validation to automatically identify the employment status and the gender of the participants.
We inferred these two attributes based only on the Likes on the Facebook pages, the category of the page, and the total activity of the participant (in terms of likes). According to our preliminary findings, the employment status and the gender of the participants are predicted with an accuracy (weighted AUROC) of 78% and 96% respectively, as reported in Table 1. Figure 1 provides a preview of the application. Furthermore, we provide useful insights into the socio-cultural attributes that characterise the two communities in terms of psychometric attributes and interests. For instance, the unemployed were found to be more interested in discount and coupon pages, while the employed were more interested in political satire pages. These pages emerged as top indicators of employment status from the internal decision process of the classification algorithm. Statistically significant differences between the two communities were also noted in their personality traits and moral values. Here, the unemployed were found to be fonder of family bonds, tradition and social binding values in general than the employed participants.
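As a rough illustration of the evaluation protocol mentioned above, the sketch below shows how 5-fold splits can be generated and how an AUROC can be computed from predicted scores. It is not the authors' pipeline: the classifier is omitted (a library such as scikit-learn would normally supply the random forest), the class weighting of the reported "weighted AUROC" is not reproduced, and the function names and toy scores are hypothetical.

```python
def auroc(labels, scores):
    """AUROC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative score counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples)
                 if i < start or i >= start + size]
        yield train, test
        start += size

# Toy scores (hypothetical): 1 = unemployed, 0 = employed.
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.2, 0.6, 0.6, 0.4]
example_auc = auroc(labels, scores)

folds = list(k_fold_indices(12, k=5))
print(example_auc, [len(test) for _, test in folds])
```

In a full experiment, the model would be fitted on each training split, scored on the matching test split, and the per-fold AUROC values averaged.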
10:30 - 11:00 Coffee break
11:00 - 11:20 David Pastor Escuredo - Livelihoods characterization for food security monitoring using big and small data
It is estimated that there are 16 million children suffering from malnutrition worldwide. They are more exposed to disease transmission and are also more vulnerable to external conditions. Even in this century, 2 million children still die of malnutrition every year. Nutritional status can be estimated as a function of diet and health; it therefore depends on food security, which may vary greatly among regions in developing countries. Food security can be affected by shocks, socio-economic crises, market prices and disease outbreaks that have an impact on mobility and accessibility to services.
Livelihoods characterize the means of living and production of regions in developing countries. Each livelihood can be considered as a complex system with its own characteristics and dynamics. The relationship between livelihoods establishes the frame for the national economy and development. We propose a framework to investigate livelihoods using big data (satellite data and Call Detail Records) and small data (household and health surveys). The framework aims at integrating heterogeneous and multi-scale data for systematic analysis of livelihoods using the seasonal calendars of crops, labor markets and herding as the basis for interpretation. The analysis of big data allows for the identification of relevant mobility profiles in a population, whereas the analysis of small data is suitable for estimating accessibility to resources.
Mobile phone data (made available for the D4D Challenge) allows finding distinct human mobility profiles based on the digital trace of mobile phone users, characterized by a Matrix of Individual Trajectories (IT-Matrix). This matrix enables multi-scale representations along time and space, which can be used to extract aggregated indicators and census. Unsupervised clustering of individual trajectories generates mobility profiles (clusters of similar individual trajectories) that characterize relevant group behaviors while preserving optimal aggregation levels for detailed and privacy-secured mobility characterization. The analysis of monthly mobility patterns at the livelihood zone resolution resulted in the discovery and characterization of seasonal mobility profiles related to economic activities, agricultural calendars and rainfall. The mobility profiles could support the timely identification of mobility changes in vulnerable populations in response to external shocks.
Nutrition and health household surveys (SMART and SQUEAC/SLEAC, provided by ACH) were used to estimate the level of nutrition and accessibility to services. We used the livelihood spatial regions to segment the surveys into groups and found that the malnutrition level is modulated over the year by the seasonal calendar and is affected by seasonal mobility, as the more vulnerable population is left unprotected during labour migration. We also discovered that the reasons for poor access to health services depend on the livelihood characteristics, for instance the distance to services in livelihoods dominated by pastoralism.
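The idea of turning individual mobile phone traces into a trajectory matrix and then into an aggregated monthly indicator can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not define the IT-Matrix in detail, so the layout used here (one row per user, one dominant zone per month), the definition of "home" and all names and records are hypothetical. The clustering step is not shown.

```python
from collections import Counter, defaultdict

def build_trajectory_matrix(records, n_months=12):
    """Return {user: [dominant zone per month]}; None where a month has no records."""
    counts = defaultdict(lambda: [Counter() for _ in range(n_months)])
    for user, month, zone in records:
        counts[user][month][zone] += 1
    return {
        user: [c.most_common(1)[0][0] if c else None for c in monthly]
        for user, monthly in counts.items()
    }

def share_away_from_home(matrix):
    """Per month, the share of observed users located outside their home zone.

    "Home" is assumed to be each user's most frequent monthly zone.
    """
    n_months = len(next(iter(matrix.values())))
    away = [0] * n_months
    seen = [0] * n_months
    for row in matrix.values():
        home = Counter(z for z in row if z is not None).most_common(1)[0][0]
        for month, zone in enumerate(row):
            if zone is None:
                continue
            seen[month] += 1
            if zone != home:
                away[month] += 1
    return [a / s if s else None for a, s in zip(away, seen)]

# Toy records (hypothetical): (user, month index, livelihood zone).
records = [
    ("u1", 0, "A"), ("u1", 0, "A"), ("u1", 0, "B"),
    ("u1", 1, "B"), ("u1", 2, "A"),
    ("u2", 0, "A"), ("u2", 1, "A"), ("u2", 2, "A"),
]
matrix = build_trajectory_matrix(records, n_months=3)
away_share = share_away_from_home(matrix)
print(matrix["u1"], away_share)
```

Only the aggregated share leaves the function, which is in the spirit of the privacy-preserving aggregation the abstract mentions.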
11:20 - 12:10 José Ramasco - Immigrant community integration in world cities
As a consequence of the accelerated globalization process, major cities all over the world are today characterized by increasing multiculturalism. The integration of immigrant communities may be affected by social polarization and spatial segregation. How are these dynamics evolving over time? To what extent are the different policies launched to tackle these problems working? These are critical questions traditionally addressed by studies based on surveys and census data. Such sources are robust against spurious biases, but collecting the data is intensive and rather expensive work. Here, we conduct a comprehensive study of immigrant integration in 53 world cities by introducing an innovative approach: an analysis of the spatio-temporal communication patterns of immigrant and local communities based on language detection in Twitter and on novel metrics of spatial integration. We quantify the "Power of Integration" of cities (their capacity to spatially integrate diverse cultures) and characterize the relations between different cultures when acting as hosts or immigrants.
12:10 - 12:30 Alfredo Morales - Mapping segregation in urban areas
Segregation by income challenges governments and the stability of civil society. Despite its potentially harmful consequences, we have neither a clear understanding of its social dynamics nor new ways to map its spatial distribution in urban areas. The recent availability of data from electronic media has enabled an unprecedented ability to analyze societies across multiple scales of observation, spanning from individual actions up to large-scale collective behaviors. In this work, we analyze geolocated Twitter data in order to measure and locate patterns of social segregation in urban areas. We found that the highest and lowest socioeconomic strata of society are highly segregated from each other and that such segregation has an effect on their emerging behaviors and cultural features.
12:30 - 12:50 Yerka Freire Vidal - Hated in the Nation? Quantifying Attitudes Toward Immigration Using Twitter
Immigration has been a key topic in the last decade, in large part because of how people react to it. Fields such as psychology and sociology have dealt extensively with the issue, defining theories that explain the attitudes of people toward immigration: the Intergroup contact theory and the Integrated threat theory. The contact theory suggests that some people exhibit positive attitudes that arise from the idea that immigration generates cultural and economic enrichment. Conversely, the threat theory suggests that negative attitudes are explained by competition for access to scarce resources. Given that some of these attitudes can lead to acts of violence, discrimination and abuse, both between local people and toward immigrants, it is important to understand and identify the factors that enhance these attitudes. This knowledge would help prevent such acts, and would inform and even support public policies that help mitigate negative attitudes and enhance positive ones.
13:00 - 14:30 Lunch
14:30 - 14:50 Francesca Lipari - Complexity science and social change policies: a theoretical framework for diagnostic
The world is rife with collective practices that are harmful or maladaptive and that may violate human rights: for example, Female Genital Mutilation, lack of sanitation, child marriage, corruption or racketeering. It is often argued that the causes of the persistence and dynamics of these practices can be found in the social, political and economic conditions of the countries in which they are practiced. Yet improvements in economic and socio-political conditions are not necessarily associated with a decrease in or disappearance of the practices, especially when a practice is supported by shared social expectations and complex interactions. This is the case, for example, when FGM/C is a well-established tradition related to gender or group identity, to shared beliefs about beauty, health and cleanliness, or is otherwise supported by norms of purity, honor and fidelity. Corruption, for its part, is a complex social phenomenon: the motivations to engage in corrupt behavior are multifaceted and result from interactions at the micro-, meso- and macro-level (Bicchieri and Ganegoda, 2016; Dimant and Schulte, 2016). What all these practices have in common is their collective nature, which is ultimately a cluster of individual behaviours. Thus, if we want to understand them and to implement policies for change, we have to understand why individuals behave in certain ways. Several features characterize people’s behavior. One important aspect is that it is often influenced by what others do, i.e. interactions, and by what others think should be done, i.e. a social norm (Bicchieri, 2006, 2014). Another is that interactions have effects over time, generating feedback: actions at a given moment shape the possibilities and constraints of the following moments.
These interactions not only influence the way individuals learn and adapt their behaviours from others, but also alter people’s preferences and decision-making in an unpredictable manner, generating bottom-up, emergent social practices. Hence, collective practices lend themselves to analysis through the lens of complex systems. But what about the diagnostics of the practice? Are the diagnostic tools aligned with the analytical framework of complex systems? And what about the public policies necessary to correct those harmful practices? Does policy design make use of complex systems? Policymakers operate in a neoclassical framework for the most part. They tend to evaluate policy interventions by estimating the impact a given policy change might have on the economy and comparing this to what would happen if that policy were not pursued. Complexity economics, on the other hand, suggests that since the economy is a complex, adaptive and dynamic system, it is inherently difficult to predict outcomes and responses to particular policy changes (Ormerod 2010). This presents immediate challenges for policymakers. A deeper understanding of the relationship between macro outcomes and individual decisions is therefore needed for policy formulation. Solutions under complexity tend not to be based on deductive analysis or top-down approaches, but explore interaction and behaviour using a bottom-up approach. To illustrate how to add complexity tools to the diagnostics of social change policies, I will present several case studies of bottom-up social change.
14:50 - 15:10 Julian Vicens - Addressing social justice and unveiling vulnerabilities through game theory and collective experiments
Vulnerabilities of both disadvantaged communities and individuals can be visualized and revealed in many forms and across a wide variety of contexts. Vulnerabilities of given collectives, or of particular individuals who play specific roles within a community, can emerge around important societal challenges such as climate change and its mitigation actions, public health and the current trend to reinforce care-at-home services, and the health impact on citizens of strong inequalities in city air quality. The existing palette of strategies to identify emerging inequalities can be complemented by measuring tensions within a community or a context by means of a game-theoretical approach. This perspective allows us to observe and analyze the emergence of behaviours from the interaction between individuals when they face a set of proposed social dilemmas. In this way it is possible to measure multiple behavioural traits such as trust, reciprocity, risk-aversion or cooperation. In recent years, we have been exploring this perspective in the study of social systems through a set of lab-in-the-field behavioural experiments coupled with citizen science practices. The collective experiments we have developed have the potential to reveal not only how social systems interact, but also how social and contextual factors influence the adaptation, vulnerability and resilience of individuals with regard to specific social issues within a given community. We aim to illustrate these interventions with three specific social experiments. The first experiment [1] is The Climate Game, a collective experiment that proposes a “public goods game” to a general population (n=320) in order to study the effects of economic inequalities between participants when facing a global problem such as climate change.
We found that the effort distribution was highly inequitable, with participants with fewer resources contributing significantly more to the public good than the richer ones, consequently increasing the inequalities among them (Fig.1). The second experiment [2], Games for Mental Health, addresses the vulnerability of persons with mental disorders and demonstrates the potential of the mental healthcare community model. It places individuals with mental disorders at the core of the analysis while emphasizing the crucial role that relatives, caregivers, professionals, and the community as a whole play in the recovery process. To this end, a number of behavioral experiments were carried out within mental health communities around Catalunya (n=270) to study the behavior of participants with mental disorders and of all members of their immediate surroundings in terms of cooperation, optimism, trust, reciprocity and collective sense. A clear-cut association emerged between individual behavior and the role played in the recovery process. One of the main conclusions is that community care perspectives should take group composition into account to deal with potential inequalities arising from different capabilities (Fig.2). The last experiment introduced is Games xAire, in which we point to inequalities in exposure to NO2 concentration across districts in Barcelona. We then explore the correlation between unequal exposure to air pollution and individual willingness to reduce its impact. We propose a “public goods game” to the participants (n>300), whose common objective is to contribute to a common fund in order to take measures to reduce NO2 levels in the city. We introduce economic inequalities between the participants based on the air quality in their respective neighborhoods, with the objective of studying who bears the cost of the collective action.
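For readers unfamiliar with public goods games, the sketch below shows the textbook linear version with unequal endowments and the "relative effort" (contribution divided by endowment) that makes inequitable burden-sharing visible. The rules, the multiplier and the numbers are assumptions chosen for illustration; the actual experiments described above may use different rules (for example a collective target), and the names here are hypothetical.

```python
def public_goods_payoffs(endowments, contributions, multiplier=1.6):
    """Payoffs in a linear public goods game.

    Each player keeps what they did not contribute and receives an
    equal share of the multiplied common fund.
    """
    if len(endowments) != len(contributions):
        raise ValueError("one contribution per player is required")
    fund = multiplier * sum(contributions)
    share = fund / len(endowments)
    return [e - c + share for e, c in zip(endowments, contributions)]

def relative_effort(endowments, contributions):
    """Contribution expressed as a fraction of each player's endowment."""
    return [c / e for e, c in zip(endowments, contributions)]

# Toy round (hypothetical): two poorer and two richer players.
endowments = [20, 20, 60, 60]
contributions = [10, 10, 15, 15]

payoffs = public_goods_payoffs(endowments, contributions, multiplier=2.0)
effort = relative_effort(endowments, contributions)
print(payoffs, effort)
```

In this toy round the richer players contribute more in absolute terms but only a quarter of their endowment, whereas the poorer players give up half of theirs.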
15:10 - 16:00 Daniela Paolotti - It takes a village - how collaborations in data science for social good can make a difference
The unprecedented opportunities provided by data science in all areas of human knowledge become even more evident when applied to the fields of social innovation, international development and humanitarian aid. Using social media data to study malnutrition and obesity in children in developing countries, using mobile phone digital traces to understand women's mobility for safety and security, harvesting search engine queries to study suicide among young people in India: these are only a few examples of how data science can be used to address social problems and to support global agencies and policymakers in implementing better and more impactful policies and interventions. Nevertheless, data scientists alone cannot be successful in this complex effort. Greater access to data, more collaboration between public and private sector entities, and an increased ability to analyze datasets are needed to tackle society’s greatest challenges. In this talk, we will cover examples of how actors from different entities can join forces around data and knowledge to create public value with an impact on global societal issues and set the path to accelerate the harnessing of data science for social good.
16:00 - 16:30 Coffee break
16:30 - 16:50 Emanuele Massaro - Resilience management during large-scale epidemic outbreaks
Assessing and managing the impact of large-scale epidemics by considering only the individual risk and severity of the disease is exceedingly difficult and could be extremely expensive. Economic consequences, infrastructure and service disruption, and recovery speed are just a few of the many dimensions along which to quantify the effect of an epidemic on society’s fabric. Here, we extend the concept of resilience to characterize epidemics in structured populations by defining a system-wide critical functionality that combines an individual’s risk of getting the disease (disease attack rate) and the disruption to the system’s functionality (human mobility deterioration). By studying both conceptual and data-driven models, we show that the integrated consideration of individual risks and societal disruptions under a resilience assessment framework provides an insightful picture of how an epidemic might impact society. In particular, containment interventions intended for a straightforward reduction of the risk may have a net negative impact on the system by slowing down the recovery of basic societal functions. The presented study operationalizes the resilience framework, providing a more nuanced and comprehensive approach for optimizing containment schemes and mitigation policies in the case of epidemic outbreaks.
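The idea of combining individual risk and mobility disruption into a single resilience score can be sketched as below. The abstract does not give the formula, so the product form of the critical functionality, the normalised-area definition of resilience, and both toy scenarios are assumptions made purely for illustration.

```python
def critical_functionality(infected_fraction, mobility_fraction):
    """Assumed form: share of healthy individuals times share of retained mobility."""
    return [(1.0 - i) * m for i, m in zip(infected_fraction, mobility_fraction)]

def resilience(cf, dt=1.0):
    """Normalised area under the critical functionality curve (trapezoidal rule).

    A value of 1.0 means functionality was never degraded over the window.
    """
    if len(cf) < 2:
        raise ValueError("need at least two time points")
    area = sum((a + b) / 2.0 * dt for a, b in zip(cf, cf[1:]))
    return area / (dt * (len(cf) - 1))

# Scenario 1 (hypothetical): no travel restrictions, larger outbreak.
free = critical_functionality(
    infected_fraction=[0.0, 0.2, 0.1, 0.0],
    mobility_fraction=[1.0, 1.0, 1.0, 1.0],
)
# Scenario 2 (hypothetical): restrictions halve infections but cut mobility.
restricted = critical_functionality(
    infected_fraction=[0.0, 0.1, 0.05, 0.0],
    mobility_fraction=[1.0, 0.7, 0.8, 1.0],
)

r_free = resilience(free)
r_restricted = resilience(restricted)
print(round(r_free, 3), round(r_restricted, 3))
```

With these made-up numbers the restricted scenario has fewer infections yet a lower score, which mirrors the abstract's point that containment can carry a net cost once societal function is counted.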
16:50 - 17:10 Martin Minnoni - Using Machine Learning to Uncover the Spread of Chagas Disease in Latin America
We use mobile phone records for the analysis of mobility patterns and the detection of possible risk zones of Chagas disease infections in two Latin American countries. We show that geolocalized call records are rich in social and individual information, which can be used to infer whether an individual has lived in an endemic area. We present two case studies using data provided by mobile phone companies from each country. The risk maps that we generate can be used by public health campaign managers to target specific areas and allocate resources more effectively. Finally, we show the value of mobile phone records to predict long-term migrations, which play a crucial role in the geographical spread of Chagas disease.
17:10 - 17:25 Closing remarks