Estimating social relation from trajectories

This study focuses on social pedestrian groups in public spaces and attempts to identify the social relation between group members. We particularly consider dyads in a coalitional or mating relation. We derive several observables from individual and group trajectories that are suggested to be distinctive for these two sorts of relations, and propose a recognition algorithm that takes these observables as features and yields a probabilistic estimate of the social relation at every sampling step. On average, we detect coalitional relation with 87% and mating relation with 81% accuracy. To the best of our knowledge, this is the first study to infer social relation from joint (loco)motion patterns, and we consider the detection rates satisfactory given the inherent challenge of the problem.


Introduction
A crowd has a heterogeneous structure, i.e. it may consist of various components (e.g., individuals, groups, commuters, shoppers) with distinct dynamics. Although autonomous agents (e.g., wheelchairs or robots) have recently started to take part in public spaces, in this study we restrict ourselves to crowds consisting only of autonomously walking pedestrians. In that respect, we assume the basic building blocks of the crowd to be (i) individuals and (ii) pedestrian groups.
In the field of pedestrian movement and evacuation dynamics, the locomotion of individuals has been studied for a long time. However, the motion of pedestrian groups has started attracting attention only recently, even though they are an important component of crowd dynamics [1,13].
The importance of pedestrian groups is due to their specific dynamics, which distinguishes them from a mere collection of (unrelated) individuals. Moreover, depending on the public space (and thus the context or the scenario), groups may constitute up to more than half of the crowd [1], which increases their significance.
Profiling of groups is important for understanding and analyzing the state of the crowd, for instance for detecting stability, collectiveness or conflict [2,3]. Moreover, automatic resolution of social relation may eliminate or reduce the need for human labeling in the collection of research or application oriented datasets, and can be used by assistive robots providing services to pedestrians (by allowing recognition of targets for such services, e.g. automatically recognizing potential customers such as families).

Background
In order to understand the dynamics of pedestrian groups, we need to take a closer look into their composition. Namely, several intrinsic features of the group, such as purpose, age, or gender are shown to play a crucial role in their locomotion [4]. In this study, among those intrinsic features, we choose to focus on social relation between the peers.
According to McPhail and Wohlstein, pedestrian groups are people who are engaged in a social relation to one or more pedestrians and who move together toward a common goal [5]. Due to the diversity of social situations, there is no consensus on a universal, concrete and exhaustive list of social relations. Nevertheless, several categorizations of fundamental forms of social relation have been proposed [6,7,8,9]. While these social science studies define relations using concepts such as benefit of exchange or social domain, in more application oriented fields (e.g. simulation, robotics), the concept of social relation is defined and interpreted in direct relation to (i) the empirically available data and (ii) the specific purpose of utilizing the relation information.
Namely, in such domains, images or videos constitute the data used in the identification of social relation. For this reason, the data is often subject to computer vision analysis. Applications include resolution of kinship relation [10], recognition of domain related roles, such as birthday child and guests, understanding of hierarchical relations between leader and subordinate, and interpretation of different social circles such as bikers, hippies, clubbers, etc. [11].
In this study, we choose to focus on social relations which commonly occur among pedestrians in a public environment, trying to use an approach that may fit both to theoretical (social or natural science) studies and to practical applications (automatic recognition for crowd analysis, simulation, robotic applications).
Considering the above points collectively, we regard the approach of Bugental as the most convenient model for categorizing social relations [9]. Namely, Bugental proposes a domain-based approach and divides social life into five non-overlapping domains defined as: attachment, hierarchical power, mating, reciprocity and coalitional [9]. When applied to walking pedestrian group classification, the reciprocal domain corresponds to friends, the attachment one to families, the mating one to couples and the coalitional one to colleagues (the hierarchical domain corresponds to a situation that does not fully apply to moving pedestrians in a public space, such as presenter-audience relationship in a seminar room or teacher-pupil relationship in a classroom).
Furthermore, in this study, as a first step toward identifying social relation from locomotion, we restrict ourselves to two kinds of social relations: mating and coalitional. We examine dyadic groups, which are in one of those relations, and propose an algorithm to distinguish them using a set of observables derived from 3D range data originating from our previous works [4] and [12]. In the future, we aim to expand our scope to the reciprocal and attachment relations.

Dataset
The dataset used in this study was introduced in [13] and is freely available at [14]. In what follows, for the integrity of the manuscript, we briefly provide relevant information on the dataset but refer the interested reader to [13] and [4] for a thorough discussion. The dataset was recorded in an indoor public space covering an area of approximately 900 m² over a one-year time window. The public space is the ground floor of a business center, which is connected to a train station, a ferry terminal and a shopping center. Therefore, it is populated with pedestrians coming from diverse backgrounds.
Over the course of the data collection campaign, range information was registered for over 800 hours using 3D depth sensors. Using the algorithm of [15], pedestrians are automatically tracked and their position (on the 2D floor plane) and height are computed, all of which can be downloaded freely from [14]. As a result of the tracking process, the cumulative density map of the environment is found as in Fig. 1(a).
In addition to the range information, we also collected video footage for labeling purposes. Based on the video, three coders (non-technical staff members of one of our institutions) label the dataset with respect to several intrinsic group features. One specific feature refers to the apparent relation, where the possible options are friends, family, couples and colleagues, which correspond to the domains of reciprocal, attachment, mating and coalitional, respectively [9].
Identification of social relation from videos is obviously difficult and subjective. To confirm the inter-rater reliability of the labeling process, we use the following procedure, whose details are reported in [4].
To reduce the workload on coders, we asked two of them to label only part of the dataset, while the third coder labeled the entire dataset. Inter-rater reliability is evaluated based on the data analyzed by all coders. Using several prominent statistical measures such as Cohen's κ, Fleiss' κ and Krippendorff's α, the coders are found to be in considerable agreement [16,17,18,4], and thus we decided to use the labels of the third coder as the basis of our study [4]. As a result, the relation between dyads is distributed as follows: 358 coalitional, 96 mating, 216 attachment and 318 reciprocal.
In this preliminary study, we try to differentiate between coalitional and mating relations (i.e. "colleagues" and "couples"). This choice is based on the observations presented in [4], which suggest that coalitional and mating relations present the most distinct features among all combinations of relation pairs. Namely, [4] illustrates that work-oriented dyads move with a significantly larger velocity than non-work-oriented dyads (i.e. those in mating, attachment or reciprocity relations). In addition, [4] shows that the expected distance between the peers varies over a range of values across relations, where coalitional relation is associated with the largest expected value and mating relation with the smallest. In other words, among all relations, on average, colleagues move with the largest distance between peers, whereas couples move with the smallest.
From these dyads in coalitional or mating relation, we require a minimum observation duration of 15 s (this threshold was introduced to ensure that the tracking and interaction times were sufficiently stable, and it is based on considerations relating to the nature of the environment, e.g. the length of the corridor), which we regard as sufficiently long to speculate on the social relation. Therefore, we initially consider the 358 dyads in coalitional relation and 96 dyads in mating relation. In addition, from the trajectories of these dyads, we eliminate the portions with unexpected or irregular behavior (such as stopping and waiting, or meeting, splitting etc.) using criteria similar to [13]. Specifically, we require a minimum average group velocity of 0.5 m/s and a maximum interpersonal distance of 2 m. After eliminating the portions of trajectories that are not in line with our requirements, we derive and contrast several observables from the remaining portions. The definitions of the observables and the cumulative empirical observations are presented in Section 4.
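The selection criteria above can be sketched as follows. This is a minimal illustration rather than the procedure of [13]: the function names, the sample layout and the sampling period `dt` are hypothetical, the velocity criterion is applied per sampling step (the text speaks of an average group velocity without stating the averaging window), and the duration check is applied before filtering, which is also an assumption.

```python
from math import hypot

MIN_DURATION_S = 15.0   # minimum observation duration (from the text)
MIN_GROUP_SPEED = 0.5   # m/s, minimum group velocity (from the text)
MAX_DISTANCE = 2.0      # m, maximum interpersonal distance (from the text)


def keep_sample(pos_i, pos_j, vel_i, vel_j):
    """Return True if one sampling step satisfies both motion criteria."""
    distance = hypot(pos_i[0] - pos_j[0], pos_i[1] - pos_j[1])
    # Group velocity taken as the mean of the two velocity vectors (assumption).
    gvx = (vel_i[0] + vel_j[0]) / 2.0
    gvy = (vel_i[1] + vel_j[1]) / 2.0
    return hypot(gvx, gvy) >= MIN_GROUP_SPEED and distance <= MAX_DISTANCE


def filter_dyad(samples, dt):
    """Filter one dyad.

    samples: list of (pos_i, pos_j, vel_i, vel_j) tuples, one per sampling step.
    dt: sampling period in seconds (hypothetical parameter).
    Returns the retained samples, or [] if the dyad is observed for under 15 s.
    """
    if len(samples) * dt < MIN_DURATION_S:
        return []
    return [s for s in samples if keep_sample(*s)]
```

With this layout, a dyad that stops for a few steps keeps its walking portion and loses only the stationary samples.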

Observables and Empirical Distributions
In examining the joint behavior, we focus on the following observables: interpersonal distance, group velocity, velocity difference and height difference of the peers. In what follows, we provide the definitions of these observables on a sample pair $(i, j)$ depicted in Fig. 1(b).

Definition of Observables
The dataset introduced in [4] is based on the tracking system of [15], which uses laser range sensors to track pedestrian position in 3D (i.e., including pedestrian height). Based on our previous works [4,14], we decided to use the following observables ($x$ and $y$ are defined, respectively, as the direction of motion of the group, determined by the average velocity, and the "abreast direction" orthogonal to the velocity). Namely, the observables depicted in Fig. 1(b) are the interpersonal distance $\delta_{ij}$, the group velocity $v_g$, the velocity difference $\omega_{ij}$ and the height difference $\Delta\eta_{ij}$ of the pair $(i, j)$. Henceforth, we drop the indices $i$ and $j$ for simplicity of notation.
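One plausible way to compute such observables from a single sampling step is sketched below. The formulas are assumptions made for illustration: the group velocity is taken as the magnitude of the mean velocity vector, the velocity difference as the absolute difference of the two speeds, and the height difference as the absolute difference of heights. The function name and argument layout are hypothetical.

```python
from math import hypot


def observables(pos_i, pos_j, vel_i, vel_j, height_i, height_j):
    """Compute illustrative dyad observables for one sampling step.

    Positions are (x, y) in metres, velocities (vx, vy) in m/s, heights in metres.
    All definitions here are assumptions, not the paper's explicit formulas.
    """
    dx = pos_i[0] - pos_j[0]
    dy = pos_i[1] - pos_j[1]
    delta = hypot(dx, dy)                       # interpersonal distance

    gvx = (vel_i[0] + vel_j[0]) / 2.0
    gvy = (vel_i[1] + vel_j[1]) / 2.0
    v_g = hypot(gvx, gvy)                       # group velocity (magnitude)

    omega = abs(hypot(*vel_i) - hypot(*vel_j))  # difference of speeds
    delta_eta = abs(height_i - height_j)        # height difference

    # Projections of the distance on the group frame: x along the motion
    # direction (group depth), y orthogonal to it (abreast distance).
    if v_g > 0.0:
        ux, uy = gvx / v_g, gvy / v_g
        delta_x = abs(dx * ux + dy * uy)
        delta_y = abs(-dx * uy + dy * ux)
    else:
        delta_x = delta_y = None                # motion direction undefined

    return {"delta": delta, "v_g": v_g, "omega": omega,
            "delta_eta": delta_eta, "delta_x": delta_x, "delta_y": delta_y}
```

For two pedestrians walking side by side, `delta_x` is close to zero and `delta_y` is close to `delta`.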

Empirical Distributions of the Observables
The cumulative distribution of interpersonal distance $\delta$ for the entire set of dyads in coalitional and mating relation is presented in Fig. 2(a). It is clear that mating dyads stay in closer proximity than coalitional dyads and that their behavior is more "stable" (i.e. less spread). In other words, the values for mating dyads are distributed around a smaller mean and with a lower deviation.
We also took a closer look at the projections of interpersonal distance along and perpendicular to the motion direction [4]. Namely, we denote the projections of $\delta$ on the $x$-axis and $y$-axis by $\delta_x$ and $\delta_y$, respectively (see Fig. 1(b)). Here, $\delta_x$ corresponds to the depth of the group, whereas $\delta_y$ corresponds to the abreast distance of the peers. Between coalitional and mating dyads, group depth $\delta_x$ is found to have no significant difference, while abreast distance $\delta_y$ and $\delta$ are found to present a similar degree of statistical difference [4]. Thus, also taking computational economy into consideration, we decided to consider only the absolute distance $\delta$ and to ignore its components in this preliminary work.
As presented in Fig. 2(b), the group velocity $v_g$ of the mating dyads is on average lower than that of the coalitional dyads. Moreover, although the distinction is less clear than that of group velocity, the absolute difference of velocities $\omega$ is also found to differ between the two social relations, as shown in Fig. 3(a) (and confirmed by an ANOVA, as reported below and treated in more detail in [4]), due to the lower maximum and heavier tail.
The last observable of interest, the height difference of the peers, $\Delta\eta$, depicted in Fig. 3(b), does not depend on the motion of the peers but rather, in an indirect way, on their gender. Namely, mating relationship often refers to a heterosexual pair, whereas it is not uncommon for coalitional groups to be composed of same gender peers. In this respect, height difference turns out to be a discriminating feature, since it is higher for mating relation than for coalitional relation (although its effectiveness may depend on cultural factors, i.e. ratios of same gender couples or mixed gender coalitional dyads).
Here, we would like to point out one advantage of using height difference instead of height. Height of individuals varies strongly between societies, while sexual dimorphism (i.e., the tendency of males to be taller) is reported in all human societies [19]. Therefore, using $\Delta\eta$ instead of $\eta$ makes the method more flexible and generalizable over different societies.
In addition to these subjective evaluations, we carry out an ANOVA (following the analysis performed in [4]) to confirm the inferences mentioned above. All observables, $\delta$, $v_g$, $\omega$ and $\Delta\eta$, are found to have a p-value smaller than $10^{-4}$. Adopting the widely accepted threshold value of 0.05 for statistical significance [20], we can say that there exists a considerable distinction between coalitional and mating relation in terms of all observables.
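For reference, the F statistic underlying a one-way ANOVA can be sketched in a few lines. This is a generic textbook computation, not the analysis pipeline of [4]; the function name `anova_f` is hypothetical and each group is assumed to be a plain list of observable values (e.g. one list for coalitional dyads and one for mating dyads).

```python
from statistics import fmean


def anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of samples.

    groups: list of lists of floats, one list per relation (assumed layout).
    """
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [fmean(g) for g in groups]

    # Variation explained by group membership.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Residual variation inside each group.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)
```

Converting F into a p-value requires the F distribution; in practice a library routine such as `scipy.stats.f_oneway` returns both the statistic and the p-value.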

Recognition of Social Relation
In this section, we describe our method for discriminating coalitional and mating relations using the observables introduced in Section 4. Specifically, we take a Bayesian standpoint similar to [21] and compute
$$P_t(r \mid \Sigma) = \frac{P(\Sigma \mid r)\, P_t(r)}{P(\Sigma)}. \qquad (1)$$
Here, $P_t(r \mid \Sigma)$ is the posterior probability that the dyad comes from relation $r$ given the observation set $\Sigma$. In addition, $P(\Sigma \mid r)$ is the likelihood term and $P_t(r)$ is the prior probability of social relation.
While computing the likelihood, we assume that the four kinds of observables $\Sigma(t) = \{\delta(t), v_g(t), \omega, \Delta\eta\}$ are independent. This assumption enables expressing the likelihood term using the following product,
$$P(\Sigma \mid r) = P(\delta \mid r)\, P(v_g \mid r)\, P(\omega \mid r)\, P(\Delta\eta \mid r). \qquad (2)$$
For each conditional probability in Eq. 2, we use the empirical distributions. Namely, we shuffle the dataset and randomly select a subset of the pairs to build the probability density functions.
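The likelihood computation can be sketched with normalized histograms standing in for the empirical distributions. This is a minimal illustration under stated assumptions: the bin ranges and bin counts are hypothetical, the helper names are invented for this sketch, and a small floor value is introduced (it is not part of the described method) so that an empty bin does not force the product to zero.

```python
def build_pdf(values, lo, hi, n_bins):
    """Build a normalized histogram (density per bin) over [lo, hi)."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        if lo <= v < hi:
            idx = min(int((v - lo) / width), n_bins - 1)
            counts[idx] += 1
    total = sum(counts)
    return [c / (total * width) for c in counts]


def pdf_value(pdf, lo, hi, v, floor=1e-6):
    """Look up the density at v; return `floor` outside the range or in empty bins."""
    if not lo <= v < hi:
        return floor
    n_bins = len(pdf)
    idx = min(int((v - lo) / (hi - lo) * n_bins), n_bins - 1)
    return max(pdf[idx], floor)


def likelihood(obs, pdfs):
    """Product of per-observable densities under the independence assumption.

    obs: dict mapping observable name to its value.
    pdfs: dict mapping observable name to (pdf, lo, hi) for one relation.
    """
    result = 1.0
    for name, value in obs.items():
        pdf, lo, hi = pdfs[name]
        result *= pdf_value(pdf, lo, hi, value)
    return result
```

One set of `pdfs` would be built per relation from the training pairs, and `likelihood` evaluated once per relation at each sampling step.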
As the initial value of our prior belief, $P_0(r)$, we adopt an equal probability to avoid any bias. Thus,
$$P_0(r) = \frac{1}{2}, \qquad (3)$$
since we have two possible cases for social relation.
As time elapses, we propose updating (or not) the prior as in Eq. 4, where the parameter $\alpha$ defines the rate of update,
$$P_t(r) = \alpha P_0(r) + (1 - \alpha) P_{t-1}(r \mid \Sigma). \qquad (4)$$
Regarding the update, we contrast three cases as follows:
(i) Update priors to the last computed value (i.e. the posterior) at every step.
(ii) Update priors using a linear combination of the initial value and the last computed probability value.
(iii) No update on the priors.
The three cases described above can be realized using $\alpha = \{0, 0.5, 1\}$, respectively.
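The three update cases can be sketched with a single function. The sketch assumes that the rate parameter, written `alpha` here, weights the initial prior and that its complement weights the last posterior, which is the form consistent with the values 0, 0.5 and 1 mapping to cases (i), (ii) and (iii). The function name and the dictionary layout are hypothetical.

```python
def update_prior(prior0, last_posterior, alpha):
    """Blend the initial prior with the last posterior.

    prior0, last_posterior: dicts mapping relation label to probability.
    alpha = 0   -> prior becomes the last posterior (case i)
    alpha = 0.5 -> equal mix of initial prior and last posterior (case ii)
    alpha = 1   -> prior stays at its initial value (case iii)
    """
    return {r: alpha * prior0[r] + (1.0 - alpha) * last_posterior[r]
            for r in prior0}
```

Because both inputs sum to 1 and the weights sum to 1, the blended prior also sums to 1 for any `alpha` between 0 and 1.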
The term $P(\Sigma)$, which is called the marginal likelihood, does not need to be computed explicitly. Specifically, we make use of the fact that a particular pair comes from either a coalitional or a mating relation, and thus the posterior probabilities, which are scaled by the same term in Eq. 1, must sum to 1.
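The normalization argument can be sketched as follows: the unnormalized products of likelihood and prior are computed for both relations and then divided by their sum, which plays the role of the marginal likelihood. The function name and the relation labels "C" and "M" are hypothetical, and the fallback to the prior when both products are zero is an assumption added for numerical safety.

```python
def posterior(likelihoods, priors):
    """Return normalized posteriors without computing the marginal explicitly.

    likelihoods, priors: dicts mapping relation label to a non-negative float.
    """
    unnormalized = {r: likelihoods[r] * priors[r] for r in priors}
    total = sum(unnormalized.values())
    if total == 0.0:
        # Both products vanish: keep the prior (assumption, not from the text).
        return dict(priors)
    return {r: u / total for r, u in unnormalized.items()}
```

For instance, with equal priors the posterior reduces to the likelihoods rescaled so that they sum to 1.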

Results
In practice, we randomly choose 30% of the pairs and use their trajectories to build the probability density functions in Eq. 2. The remaining 70% are used to test the ability of our estimation method to recognize the relation of dyads outside the training set (of course, the general applicability of the method should be tested in the future in different environments and cultural settings). Moreover, repeating this validation procedure 20 times, we compute the mean and standard deviation of the performance values to investigate the sensitivity (i.e. dependence) of the observables on the training set. By randomly picking 30% of the entire sample set and repeating this procedure 20 times, the probability that a particular sample is never used for training falls below $10^{-3}$.
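The validation scheme and the coverage claim can be sketched as below. The helper names are hypothetical, and the coverage estimate assumes that the 20 random selections are independent, so that a given sample is left out of every training set with probability 0.7 raised to the power 20.

```python
import random


def split_pairs(pair_ids, train_fraction=0.3, rng=None):
    """Shuffle pair identifiers and split them into training and test sets."""
    if rng is None:
        rng = random.Random()
    ids = list(pair_ids)
    rng.shuffle(ids)
    n_train = round(train_fraction * len(ids))
    return ids[:n_train], ids[n_train:]


def prob_never_in_training(train_fraction=0.3, repetitions=20):
    """Probability that one sample is left out of every training set,
    assuming independent random selections."""
    return (1.0 - train_fraction) ** repetitions


if __name__ == "__main__":
    # About 8e-4, i.e. below the 1e-3 bound quoted in the text.
    print(prob_never_in_training())
```

Passing a seeded `random.Random` instance as `rng` makes the 20 repetitions reproducible.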
From the recognition rates presented in Table 1, it is observed that coalitional relation is recognized with a somewhat higher rate for all values of $\alpha$, which could be due to the imbalance of samples in the dataset as given in Section 3 (and possibly to the nature of the observable pdfs, e.g. their standard deviations). Moreover, taking a fixed and unbiased prior performs slightly better than applying an update. In addition, the effect of random shuffling is found to be minute, which suggests that the observables are stable across samples and that the method is resilient to changes in the training set. All in all, the proposed method achieves considerable accuracy given the challenge of the problem.

Conclusions
This study describes a method to identify the social relation between members of a pedestrian group. We particularly focus on dyadic groups belonging to a coalitional or mating relation. Several observables are derived from individual and group trajectories, in addition to the height difference of the peers. A recognition algorithm, which uses these data as features, is proposed. We run it over the entire trajectory, updating the estimation of social relation in a probabilistic manner at every sampling step, and compute the recognition rates. On average, coalitional relation is detected with 87% and mating relation with 81% accuracy, when the prior is not updated. We believe this is the first study to recognize social relation from trajectory and height information.