A Markov-chain activity-based model for pedestrians in office buildings

As the number of people working in office buildings increases, there is an urgent need to improve building services, such as lighting and temperature control, within these buildings to increase energy efficiency and the well-being of occupants. A pedestrian behaviour model that simulates office occupants’ movements and locations can provide the high spatial and temporal resolution data required for the testing, evaluation, and optimization of these control systems. However, since most studies in pedestrian research focus on modelling specific actions at the operational level or target situations where movement schedules do not have to be modelled, a pedestrian behaviour model that can simulate complex situations over long time periods is missing. Therefore, this paper proposes a tactical level model to generate occupant movement patterns in office buildings. The Markov-chain activity-based model proposed here is data parsimonious, flexible in accepting different levels of information, and can produce high resolution output. The mathematical properties of the methodology are analyzed to understand their impact on the final results. Finally, the tactical level pedestrian behaviour model is face validated using a case study of an imaginary office with a simple layout.


Introduction
Around the world, as more people work in office buildings, these buildings consume a large proportion of resources for building services, such as lighting and temperature control. Therefore, these services need to be researched and evaluated to optimize their performance, for which occupant behaviour data, such as the spatio-temporal distribution of occupants within the building, is required. A pedestrian behaviour model that simulates the movements of office occupants can generate this data for various situations without the expensive deployment of test sensors in the real world and with minimal privacy concerns. However, most pedestrian studies focus on the operational behaviour level [1], and those that do model a broader picture either assume a movement schedule or model situations where the sequence of movements is already known. To obtain pedestrian locations over long periods of time in complex situations, the tactical behaviour level, where pedestrian itineraries are constructed, must be modelled alongside the operational level. Hence, the next step in pedestrian research is to develop a model that integrates the different behaviour levels [2]. For this, the tactical behaviour level, which schedules movements, deserves greater attention than it has received.
The few approaches in the literature for scheduling pedestrian movements can be divided into (i) activity-based, (ii) location-based, and (iii) random-access models. In the first category, movements take place because pedestrians have to perform activities at different locations. Activity episodes are generated and scheduled for pedestrians. A location choice model is used to assign each activity episode in the itineraries to a location to form movement schedules. Tabak [3] presented the most detailed model of pedestrian behaviour in offices but, like other models in this category, it has been critiqued for its complexity and extensive data requirements. In the next two categories, movement occurs for movement's sake and pedestrians are directly assigned a sequence of locations to visit, making these models less intuitive than activity-based ones. Location-based models generate individual movements using stochastic models with the aim of reproducing location-based statistics, such as occupancy or transition sequences. A common method in location-based models is to divide the area modelled into zones and use them as states of Markov-chains that are assigned to every pedestrian. The Markov-chains are then simulated in discrete time steps and a movement is generated whenever a pedestrian transitions to another zone [4,5]. These models are much less complex and data intensive than the previous category. However, unlike activity-based models, the running time and complexity of location-based models depend on the resolution of the zones and the size of the area modelled. Finally, random-access models use random walks that do not consider locations or activities but depend on the overall spatial configuration to simulate movements [6]. However, they are generally not applicable to situations that are process driven or buildings that are highly programmed and are, therefore, not useful for scheduling movements in office buildings.
This paper proposes a tactical level model of pedestrian behaviour in office buildings that is data parsimonious, flexible in accepting different amounts of information, and can produce high resolution output, thus combining the advantages of activity- and location-based approaches whilst overcoming their limitations. The focus here is on the scheduling of movements, while other decisions at this level, such as route and activity location choice, are simplified with the assumptions that office occupants are familiar with the building and indifferent to activity locations, which is reasonable given the types of activities to be performed (e.g. getting coffee) [7]. Furthermore, unlike in previous studies [5,8,9], the mathematical properties of the stochastic process used in the model are analysed to understand their effect on the final results. The methodology and its analysis are presented in section 2. In order to face validate the model, a simple case study of an imaginary office is carried out in section 3, followed by the conclusions in section 4, which note model limitations and relevant future work.

Methodology
To combine the advantages of the activity- and location-based approaches, the Markov-chain location-based model (henceforth, the Wang model) proposed in [5] is adopted in an activity-based framework. The Wang model is similar to many models in its category in that it divides the building into zones which are used as states in the Markov-chains assigned to each occupant, and these Markov-chains are simulated to generate movements. However, recognizing that defining each transition matrix element in every agent's Markov-chain is data-intensive, the authors of the Wang model propose a novel method that derives these probabilities from profiles consisting of only two variable sets per agent, thus considerably reducing the number of inputs required to model multi-zone, multi-occupant scenarios. Other studies [8,9] have used the Wang model to develop occupancy simulation software because of its simplicity and data parsimony.
However, certain limitations prevent the Wang model from being used to simulate pedestrian movements. Being a location-based model, it is not robust against building size and cannot use high spatial resolutions. Furthermore, all movements between zones are assumed to take place in a single time step, which forces a coarse temporal resolution to keep the assumption feasible (a 5-minute time step is used in [5]). These limitations are resolved by adopting the Wang model in an activity-based framework (section 2.2) and adding complementary models (section 2.4). Additionally, their novel method of generating transition matrices is improved by linearizing the optimization problem (section 2.2) and analysing its impact on the final results through simulation (section 2.3).

Activity Classification
Since activities have different properties, they cannot all be generated and scheduled with the same methodology. Thus, to make the activity-based framework possible, a classification of activities is created such that each category contains activities with similar properties whilst all categories together allow representing the full range of activities occurring in various situations [3,10]. These categories are further grouped according to their association with a time-of-day (time-dependent or time-independent) and these groups each have their own scheduler.
Time-dependent activities include planned and time-window activities, which, respectively, have a fixed starting time or a fixed time window within which they may be performed. Planned activities (e.g., meetings) have priority over all other activity types, whereas time-window activities (e.g., arrivals, lunch) have a higher priority than time-independent activities but may be skipped if they cannot be executed in their time window due to planned activities. These activities are modelled using an Event Scheduler. Time-independent activities include recurrent and continuous activities. Recurrent activities are related to occupants' physiological processes [3] (e.g., taking breaks, getting a drink, visiting the restroom) which may be assumed to have a regular recurrence time. It is assumed that these activities are not undertaken in the middle of time-dependent activities, such as meetings. Continuous activities (e.g., working at one's desk), which have the lowest priority, are performed when no other activity is being performed. They do not have a fixed starting time, duration, or period and are the default activity to which occupants return when they are unable to execute other activities. The Markov-chain Scheduler, which is a modification of the Wang model, is used to simulate time-independent activities.
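The classification above can be sketched as a small data structure. This is only an illustrative sketch: the names `Category`, `Activity`, and the scheduler labels are hypothetical, and the numeric priority values are an assumption that merely encodes the ordering described in the text (planned, then time-window, then recurrent, then continuous).

```python
from dataclasses import dataclass
from enum import IntEnum


class Category(IntEnum):
    """Activity categories; a larger value means a higher priority (assumed encoding)."""
    CONTINUOUS = 0   # e.g. working at one's desk (default activity)
    RECURRENT = 1    # e.g. getting a drink, visiting the restroom
    TIME_WINDOW = 2  # e.g. arrival, lunch (may be skipped)
    PLANNED = 3      # e.g. meetings (fixed starting time)


@dataclass(frozen=True)
class Activity:
    name: str
    category: Category

    @property
    def time_dependent(self) -> bool:
        # Planned and time-window activities are tied to a time of day.
        return self.category in (Category.PLANNED, Category.TIME_WINDOW)

    @property
    def scheduler(self) -> str:
        # Time-dependent activities go to the event scheduler,
        # time-independent ones to the Markov-chain scheduler.
        return "event" if self.time_dependent else "markov_chain"


activities = [
    Activity("desk work", Category.CONTINUOUS),
    Activity("meeting", Category.PLANNED),
    Activity("coffee", Category.RECURRENT),
    Activity("lunch", Category.TIME_WINDOW),
]
# Sort from highest to lowest priority.
by_priority = sorted(activities, key=lambda a: a.category, reverse=True)
print([(a.name, a.scheduler) for a in by_priority])
```

Encoding the priority in the enum value keeps conflict resolution (for example, a meeting overriding lunch) to a single comparison.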

Markov-Chain Scheduler
The methodology used to generate and schedule time-independent activities is based on first-order, discrete-time, finite-space ergodic Markov-chains. To adopt the Wang model in an activity-based framework, time-independent activities, instead of zones, are used as the states of the Markov-chains. Eq. 1 describes the memoryless characteristic of Markov-chains with states S, indicating that the state of the chain, X, at time t depends only on the state in the previous time step. Combining the pij's for all i and j into matrix form creates a transition matrix P, where each element (pij) indicates the probability of going from the row state (i) to the column state (j). Eq. 2 gives an example of an n-state transition matrix.
The model uses two inputs to generate the transition matrix for each occupant. The first input is the average continuous duration of time (also called the expected sojourn time) the agent spends in a state, that is, a time-independent activity. For large values of t (i.e., t → ∞), the expected sojourn time (τi) in a state converges to a limiting value which can be used to derive the diagonal elements of the transition matrix, as shown in Eq. 3. The second input is the proportion of time spent in a state, that is, performing a time-independent activity. For t → ∞, this value (the probability of being in a state, which describes, in essence, the overall percentage of time a Markov-chain spends in a given state) also converges to a limit. This probability distribution over all the states of a Markov-chain is called the stationary probability distribution (π) (Eq. 4). By definition, the stationary probability distribution can be calculated as the left eigenvector of the transition matrix associated with eigenvalue 1, normalized to sum to 1 (Eq. 5). For recurrent activities, such as getting a coffee, it is more convenient to use the average time between two episodes of the activity, that is, the away time of a state (αi). This variable is also observed to tend to a limit which can be used to derive the stationary probability values of those states (Eq. 6).
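For reference, the relations described in this section can be written in their standard textbook forms. This is a sketch consistent with the definitions above, not a reproduction of the numbered equations: it assumes that τi and αi are expressed in time steps and that the away time is measured from the end of one episode to the start of the next.

```latex
\begin{align}
  \Pr(X_{t+1}=j \mid X_t=i, X_{t-1},\dots,X_0)
    &= \Pr(X_{t+1}=j \mid X_t=i) = p_{ij}, \qquad i,j \in S \\
  \sum_{j \in S} p_{ij} &= 1, \qquad p_{ij} \ge 0 \\
  \tau_i = \frac{1}{1-p_{ii}} \quad &\Longrightarrow \quad p_{ii} = 1 - \frac{1}{\tau_i} \\
  \pi P &= \pi, \qquad \sum_{i \in S} \pi_i = 1 \\
  \pi_i &= \frac{\tau_i}{\tau_i + \alpha_i}
\end{align}
```

As a purely hypothetical numerical illustration, a state with τi = 10 time steps has pii = 0.9, and if its away time were αi = 90 time steps, its stationary probability would be πi = 10/(10 + 90) = 0.1.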
For each occupant, a transition matrix is derived by setting up a constrained linear least-squares problem. Linearizing the problem considerably reduces the model's running time. The system of linear equations is derived from the following conditions: (i) the relation between the stationary probability distribution and the transition matrix (Eq. 5); (ii) the fact that the elements of each row of the transition matrix sum to 1 (Eq. 2); (iii) the known values of the diagonal elements of the transition matrix (Eq. 3); and (iv) the constraint that all transition matrix elements are non-negative (Eq. 2). Thus, for an n-state Markov-chain, the system consists of 2n equations, formed by the first two conditions, which are solved for the n(n − 1) variables that represent the off-diagonal elements of the n × n transition matrix (the diagonal elements being known from the third condition). Eq. 7 shows the system of linear equations used for the constrained linear least-squares problem and Eq. 8 shows the setup of the problem.
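The construction of the 2n equations in n(n − 1) unknowns can be sketched in plain Python as follows. The function names `build_system` and `residual`, the row-major ordering of the unknowns, and the 3-state input profile are all assumptions made for illustration; the sketch only assembles and checks the equality system and does not perform the constrained least-squares solve.

```python
def build_system(pi, tau):
    """Return (A, b, index) for the equality system A x = b.

    x holds the n(n-1) off-diagonal entries p_ij (i != j) in row-major
    order; index maps (i, j) to the position of p_ij in x.
    """
    n = len(pi)
    diag = [1.0 - 1.0 / t for t in tau]  # known p_ii from sojourn times
    index = {}
    for i in range(n):
        for j in range(n):
            if i != j:
                index[(i, j)] = len(index)
    A, b = [], []
    # Stationarity, one equation per column j:
    #   sum_{i != j} pi_i * p_ij = pi_j * (1 - p_jj)
    for j in range(n):
        row = [0.0] * len(index)
        for i in range(n):
            if i != j:
                row[index[(i, j)]] = pi[i]
        A.append(row)
        b.append(pi[j] * (1.0 - diag[j]))
    # Row sums, one equation per row i:
    #   sum_{j != i} p_ij = 1 - p_ii
    for i in range(n):
        row = [0.0] * len(index)
        for j in range(n):
            if i != j:
                row[index[(i, j)]] = 1.0
        A.append(row)
        b.append(1.0 - diag[i])
    return A, b, index


def residual(A, b, x):
    """Largest absolute violation of A x = b."""
    return max(abs(sum(a * v for a, v in zip(row, x)) - rhs)
               for row, rhs in zip(A, b))


# Hypothetical 3-state profile: uniform stationary distribution, sojourn time 2.
pi = [1 / 3, 1 / 3, 1 / 3]
tau = [2.0, 2.0, 2.0]
A, b, index = build_system(pi, tau)

# Two different non-negative candidates (order: p01, p02, p10, p12, p20, p21).
x_symmetric = [0.25, 0.25, 0.25, 0.25, 0.25, 0.25]
x_cyclic = [0.5, 0.0, 0.0, 0.5, 0.5, 0.0]
print(len(A), len(A[0]))  # number of equations, number of unknowns
print(residual(A, b, x_symmetric), residual(A, b, x_cyclic))
```

Both candidates satisfy every equation, which illustrates on a small example why the same inputs can admit more than one transition matrix.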
In order to generate the transition matrix, MATLAB's constrained linear least-squares solver lsqlin is used to solve Eq. 8 for x, the array of unknown transition matrix elements. As in [5], the problem posed in Eq. 8 is under-determined and thus has either no solution or an infinite number of solutions. Since a unique solution is not already defined, the linear problem can accept increasing amounts of information through additional linear inequality constraints of the form Ax ≤ b, or through upper and lower bounds for x, in the constrained least-squares problem. This additional information can be in the form of complementary observations, such as the tendency to perform one activity after another. Incorrect solutions are discarded by setting a tolerance value of 10⁻¹⁰ on the objective function value of lsqlin, which should ideally be zero. Furthermore, since this method uses ergodic Markov-chains, a check is made to detect and remove states with zero stationary probability (i.e., activities that are never performed by an occupant) when generating a transition matrix. After the generation, null rows and columns corresponding to those states are added to maintain consistency with other transition matrices.
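The bookkeeping around the solver (dropping never-visited states, checking the objective tolerance, and padding the result back to full size) can be sketched as below. The helper names `active_states`, `pad_matrix`, and `accept` are hypothetical, and the reduced matrix is a hand-written placeholder standing in for the solver output rather than an actual lsqlin result.

```python
def active_states(pi):
    """Indices of states with non-zero stationary probability."""
    return [i for i, p in enumerate(pi) if p > 0.0]


def pad_matrix(reduced, keep, n):
    """Embed a reduced transition matrix into an n x n matrix.

    Rows and columns of removed states are left as zeros (null rows/columns).
    """
    full = [[0.0] * n for _ in range(n)]
    for r, i in enumerate(keep):
        for c, j in enumerate(keep):
            full[i][j] = reduced[r][c]
    return full


def accept(objective_value, tol=1e-10):
    """Accept a solver result only if the least-squares objective is close to zero."""
    return objective_value <= tol


# Hypothetical occupant who never performs activity 1 (pi[1] == 0).
pi = [0.8, 0.0, 0.2]
keep = active_states(pi)

# Placeholder for the solver output on the two remaining states;
# it is consistent with the reduced stationary distribution (0.8, 0.2).
reduced = [[0.95, 0.05],
           [0.20, 0.80]]
full = pad_matrix(reduced, keep, len(pi))
for row in full:
    print(row)
```

Keeping every occupant's matrix at the same n × n size means the downstream simulation can index activities identically for all occupants.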
Once the transition matrix is generated, a simulation of the Markov-chain is carried out by starting from a random initial state (activity) and then choosing the next state at every time step using the transition probabilities from the current state. This simulation returns a sequence of time-independent activities for the occupant. The simulation thus both generates activity episodes with durations and schedules them. The next section analyses how the under-determined nature of the problem impacts the final results.
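A minimal sketch of this simulation step, using only the Python standard library, is given below. The transition matrix, the seed, and the function names `simulate_chain` and `to_episodes` are illustrative assumptions.

```python
import random
from itertools import groupby


def simulate_chain(P, steps, rng):
    """Simulate a Markov chain for `steps` time steps from a random start."""
    n = len(P)
    state = rng.randrange(n)  # random initial activity
    path = [state]
    for _ in range(steps - 1):
        # Draw the next state using the row of the current state as weights.
        state = rng.choices(range(n), weights=P[state])[0]
        path.append(state)
    return path


def to_episodes(path):
    """Collapse a state sequence into (state, duration) episodes."""
    return [(state, len(list(run))) for state, run in groupby(path)]


# Hypothetical 3-activity transition matrix (each row sums to 1).
P = [[0.90, 0.05, 0.05],
     [0.50, 0.50, 0.00],
     [0.25, 0.00, 0.75]]
rng = random.Random(42)
path = simulate_chain(P, steps=500, rng=rng)
episodes = to_episodes(path)
print(episodes[:5])
```

Each episode pairs an activity with the number of consecutive time steps spent in it, so the one pass yields both the durations and the order of activities.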

Transition Matrix Generation Analysis
Since the problem set up in Eq. 8 is under-determined, the same inputs will result in more than one transition matrix (if any). This could mean that if the solver returned a different transition matrix, the end results could be quite different. However, while the problem setup in the previous section is, strictly speaking, only true for expected values as t → ∞, the Markov-chains are simulated for a maximum time period of a day, which is approximately 500 time steps if each time step is a minute (500 minutes ≈ 8 hours ≈ 1 working day). Hence, the finite-time simulations add stochasticity to the results, which could mean that the existence of different transition matrix solutions may not, ultimately (i.e., post-simulation), make a difference. To check this, a three-state situation is considered. Four Markov-chains are obtained from the same inputs and each is simulated 1000 times for 500 time steps. The transition probabilities observed in the simulation (i.e., the simulated transition probabilities) between all the states are recorded for each run and plotted as a histogram with different colours representing the four Markov-chains (Fig. 1). For smaller values of sojourn times (Fig. 1, top), the differences in the transition matrices can be clearly observed in their simulated transition probabilities, but the differences are much smaller when the sojourn times are multiplied by 10 (Fig. 1, bottom). This increase in sojourn times reduces the scale of the distributions of transition probabilities to other states; that is, corresponding to the increase in sojourn times, the mean values of the transition probabilities to the other states are reduced by a factor of 10. As a result, the differences in the transition probabilities of different Markov-chains are less clear.
Thus, for states with a high sojourn time, in absolute terms, the transitions from that state will produce similar results for different transition matrices while those with a low sojourn time will be impacted more strongly by the transition matrix choice.
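The scaling argument can be illustrated with a hypothetical family of 3-state matrices that share the same inputs. In the sketch below, `circulant`, `stationary_error`, and `max_difference` are illustrative names, and the uniform stationary distribution is an assumption chosen because it lets the matrices be written in closed form. It only compares matrix entries and does not reproduce the simulation experiment of Fig. 1.

```python
def circulant(tau, skew):
    """3-state transition matrix with sojourn time tau in every state.

    A fraction `skew` of the leaving probability goes to the next state
    and the remainder to the previous one. Every such matrix is doubly
    stochastic, so its stationary distribution is uniform.
    """
    q = 1.0 / tau  # probability of leaving a state in one time step
    P = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        P[i][i] = 1.0 - q
        P[i][(i + 1) % 3] = q * skew
        P[i][(i - 1) % 3] = q * (1.0 - skew)
    return P


def stationary_error(pi, P):
    """Largest absolute violation of pi P = pi."""
    n = len(pi)
    return max(abs(sum(pi[i] * P[i][j] for i in range(n)) - pi[j])
               for j in range(n))


def max_difference(P, Q):
    """Largest absolute entry-wise difference between two matrices."""
    return max(abs(a - b) for rp, rq in zip(P, Q) for a, b in zip(rp, rq))


pi = [1 / 3] * 3
for tau in (2.0, 20.0):
    P_a, P_b = circulant(tau, 1.0), circulant(tau, 0.5)
    # Both matrices share the same pi and the same diagonal,
    # yet differ in their off-diagonal entries.
    print(tau, stationary_error(pi, P_a), stationary_error(pi, P_b),
          max_difference(P_a, P_b))
```

Multiplying the sojourn time by 10 shrinks the largest gap between the two matrices by the same factor, which is the effect described for the bottom panel of Fig. 1.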

Complementing Schedulers
While the Markov-chain scheduler gives a sequence of time-independent activities performed by an occupant, other schedulers are needed to obtain a complete schedule of an occupant's movements through a day. The Markov-chain scheduler is complemented by three supporting schedulers: (i) the event scheduler, (ii) the movement scheduler, and (iii) the re-scheduler. Each of these schedulers is accompanied by a location choice model that decides where activities will be performed. Fig. 2 shows the schedulers within the general framework of the movement scheduler for pedestrians (occupants) in office buildings.
First, the event scheduler is used to schedule time-dependent (first planned and subsequently time-window) activities that anchor the itinerary. Activities such as meetings or arrivals and departures may be generated by sampling from aggregated observations such as meeting room booking data and occupancy sensor data, or expectations of the same. For planned activities, the event scheduler interacts with a resource handler to check the availability of a location chosen by the location choice model before reserving it. Once the schedule is anchored with time-dependent activities, such as a meeting and a lunch break, gaps in the itinerary have to be filled. Since the Markov-chain scheduler derivation holds strictly true only for a large number of time steps, instead of simulating the Markov-chain for the time period of each gap individually, it is simulated for the total period of all the gaps in the itinerary. This long sequence of time-independent activities is then split up to fill in the gaps. The location choice model is used to assign locations where each activity will be performed. Simultaneously, between each activity, the movement scheduler assigns the time required to move from one location to another, thus developing the planned schedule for a day.
Finally, when executing the planned schedule, dynamic updates are required; for example, when a pedestrian discovers that a chosen location is occupied and has to choose a new location, or has to skip an activity because all locations are occupied, or when a delay in starting time of an activity requires shifting the schedule. Location availability is checked by the re-scheduler by interacting with the resource handler before updating schedules based on feedback from the operational level model to result in the final executed schedule.
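The gap-filling step of the planned schedule (simulating once for the total free time and then cutting the sequence to fit the gaps between anchored events) can be sketched as follows. The function names, the minute-based time axis, and the example day are assumptions; location choice, movement times, and re-scheduling are deliberately left out.

```python
def find_gaps(day_start, day_end, events):
    """Free intervals (start, end) between anchored events, in minutes."""
    gaps, cursor = [], day_start
    for start, end in sorted(events):
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < day_end:
        gaps.append((cursor, day_end))
    return gaps


def fill_gaps(gaps, sequence):
    """Cut one long per-minute activity sequence into pieces, one per gap."""
    total = sum(end - start for start, end in gaps)
    if len(sequence) != total:
        raise ValueError("sequence length must equal total gap length")
    pieces, offset = {}, 0
    for start, end in gaps:
        length = end - start
        pieces[(start, end)] = sequence[offset:offset + length]
        offset += length
    return pieces


# Hypothetical day from 09:00 (minute 540) to 17:00 (minute 1020),
# anchored by a meeting (10:00-11:00) and lunch (12:00-12:30).
events = [(600, 660), (720, 750)]
gaps = find_gaps(540, 1020, events)
total = sum(end - start for start, end in gaps)

# Stand-in for the Markov-chain output: one activity label per minute.
sequence = ["desk"] * 50 + ["break"] * 20 + ["desk"] * (total - 70)
schedule = fill_gaps(gaps, sequence)
print(gaps)
print(schedule[(540, 600)][-10:], schedule[(660, 720)][:10])
```

In this example the 20-minute break straddles the meeting and is cut into two 10-minute episodes, which shows how splitting at gap boundaries can shorten individual episodes.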

Case Study
A case study of an imaginary consultancy office with a simple layout (Fig. 3)  engineers, 12 junior engineers, and 1 receptionist. All occupants with the same role are assigned the same input profile (stationary probabilities or away times, and sojourn times of states). Occupants are assumed to perform the following activities: continuous (being at their desk); recurrent (going to the toilet, getting a drink, taking a break); time-window (arrival, departure, lunch); and planned (meeting). This simple set of activities represents all four categories and is likely to be performed by occupants of most offices.
In order to carry out the first step towards model validation, the following three simulated measures are used: (i) break durations, (ii) time between two episodes (away time) of getting coffee, and (iii) desk occupancy. The first two indicators are compared against the distributions expected from the limiting behaviour of Markov-chain based models while the desk occupancy is judged qualitatively against expected patterns. All measures are calculated as the aggregated output of 10 simulations of the model, that is, 10 days of occupant movements in the imaginary office.
As expected for Markov-chain based models, both distributions show features of exponential distributions. For 10 simulations, the mean break duration is found to be 9.3 minutes while the mean away time for coffee is 84.6 minutes. While the former is close, both values underestimate their inputs: a sojourn time of 10 minutes and an away time of 120 minutes, respectively. A possible reason for the lower simulated values may be the splitting of activities from the Markov-chain agenda while filling the gaps in the event schedule, since splitting activities would reduce their average duration.
Desk occupancy describes the number of office occupants who are at their desk at a given moment. It is an important parameter for lighting and temperature controls in office buildings because these building services are based on the presence of people in certain areas. Fig. 4 shows the maximum, minimum, and average desk occupancy in a day over 10 simulations. Clear patterns of arrivals at 09:00, lunch break near 12:00, and departures from 16:30 onwards can be observed. Moreover, the desk occupancy fluctuates throughout the day, indicating the natural patterns of people moving about in the office. While these fluctuations seem to be random, as one would expect, the individual activities indeed follow a pattern, as indicated by the distributions of the durations of the activities and their away times.
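The desk occupancy measure and its aggregation over simulated days can be sketched as below. The data layout (a mapping from occupant id to one activity label per time step), the label `"desk"`, and the toy schedules are assumptions made for illustration and are not the case-study data.

```python
def desk_occupancy(day, desk_label="desk"):
    """Number of occupants at their desk at each time step of one day.

    `day` maps an occupant id to a list with one activity label per time step.
    """
    steps = len(next(iter(day.values())))
    return [sum(1 for acts in day.values() if acts[t] == desk_label)
            for t in range(steps)]


def aggregate(days):
    """Per-time-step (min, mean, max) desk occupancy over simulated days."""
    profiles = [desk_occupancy(day) for day in days]
    return [(min(col), sum(col) / len(col), max(col))
            for col in zip(*profiles)]


# Two hypothetical simulated days, two occupants, four time steps each.
days = [
    {"a": ["desk", "desk", "break", "desk"],
     "b": ["desk", "coffee", "desk", "desk"]},
    {"a": ["desk", "break", "break", "desk"],
     "b": ["desk", "desk", "desk", "lunch"]},
]
for t, stats in enumerate(aggregate(days)):
    print(t, stats)
```

The same aggregation applied to full-day schedules at one-minute resolution would yield minimum, average, and maximum curves of the kind summarized in Fig. 4.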

Conclusions
While most previous studies have focussed on the operational behaviour level and have not been able to model complex situations for long periods of time, this paper presents a tactical behaviour level model that is able to do so. The model can be used to simulate pedestrian behaviour in office buildings and to optimize building services therein. The tactical behaviour level model uses an activity-based framework which produces the high spatial and temporal resolution movement patterns required for testing and optimizing building services in office buildings, while the Markov-chain methodology keeps the model data parsimonious and flexible in accepting increasing amounts of information. Furthermore, the paper analyses the transition matrix generation methodology more closely than other studies, revealing that using a higher temporal resolution, that is, increasing absolute sojourn times, reduces the effect of its under-determined nature on the final result. Finally, the case study of an imaginary office indicates that the model's results could represent movement patterns in an office building.
Despite the advantages of the model proposed here, it has some limitations which could be paths for future studies. To build connections with several studies on workplace interactions in the field of architecture (e.g. [11]), environment-induced activities, that is, activities induced by the current spatial position of the occupant, must be modelled. For example, an occupant may find themselves in sight of a colleague and decide to have an unplanned discussion. Furthermore, only the first steps towards model validation have been carried out so far. Therefore, future studies could focus on quantitative validation of the model.