Analysis of Built Environment Influence on Pedestrian Route Choice Behavior in Dutch Design Week Using GPS Data

Visitors to the Dutch Design Week (DDW) not only head for specific exhibitions, which are distributed all over the city, but also visit the city itself in between exhibition activities. This mixed environment makes modeling the behavior of DDW visitors more complex than modeling shoppers or tourists alone. This research pays special attention to the influence of the built environment, which includes buildings and transportation infrastructure, on pedestrian route choice. GPS tracking data and socio-demographic information were collected during the event. A multinomial logit model and a path size logit model are used to analyze route choice behavior. The results show that some built environment factors have a significant influence on route choice. Shops are more attractive to older visitors. Females have a stronger preference for shorter routes than males. In a big event, alternative routes that share more links with other routes are more likely to be chosen.


Introduction
The Dutch Design Week (DDW) 2017 attracted around 335,000 visitors from the Netherlands and other countries during one week, with around 610 exhibitions at 110 locations spread over the city of Eindhoven. Because of their travel purpose, visitors normally have specific destinations among the DDW exhibitions. Meanwhile, they might also visit the city for a break, a meal or some sightseeing in between exhibition activities. This mixed environment makes modeling the behavior of DDW visitors more complex than modeling shoppers or tourists alone. Is visitors' behavior influenced by the city's built environment, in addition to the locations of the exhibitions? From this perspective, the question of interest is how the exhibitions and the city's built environment together influence visitors' route choice. Besides the OD (origin and destination) and trip characteristics, the shops, transportation facilities and other built environment elements along the route may also lead visitors to change their route and destination. In this paper, we pay special attention to route choice between different exhibitions. Knowing how the built environment influences pedestrian behavior can help the event organizer distribute the exhibitions and facilities more efficiently, so that visitors can visit more exhibitions and have a better experience.
For pedestrian behavior, researchers have traditionally focused on socio-demographic characteristics and trip characteristics [1,2]. The most studied aspects are age, gender, speed, formation of groups, shy away distance, travel purpose and walking direction [3,4]. Pedestrians may also be influenced by their own lifestyles, OD and attitudes [5,6,7]. Recently, the built environment has gained increasing attention in both the transportation planning field and the public health field. The built environment represents the real city, containing urban facilities, buildings, transport infrastructure and the design of these entities [8]. It has been shown to have a significant influence on pedestrian behavior [9,10]. The built environment consists of land use patterns (the distribution across space of activities and the buildings that house them), the transportation system (the physical infrastructure, such as network characteristics, as well as the service provided by the system), and urban design (the arrangement and appearance of physical elements) [2,11,12,13].
For route choice, factors may be divided into four categories: network characteristics [11], route characteristics [12,2,13], personal characteristics [1] and trip characteristics (Crow, 1998). Cross effects between items of these categories have been identified. Most pedestrian route choice models are based on shortest route calculations, usually shortest routes in time, and are formulated as deterministic shortest path-based methods, stochastic shortest path-based methods, or constrained enumeration methods [14,15].
During the event, many visitors gather in a city area with numerous exhibitions. Different people travel between the same two exhibitions or two activity points (locations where they stop for more than 15 minutes). Besides the shortest path, there may therefore be other alternative routes between two exhibitions or two activity points. These alternative routes may differ in the built environment alongside them. The differences in built environment, together with the visitors' choices, can help us understand the influence of the built environment on route choice. This research focuses on modeling DDW visitors' route choice behavior. Considering walking route choice only, the model takes into account the effects of socio-demographic characteristics and the built environment in the city.

Data
Two types of data were collected for this study: built environment data and DDW visitors' route choice data. The built environment data include the locations of exhibitions and information on DDW special facilities, which can be accessed from the DDW website, as well as building information, land use patterns, the transport system and the physical infrastructure. This information was imported from the latest OpenStreetMap data. A link map was then built. Each link represents one street segment and stores its built environment information as attributes:

Bus stops: whether the link has a bus stop (city bus or DDW bus).
Rent bikes: whether the link has a place to rent a bike (city bike or DDW bike).
Car parking: whether the link has a car park.
exhPSk: length of one link.
PSk: common path size for one route in the choice set.

To collect DDW visitors' route choice data, GPS loggers were used. We selected 4 of the 7 days of the DDW (covering both weekdays and the weekend) and distributed the GPS loggers randomly to visitors at an exhibition ticket office near Eindhoven central railway station. We approached visitors who arrived by train, bus, car, bike, on foot or by other means. Only travelers with a DDW ticket were asked to join the survey. The GPS logger is small and light, and has a button that shows when it is on and recording the location. It records a location coordinate every 5 seconds. A pencil-and-paper questionnaire was prepared to collect socio-demographic information, such as age, gender, number of previous visits to DDW, and familiarity with Eindhoven. The investigator turned on the GPS logger after the respondent had finished the paper questionnaire. The respondent could carry it in a bag, a pocket or elsewhere. Respondents were requested to return the GPS logger at the central railway station.

GPS post processing
The collected GPS data were exported as CSV files from GPS Photo Tagger and then imported into QGIS (Quantum Geographic Information System). The observed GPS data are shown in Figure 1. An activity during which a visitor stopped for more than 15 minutes was recognized as one stop.
The routes between the activity stops were extracted manually. Because the pedestrian network is dense in the city area and the GPS signal was disturbed by tall buildings, the GPS points often lay far from the route links on the map. Automated matching of the GPS data to the routes in the link map was therefore not suitable in this situation.
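The 15-minute stop rule can be illustrated with a minimal sketch. It assumes GPS fixes given as (time in seconds, x, y) tuples in a projected metric coordinate system, and it treats consecutive fixes that stay within a fixed radius of the first fix as one dwell. The 50 m radius and the function name `detect_stops` are hypothetical choices for illustration, not values taken from the study.

```python
import math

def detect_stops(fixes, radius_m=50.0, min_duration_s=15 * 60):
    """Return (start_time, end_time) pairs for dwells of at least min_duration_s.

    fixes: list of (t_seconds, x_m, y_m), sorted by time, in metric coordinates.
    radius_m: assumed tolerance for GPS scatter around a stop location.
    """
    stops = []
    i = 0
    n = len(fixes)
    while i < n:
        t0, x0, y0 = fixes[i]
        j = i
        # Extend the dwell while the next fix stays within radius_m of the anchor fix.
        while (j + 1 < n and
               math.hypot(fixes[j + 1][1] - x0, fixes[j + 1][2] - y0) <= radius_m):
            j += 1
        if fixes[j][0] - t0 >= min_duration_s:
            stops.append((t0, fixes[j][0]))
            i = j + 1  # continue after the dwell
        else:
            i += 1  # not a stop; try the next fix as anchor
    return stops
```

With one fix every 5 seconds, a 15-minute stop corresponds to roughly 180 consecutive fixes near the same location. A larger radius tolerates more GPS scatter near tall buildings but may merge neighbouring exhibitions into one stop.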

Choice set generation
Modeling route choice behavior with revealed choice data normally requires generating a reasonable choice set. There are many widely used methods to identify feasible alternative routes, such as K-shortest paths [16], link elimination and link penalty [17], branch and bound, simulation, and the labeling approach [18]. These methods focus on the characteristics of the route itself and pay less attention to the built environment alongside the route. However, for pedestrians on streets, especially during a big event, route choice may be influenced by the buildings, shops, activities and other elements of the city environment. So far, there is no established method for studying the relationship between the built environment and pedestrian route choice. We therefore do not generate the alternative routes with any of these methods or models.
Instead, the alternative routes are derived from the GPS data, in which many visitors traveled between the same two locations. The different routes observed between two locations can thus be captured. These routes can be considered potential alternatives for any traveler who walks from the origin location to the destination. In this research, we assume that the alternative routes are the combination of the shortest route between two locations and the routes chosen by all visitors who passed the same two locations.
The shortest route in this research is calculated based on the shortest distance. Most exhibitions were located in the pedestrian area in the city center, and some roads were also closed, so most visitors could walk freely, without being influenced by traffic lights or other restrictions. Thus, link elimination and link penalties are not considered in the shortest path calculation.
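The choice set assumption described above (shortest-distance route plus all distinct observed routes for the same origin and destination) can be sketched as follows. This is a minimal illustration under stated assumptions: the link map is represented as a dictionary of undirected links with lengths, routes are tuples of node identifiers, and the names `shortest_route` and `build_choice_set` are hypothetical. Dijkstra's algorithm is used here as one standard way to find the shortest-distance route; the source does not state which algorithm was applied.

```python
import heapq

def shortest_route(links, origin, destination):
    """Shortest-distance route as a tuple of nodes, or None if unreachable.

    links: dict mapping (node_a, node_b) -> length in metres; each link is
    assumed to be walkable in both directions.
    """
    graph = {}
    for (a, b), length in links.items():
        graph.setdefault(a, []).append((b, length))
        graph.setdefault(b, []).append((a, length))
    best = {origin: 0.0}
    previous = {}
    queue = [(0.0, origin)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == destination:
            break
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, length in graph.get(node, []):
            candidate = dist + length
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                previous[neighbour] = node
                heapq.heappush(queue, (candidate, neighbour))
    if destination not in best:
        return None
    route = [destination]
    while route[-1] != origin:
        route.append(previous[route[-1]])
    return tuple(reversed(route))

def build_choice_set(links, observed_routes, origin, destination):
    """Union of the shortest route and the distinct observed routes for one OD pair."""
    shortest = shortest_route(links, origin, destination)
    candidates = ([shortest] if shortest else []) + [tuple(r) for r in observed_routes]
    choice_set = []
    for route in candidates:
        # Keep only routes for this OD pair, without duplicates.
        if route[0] == origin and route[-1] == destination and route not in choice_set:
            choice_set.append(route)
    return choice_set
```

If the shortest route was also walked by a visitor, it appears only once in the resulting choice set.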

Discrete choice modeling framework

Multinomial logit model
The multinomial logit model is estimated on a subset of alternatives [19]. Conditional maximum likelihood estimation is used. The probability function is

$$P_{kn} = \frac{\exp(V_{kn})}{\sum_{l \in C} \exp(V_{ln})} \tag{1}$$

where Vkn is the utility of route k for individual n, Vln is the utility of route l in choice set C for individual n, and Pkn is the probability that individual n chooses route k.
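As a small numerical illustration of the multinomial logit probability, the sketch below computes choice probabilities for one choice set. The route lengths are hypothetical, and the utility contains only the log-length term with the coefficient -0.7378 reported in this paper; all other variables of the estimated model are left out for brevity.

```python
import math

def mnl_probabilities(utilities):
    """Choice probabilities exp(V_k) / sum_l exp(V_l) for one choice set."""
    v_max = max(utilities)  # subtract the maximum for numerical stability
    exp_v = [math.exp(v - v_max) for v in utilities]
    total = sum(exp_v)
    return [e / total for e in exp_v]

# Hypothetical route lengths in metres for three alternatives.
lengths_m = [400.0, 500.0, 650.0]
beta_ln_length = -0.7378  # log-length coefficient reported for the MNL model
utilities = [beta_ln_length * math.log(length) for length in lengths_m]
probabilities = mnl_probabilities(utilities)
```

Because length enters in log form, the ratio of two probabilities depends only on the ratio of the route lengths, not on their absolute difference.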

Path size logit model
Because of the way the choice set is generated, many alternative routes are concentrated in a limited city area. The alternative routes may therefore overlap and share common links. This commonality can introduce errors into the utilities of the alternative routes. We therefore use the Path Size Logit Model, which accounts for the similarity in the utility function [20]:

$$P_{kn} = \frac{\exp(V_{kn} + \beta_{ps} \ln PS_k)}{\sum_{l \in C} \exp(V_{ln} + \beta_{ps} \ln PS_l)} \tag{2}$$

$$PS_k = \sum_{a \in \Gamma_k} \frac{L_a}{L_k} \frac{1}{\sum_{l \in C} \delta_{al}} \tag{3}$$

where PSk and PSl are the commonality factors (path sizes) of routes k and l, Γk is the set of links in alternative k, C is the choice set of the decision maker, Lk is the length of route k, La is the length of link a, δal is the link-path incidence dummy, equal to one if route l uses link a and zero otherwise, and βps is a parameter to be estimated. The commonality factor PSk combines, over all links in the route, the ratio between link and route lengths with the number of routes that share each link. It decreases as the overlap between the routes increases. The commonality factors in equations (2) through (3) are always positive, between 0 and 1. Consequently, the estimated parameter βps should be positive to express the reduction of the utility of paths that share links with other routes. Also, for unique paths the argument of the logarithm equals one and the logged commonality factor is zero [21].

Proceedings from the 9th International Conference on Pedestrian and Evacuation Dynamics (PED2018), Lund, Sweden, August 21-23, 2018
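The path size and the path size logit probability can be made concrete with a short sketch. It follows the definitions given in the text (link length over route length, divided by the number of routes in the choice set that use the link). The route and link names, the lengths and the function names are hypothetical.

```python
import math

def path_sizes(routes, link_lengths):
    """PS_k = sum over links a of route k of (L_a / L_k) / (number of routes using a).

    routes: dict route_id -> list of link ids; link_lengths: dict link id -> length.
    """
    usage = {}
    for links in routes.values():
        for a in set(links):
            usage[a] = usage.get(a, 0) + 1
    sizes = {}
    for k, links in routes.items():
        route_length = sum(link_lengths[a] for a in links)
        sizes[k] = sum(link_lengths[a] / route_length / usage[a] for a in links)
    return sizes

def psl_probabilities(utilities, sizes, beta_ps):
    """Path size logit: exp(V_k + beta_ps * ln PS_k), normalised over the choice set."""
    scores = {k: utilities[k] + beta_ps * math.log(sizes[k]) for k in utilities}
    top = max(scores.values())  # subtract the maximum for numerical stability
    exp_s = {k: math.exp(s - top) for k, s in scores.items()}
    total = sum(exp_s.values())
    return {k: e / total for k, e in exp_s.items()}

# Hypothetical choice set: r1 and r2 share link "a", r3 is a unique path.
routes = {"r1": ["a", "b"], "r2": ["a", "c"], "r3": ["d"]}
link_lengths = {"a": 100.0, "b": 100.0, "c": 100.0, "d": 200.0}
sizes = path_sizes(routes, link_lengths)
equal_utilities = {"r1": 0.0, "r2": 0.0, "r3": 0.0}
p_positive = psl_probabilities(equal_utilities, sizes, beta_ps=1.0)
p_negative = psl_probabilities(equal_utilities, sizes, beta_ps=-1.0)
```

In this toy example the overlapping routes have a path size of 0.75 and the unique route has a path size of 1. With equal utilities, a positive βps shifts probability toward the unique route, while a negative βps shifts it toward the overlapping routes.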

Descriptive statistics
In total, 296 groups (565 persons) of visitors joined the GPS data collection. The distribution of their socio-demographic information and planned visiting information is shown in Table 2. Age was recorded for each individual in a visiting group, whereas familiarity and visit times were recorded per group. "Familiarity" means how familiar the visitors feel with Eindhoven. "Visit times" is the number of times the visitors have been to DDW before. The respondents are almost equally distributed between female and male. Respondents aged 19 to 30 account for more than half of the sample, which indicates that DDW attracts young adults the most. About 63.2% of the respondents are more or less familiar with Eindhoven, and about half of these respondents have been to the DDW in the past. These respondents have some knowledge of the routes and the built environment, which reduces the randomness of their route choice in the model.

Multinomial Logit Model estimation
The estimated parameters for the variables in the visitors' route choice model appear in Table 3. They include built environment variables and interactions between built environment variables and socio-demographic information. R was used to perform the choice model estimation.
The goodness of fit of the Multinomial Logit Model is described by rho-square. The null log-likelihood is -1832.0 and the final log-likelihood after including all variables is -1623.0. The rho-square value is 0.1140 and the adjusted rho-square is 0.1599.
Length enters the model in log form, which performs better than the untransformed length. The estimated parameter is -0.7378, which is negative and significant, indicating that visitors prefer shorter routes, as expected. The estimated parameter for the interaction of length and female is -0.8147, which is negative and significant. It means that females prefer shorter routes more strongly than males. The parameter for the interaction of shops and female is not significant, which shows that female DDW visitors have no significant preference for routes with more shops. This runs somewhat counter to the cliché. Moreover, older visitors prefer routes with more shops: as shown in Table 3, the parameter is +0.0428, positive and significant. Older visitors are also more attracted by routes with larger exhibitions: the parameter for the interaction of exhibition acreage/number and age is 0.0077, which is positive and significant.
Most of the exhibition types have no significant influence on route choice, except design management and industry design. The design management type influences visitors differently depending on their number of previous visits and their familiarity with Eindhoven, as shown in Table 3. The parameter for the interaction of design management and visit times is +0.1489, which is positive and significant. The parameter for the interaction of design management and familiarity is -0.2124, which is negative and significant. Visitors with more visiting experience prefer routes with more design management exhibitions, whereas visitors who are highly familiar with Eindhoven do not.
As for the other built environment variables, coffee, restaurant, bus stops and car parking have no significant influence on route choice. The results can be seen in Table 3.
As stated above, shops and exhibitions have a significant and positive influence on visitors. We can therefore conclude that the built environment does have a significant influence on route choice.

Path Size Logit Model estimation
Table 4 shows the estimated results of the Path Size Logit Model with built environment variables and socio-demographic information. R was used to perform the choice model estimation.
The goodness of fit of the Path Size Logit Model is described by rho-square. The null log-likelihood is -1832.0 and the final log-likelihood after including all variables is -1572.4. The rho-square value is 0.1417 and the adjusted rho-square is 0.1881.
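The rho-square values can be reproduced from the reported log-likelihoods with a few lines of arithmetic. The sketch below uses McFadden's definition, 1 - LL(final)/LL(null). The adjusted form shown is one common definition and is an assumption here, since the definition used and the number of estimated parameters are not stated in this section.

```python
def rho_square(ll_null, ll_final):
    """McFadden's rho-square: 1 - LL(final) / LL(null)."""
    return 1.0 - ll_final / ll_null

def adjusted_rho_square(ll_null, ll_final, n_parameters):
    """A common adjusted form (assumed here): 1 - (LL(final) - K) / LL(null)."""
    return 1.0 - (ll_final - n_parameters) / ll_null

# Log-likelihoods reported in the text for the two models.
rho_mnl = rho_square(-1832.0, -1623.0)   # Multinomial Logit Model
rho_psl = rho_square(-1832.0, -1572.4)   # Path Size Logit Model
```

Both results agree with the reported rho-square values of 0.1140 and 0.1417 to within rounding of the log-likelihoods. Under the adjusted form assumed in this sketch, the adjusted value can never exceed the unadjusted one, so the reported adjusted values may rest on a different definition.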
Length also enters this model in log form. However, the result for length differs from the result in Table 3: the estimated parameter is positive and not significant, which suggests that length by itself has no clear influence on route choice. The interaction of length and female is negative and significant, as in Table 3. Females still prefer shorter routes.
Older visitors prefer routes with more shops, as shown in Table 4. The estimated parameter for the interaction of shops and age is +0.0440, which is positive and significant. The parameter for the interaction of rent bikes and familiarity is -0.6525, which is negative and significant. Visitors who are highly familiar with Eindhoven prefer routes without bike rental points, because the streets with bike rental points were usually crowded with visitors who were going to rent a bike during the event.
As in the Multinomial Logit Model, most of the exhibition types have no significant influence on route choice, except design management, industry design and products design.
The design management type again influences visitors differently depending on their number of previous visits and their familiarity with Eindhoven, as shown in Table 4. The parameter for the interaction of design management and visit times is +0.1394, which is positive and significant. The parameter for the interaction of design management and familiarity is -0.2126, which is negative and significant. These results are almost the same as those of the Multinomial Logit Model in Table 3. Moreover, the parameter for the interaction of products design and age is -0.1174, which is negative and significant. Older visitors do not like routes with more products design exhibitions.
As stated above, shops and exhibitions have a significant and positive influence on visitors. According to the results of both models, we can conclude that the built environment does have a significant influence on route choice.
However, the estimated parameter of lnPSk is negative, which means that routes sharing more links with other routes have a higher probability of being chosen by visitors. This contradicts the commonly accepted result of Path Size Logit modeling theory. It may be caused by the different preconditions of this research. The subjects of this research are pedestrians, and their route choices were made during a big event. As stated in the previous part, several factors may be at work, including not only route characteristics but also the city's built environment. These may make the results differ from other studies that use the Path Size Logit Model. The result in Table 4 shows that sharing more links increases the utility of a route. Routes that share more links are closer to each other. We assume that the exhibitions on these routes may have cumulative synergistic effects. As the visitors came mainly for the exhibitions, the exhibitions may be an important factor. However, the estimated parameter of exhibition acreage/number is negative and not significant, and the correlation between exhibition acreage/number and lnPSk is also not significant. The exhibitions therefore do not seem to be the cause of the unexpected sign of the lnPSk parameter.
To check whether routes that share more links really have a higher utility for visitors, the correlation between lnPSk and the number of appearances of each route was calculated. The number of appearances of each route is taken from the observed GPS data and indicates how many times a route was traveled by all visitors. A route with more appearances is more popular with the visitors. The calculated correlation is -0.3783, which is not strong. Since a smaller PSk corresponds to more shared links, the negative sign nevertheless indicates that more popular routes tend to share more links. This can explain the result for lnPSk to some extent.
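The correlation check between lnPSk and route popularity can be sketched as below. The text does not state which correlation coefficient was used, so the Pearson coefficient is an assumption, and the path sizes and appearance counts in the example are hypothetical values, not the study data.

```python
import math

def pearson_correlation(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical routes: smaller path size means more links shared with other routes.
example_path_sizes = [1.0, 0.8, 0.6, 0.5]
example_appearances = [1, 5, 30, 74]
ln_ps = [math.log(ps) for ps in example_path_sizes]
r = pearson_correlation(ln_ps, example_appearances)
```

A negative coefficient arises when routes with smaller path sizes, that is, with more shared links, are traveled more often.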
The result in this research differs from route choice in daily situations. In daily situations, individuals seldom have the same ODs. The alternative routes in one choice set usually do not differ significantly in popularity, so the numbers of appearances of the routes traveled by individuals are similar. It therefore cannot be observed in daily situations that routes sharing more links are more popular. During the big event DDW, however, a lot of visitors gathered in a limited area and many visitors had the same ODs. The numbers of appearances of the routes differed significantly, and there were some popular routes. According to the GPS data, within one choice set the highest number of appearances of an alternative route can be 74, while the lowest can be 1. This large difference between the alternative routes reflects their different popularity. More popular routes usually share more links. The effect of shared links on route utility in a big event could therefore differ from that in daily situations. When individuals share more common ODs and links, the commonality factor could increase the utility of the route. This is also mentioned in Frejinger's research [22]. The estimated negative parameter of lnPSk captures an attractiveness of overlapping paths.

Discussion
The Multinomial Logit Model and the Path Size Logit Model estimated in this study indicate that pedestrian route choice in a big event is influenced by other factors besides route characteristics. In this research, visitors care more about the built environment, such as exhibitions and shops. Overall, visitors, especially female visitors, prefer shorter routes. However, given the variety of built environment factors, route length is not the most important factor for visitors.
The estimated coefficients give insight into the relationship between the built environment and route choice in a big event. They also help to understand the relationship between the event and the city. Different distributions of the event's facilities in the city could generate different pedestrian behaviors, and different pedestrian behaviors could influence the city in different ways. A better distribution plan for the event facilities, combined with the city environment, could give visitors a better experience.
This research recorded visitors' routes per visiting group, and most of the visitors traveled with others. The final observed route choice might be influenced by all group members, but we do not know the weight of each member in the final decision. In this research, we only consider the interaction between socio-demographic information and route choice. A better way to analyze group members and their route choice needs to be developed.
The reason for the unexpected sign of the lnPSk parameter needs further analysis. More empirical studies are needed.