With the rise of ubiquitous computing power and the Internet
of Things (IoT), there are many new sources of “big data” (data sets that
include millions of records) and thus many new opportunities for exploiting
this type of data. Much of the buzz
surrounding Internet startups centers on their ability to capture this type of “crowd-sourced”
data from their users, “mine” the data using advanced analytical
tools, and derive value from it for their users and the company. Now-familiar examples are sites like Yelp®,
which captures ratings from millions of users, and Fitbit®, which
captures health information from connected devices worn by its users.
Pavement management has always taken a data-driven approach
to decision making, and it is thus not surprising that there are already some
examples of this type of data collection in the pavement management community,
such as International Roughness Index (IRI) measurement via cell phones and road condition information from
Waze. However, these examples tend to be
small efforts (modelled on the Internet startup culture) rather than organized
efforts by pavement owners and managers to perform systematic collection on a
network. Therefore, there is a need for
a research study to determine how this type of data might be collected,
analyzed, and used in pavement management.
There are three possible sources of this type of data on
pavement networks: user ratings that are manually entered into some system
(such as potholes marked by users in a mapping application); data from devices,
such as cell phones, that might report a proxy for condition (such as the
previously mentioned IRI measurements); and data from connected vehicles
(especially self-driving cars). In
contrast to the highly organized data collection activities usually associated
with pavement management systems (PMS), this type of data tends to be sporadic and more error-prone,
to lack proper calibration and systematic quality assurance, and to measure
condition only indirectly. In addition, the
data must typically be processed to anonymize the source, which might remove
useful information. On the other hand,
this type of data can be collected in real time across the entire network, and
continuously throughout the year.
While it is easy to get excited about the possibility of
users reporting potholes in real time, so that maintenance forces can address
problems immediately, it is not clear how this data could be used effectively to
enhance PMS decision making from a strategic, long-term, network-level
perspective. For example, while continuous measurements of IRI across the
network might be very interesting, they may not influence decision making for
pavement preservation, because IRI is a lagging indicator of condition. By contrast, even a small number of skid
resistance estimates from self-driving vehicles might trigger a treatment.
The objective of this study would be to answer some
fundamental questions about the effective use of crowd-sourced data in pavement
management decision making. This study
would consider the following questions:
What are the likely types of data that might be
crowd-sourced? Which current data gaps could be addressed by this data?
Which specific network-level decision points
(e.g., preservation or rehabilitation) can be affected by this data? What
trigger levels would influence decision making?
What level of accuracy is needed or acceptable for
these types of data?
What are the possible quality assurance
processes for crowd-sourced data?
How can deficiencies in crowd-sourced data quality
be addressed to make the data useful for PMS decision making?
What are the most promising indicators, and how
might future research be focused in those areas?
The biggest potential benefit of the use of crowd-sourced
data in PMS would be improved timing of preservation interventions. If it were possible to use this data to catch
the very early signs of surface distress, and react quickly to perform
preventive maintenance, the greatest benefit/cost ratio
from funding could be achieved. However, the
most likely benefits, at least initially, are the triggering of
maintenance holding actions (such as patching) and the identification of
small areas of rapid failure before these require large-scale
intervention. Another benefit would be
that data would be collected in areas currently missed by PMS data collection,
such as ramps and connectors on state networks, or in IRI measurements on low-speed
Index Terms: Crowdsourcing, Pavement management systems, Data collection, Data mining, Internet, Internet of Things (IoT), Data analysis, Mobile communication systems, Mobile applications, Decision making, International Roughness Index