This essay is based on the following paper:
- Newson, Paul, and John Krumm. “Hidden Markov map matching through noise and sparseness.” Proceedings of the 17th ACM SIGSPATIAL international conference on advances in geographic information systems. ACM, 2009. [pdf]
Matching GPS points to roads is the foundation of all kinds of traffic measurement using trajectories data as the input. This essay will give a brief introduction to the trajectories map matching method and I also write a project based on Python to use this method to match the data.
Here is a description of the map matching problem, the grey lines in the figure are the road of Ann Arbor while the light blue lines are trajectories data (GPS points). However, we cannot determine which road the GPS point belonged to without map matching and this is what map matching can do.
I have put the project in my GitLab account, this project is open for everyone in Michigan Traffic Lab, feel free to request the right to edit/view this project if you are a member of Michigan Traffic Lab or a student in U of M. I will be glad to share with you if you are interested. [Project Link Here]
The key problem in map matching is the tradeoff between the roads suggested by the location data and the feasibility of the path. One simple method to determine the link the one GPS point belongs to is the find the nearest link among all the candidate links. However, this method ignore the relation between the GPS points and the feasibility of the path.
The following figures give the method in the paper to do the map matching using a Hidden Markov Model. (Figures are also from the reference)
There are two parts of the model, the first is the measurement probabilities:
Then is the transition probabilities:
where the $d_t$ equals to:
Then this problem can be converted to a dynamic programming problem (DP).
Some references here:
The input of the algorithm is the map information and the trajectories data shown as below.
To match the trajectory to the map, we first determine the block zone of the trajectory (red block in the figure) and then remove the road that are away from the block zone. Then we can calculate the distance between the trajectory point and the road. The a threshold was chosen to roughly select the candidate roads. As shown in the figure, the blue points are the trajectory points and the green points are the candidiate projected points. The red points are the map-matching results using dynamic programming.
The following figure gives the result of the dynamic programming. Axis x is index of time stamp (from $t=0$) and axis y is the index of the candidate road. The black lines are the whole network and the red line shows the global optimal solution of the problem (the chosen road).
Here are some map-matching results using the trajectory data with sampled period $T=60s$.