Tobias Weyand, a computer vision specialist at Google, and a couple of his co-workers have developed a deep-learning machine that determines, with surprising accuracy, where a photo was taken, even when the image contains no direct information about its location. Humans are actually pretty damn good at this: we use our problem-solving skills and knowledge of the world to piece together small details in a photo and work out where it was taken. Google’s new machine does exactly the same thing, albeit far more accurately.
Here’s a tricky task. Pick a photograph from the Web at random. Now try to work out where it was taken using only the image itself. If the image shows a famous building or landmark, such as the Eiffel Tower or Niagara Falls, the task is straightforward. But the job becomes significantly harder when the image lacks specific location cues, is taken indoors, or shows only a pet, food, or some other detail.

Nevertheless, humans are surprisingly good at this task. To help, they bring to bear all kinds of knowledge about the world, such as the type and language of signs on display, the types of vegetation, architectural styles, the direction of traffic, and so on. Humans spend a lifetime picking up these kinds of geolocation cues.

So it’s easy to think that machines would struggle with this task. And indeed, they have. Today, that changes thanks to the work of Tobias Weyand, a computer vision specialist at Google, and a couple of pals. These guys have trained a deep-learning machine to work out the location of almost any photo using only the pixels it contains.

Their new machine significantly outperforms humans and can even use a clever trick to determine the location of indoor images and pictures of specific things, such as pets or food, that contain no location cues at all.
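To make “location from pixels alone” concrete, here is a minimal sketch of one way such a system can be framed: partition the Earth’s surface into discrete cells and train a convolutional network to classify which cell a photo belongs to. This is only an illustration of the general idea, not Weyand and co’s actual model; the cell count, the layer sizes, and the names (`GeoClassifier`, `NUM_CELLS`) are all assumptions made up for this sketch.

```python
import torch
import torch.nn as nn

# Illustrative sketch only: geolocation posed as classification over a
# grid of geographic cells. All sizes and names here are hypothetical.
NUM_CELLS = 26_000  # assumed number of cells covering the Earth

class GeoClassifier(nn.Module):
    def __init__(self, num_cells: int = NUM_CELLS):
        super().__init__()
        # A tiny stand-in for a real image backbone.
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # pool to one 64-d vector per image
        )
        self.head = nn.Linear(64, num_cells)

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        x = self.features(images).flatten(1)
        return self.head(x)  # logits, one per geographic cell

model = GeoClassifier()
photo = torch.randn(1, 3, 224, 224)        # stand-in for an RGB photo
cell_probs = model(photo).softmax(dim=-1)  # distribution over cells
print(cell_probs.argmax(dim=-1))           # index of the most likely cell
```

One appeal of the classification framing, as opposed to regressing a single latitude and longitude, is that the network’s output is a probability distribution over the whole globe, so an ambiguous photo can spread its probability mass across several plausible regions.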