Google Brain super-resolution image tech makes “zoom, enhance!” real

Google Brain has devised new software that can create detailed images from tiny, pixelated source images. In short, Google’s software makes the "zoom in… now enhance!" TV trope actually possible.

First, take a look at the comparison image below. The left column contains the pixelated 8×8 source images, and the centre column shows the images that Google Brain’s software was able to create from those source images. For comparison, the real images are shown in the right column. As you can see, the software seemingly extracts an amazing amount of detail from just 64 source pixels.

Of course, as we all know, it’s impossible to recover detail that isn’t present in the source image—so how does Google Brain do it? With a clever combination of two neural networks.

The first part, the conditioning network, maps the 8×8 source image against other high-resolution images: it downsizes those high-res images to 8×8 and looks for a match.
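To make that concrete, here’s a minimal PyTorch sketch of what a conditioning network could look like: a small CNN that ingests the 8×8 input and emits logits for every pixel of an upscaled output. The 4× upscale factor, layer sizes, and names are illustrative assumptions on my part, not the paper’s exact architecture.

```python
import torch
import torch.nn as nn

class ConditioningNet(nn.Module):
    """Toy conditioning network: maps an 8x8 RGB image to per-pixel
    logits over 256 intensity levels at 32x32 resolution (4x upscale)."""
    def __init__(self, channels=64, levels=256):
        super().__init__()
        self.levels = levels
        self.net = nn.Sequential(
            nn.Conv2d(3, channels, kernel_size=3, padding=1),
            nn.ReLU(),
            # Two transposed convolutions upsample 8x8 -> 16x16 -> 32x32.
            nn.ConvTranspose2d(channels, channels, kernel_size=4, stride=2, padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(channels, channels, kernel_size=4, stride=2, padding=1),
            nn.ReLU(),
            # One logit per intensity level for each of the 3 colour channels.
            nn.Conv2d(channels, 3 * levels, kernel_size=1),
        )

    def forward(self, x):                 # x: (B, 3, 8, 8)
        logits = self.net(x)              # (B, 3*256, 32, 32)
        b, _, h, w = logits.shape
        return logits.view(b, 3, self.levels, h, w)
```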

  • Left column: source image. Other columns: various outputs produced by the neural networks. There’s a bit of variation; the fourth celebrity from the bottom is particularly scary.

  • Input: left column. Fourth column: the original image. Other columns: various super-resolution techniques. NN = nearest neighbour (looking for a high-res image in the dataset that closely matches the 8×8 image).

  • Various different super-resolution techniques. The three right-most columns are the Google Brain method.

The second part, the prior network, uses an implementation of PixelCNN to try and add realistic high-resolution details to the 8×8 source image. Basically, the prior network ingests a large number of high-res real images—of celebrities and bedrooms in this case. Then, when the source image is upscaled, it tries to add new pixels that match what it "knows" about that class of image. For example, if there’s a brown pixel towards the top of the image, the prior network might identify that as an eyebrow: so, when the image is scaled up, it might fill in the gaps with an eyebrow-shaped collection of brown pixels.
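The core trick in PixelCNN is the masked convolution: each output pixel is only allowed to see the pixels already generated above it and to its left, which is what lets the network build an image one pixel at a time. Here’s a minimal sketch of that idea in PyTorch, with illustrative layer sizes rather than the paper’s:

```python
import torch
import torch.nn as nn

class MaskedConv2d(nn.Conv2d):
    """Convolution whose kernel is masked so each output pixel only
    sees input pixels above it and to its left -- the PixelCNN trick."""
    def __init__(self, mask_type, *args, **kwargs):
        super().__init__(*args, **kwargs)
        assert mask_type in ("A", "B")  # "A" also hides the centre pixel
        self.register_buffer("mask", torch.ones_like(self.weight))
        _, _, kh, kw = self.weight.shape
        self.mask[:, :, kh // 2, kw // 2 + (mask_type == "B"):] = 0
        self.mask[:, :, kh // 2 + 1:] = 0

    def forward(self, x):
        self.weight.data *= self.mask   # enforce the mask at every call
        return super().forward(x)

class PriorNet(nn.Module):
    """Toy PixelCNN prior: predicts logits for each pixel of a 32x32
    image, conditioned only on the pixels generated before it."""
    def __init__(self, channels=64, levels=256):
        super().__init__()
        self.levels = levels
        self.net = nn.Sequential(
            MaskedConv2d("A", 3, channels, kernel_size=7, padding=3),
            nn.ReLU(),
            MaskedConv2d("B", channels, channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(channels, 3 * levels, kernel_size=1),
        )

    def forward(self, y):                 # y: (B, 3, 32, 32), partially filled
        b, _, h, w = y.shape
        return self.net(y).view(b, 3, self.levels, h, w)
```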

To create the final super-resolution image, the outputs from the two neural networks are fused, and the end result usually contains plausible new details.
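The underlying paper describes this fusion as summing the two networks’ per-pixel logits before a softmax, then sampling the output image one pixel at a time. Here’s a toy sketch of that loop, reusing the illustrative networks above; the function and parameter names are my own, and a real sampler would be batched and far more efficient.

```python
import torch
import torch.nn.functional as F

def fuse_and_sample(cond_net, prior_net, lowres, out_hw=(32, 32)):
    """Sketch of the fusion step: add the conditioning logits to the
    prior's logits and sample the output pixel by pixel in raster order."""
    h, w = out_hw
    cond_logits = cond_net(lowres)            # (B, 3, 256, H, W), computed once
    y = torch.zeros(lowres.size(0), 3, h, w)  # blank canvas, filled gradually
    for i in range(h):
        for j in range(w):
            prior_logits = prior_net(y)       # re-score the partial canvas
            logits = cond_logits[..., i, j] + prior_logits[..., i, j]
            probs = F.softmax(logits, dim=-1)             # (B, 3, 256)
            sample = torch.multinomial(probs.flatten(0, 1), 1)
            y[..., i, j] = sample.view(-1, 3).float() / 255.0
    return y                                  # (B, 3, 32, 32) in [0, 1]
```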

Google Brain’s super-resolution technique was reasonably successful in real-world testing. When human observers were shown a real high-resolution celebrity face vs. the upscaled computed image, they were fooled 10 percent of the time (50 percent would be a perfect score). For the bedroom images, 28 percent of humans were fooled by the computed image. Both scores are much more impressive than normal bicubic scaling, which fooled no human observers.

It’s important to note that the computed super-resolution image is not real. The added details—known as "hallucinations" in image processing jargon—are a best guess and nothing more. This raises some intriguing issues, especially in the realms of surveillance and forensics. This technique could take a blurry image of a suspect and add more detail—zoom! enhance!—but it wouldn’t actually be a real photo of the suspect. It might very well help the police find the suspect, though.

Google Brain and DeepMind are two of Alphabet’s deep learning research arms. The former has published some interesting research recently, such as two AIs creating their own cryptographic algorithm; the latter, of course, was thrust into the limelight last year when its AlphaGo AI defeated the world’s best Go players.

DOI: arXiv:1702.00783 (About DOIs).

This post originated on Ars Technica UK
