Cycling Through Artistic Styles

My earlier post on cycling art provided an engaging way to consider the creative potential of deep learning. I have found myself frequently gravitating back to the idea, using the latest code available over at fast.ai. The method uses a neural network to combine the content of a photograph with the style of an artist, but I have found that it takes a few trials to find the right balance of content versus style. This led to the idea of generating a range of images and then running them together as a movie that gradually shifts from the base image to a raw interpretation of the artist’s style.

Artistic styles

I tried a range of artistic styles, from impressionist to abstract, and found that the weights that produced the most interesting images varied with both the photograph and the artistic style.

My selected best images are shown below, next to snippets of the corresponding artworks. It turned out that the impressionist artists (Monet, Van Gogh, Cézanne and Braque) maintained the content of the image, in spite of being more heavily weighted to artistic style. In contrast, the more monochromatic styles (O’Keeffe, Polygons, Abstract as well as Dali) needed to be more strongly weighted towards content, in order to preserve the cyclist in the image. The selections for Picasso and Pollock were evenly balanced.

Every image is unique, and sometimes real surprises pop up. For example, using Picasso’s style, the mountains are interpreted as rooftops, complete with windows and doors. Strange eyes peer out of the background of finger-shapes in the Dali image, and the mountains have become Monet’s water lilies. The Pollock image came out very nicely.

Deep learning

The approach was based on the method described in the paper referenced below. With the code running on a cloud-based GPU, it took about 30 seconds for a neural network to learn to generate an image with the desired characteristics. The learning process was achieved by minimising a loss function, using gradient descent. The clever part lay in defining an appropriate loss function. In this instance, the sample image was passed through a separate pre-trained neural network (VGG16), where the activations, at various layers in the network, were compared to those generated by the photograph and the artwork. The loss function combined the difference in photographic content with the difference in artistic style, where the critical parameter was the content weighting factor.
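The shape of such a loss function can be sketched in a few lines of plain Python. This is an illustration only, not the fast.ai code: the function names are hypothetical, the inputs are short lists standing in for VGG16 activations and style statistics, and the assumption that the content weight multiplies the content term while the style term is left unweighted is mine.

```python
def mean_squared_difference(a, b):
    """Mean squared difference between two equal-length lists of numbers."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def combined_loss(generated, photo, generated_style, artwork_style, content_weight):
    """Weighted sum of a content term and a style term.

    The four lists are stand-ins for activations (content) and style
    statistics (style); how they are extracted is outside this sketch.
    """
    content_term = mean_squared_difference(generated, photo)
    style_term = mean_squared_difference(generated_style, artwork_style)
    # Assumption: only the content term is scaled by the weighting factor.
    return content_weight * content_term + style_term


# Toy numbers: the content term is 1.0 and the style term is 4.0.
loss_low = combined_loss([1.0, 2.0], [0.0, 1.0], [3.0, 3.0], [1.0, 1.0], content_weight=0.1)
loss_high = combined_loss([1.0, 2.0], [0.0, 1.0], [3.0, 3.0], [1.0, 1.0], content_weight=100.0)
```

With a small content weight the style mismatch dominates the total, so gradient descent mostly chases the artwork; with a large one the photograph dominates.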

I decided to vary the content weighting factor logarithmically between around 0.1 and 100, to obtain a full range of content to style combinations. A movie was then produced simply by packing together the images one after the other.
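A logarithmic sweep means that each weight is a constant multiple of the previous one. The following is a minimal sketch of how such a list might be built; the helper name and the choice of seven frames are assumptions for illustration, not values from the original run.

```python
def log_spaced_weights(low, high, count):
    """Return `count` values from `low` to `high`, evenly spaced on a log scale."""
    if count < 2:
        raise ValueError("count must be at least 2")
    if low <= 0 or high <= 0:
        raise ValueError("low and high must be positive")
    # Each step multiplies the previous weight by the same ratio.
    ratio = (high / low) ** (1.0 / (count - 1))
    return [low * ratio ** i for i in range(count)]


# Hypothetical frame count; a real movie would likely use many more frames.
weights = log_spaced_weights(0.1, 100.0, 7)

# Highest content weight first, so the frames run from photograph to pure style.
frame_order = sorted(weights, reverse=True)
```

One image would be generated per weight, and the images saved in `frame_order` would form the frames of the movie.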

References

A Neural Algorithm of Artistic Style, Leon A. Gatys, Alexander S. Ecker, Matthias Bethge

Deep Learning – Cycling Art

I’ve always been fascinated by the field of artificial intelligence, but it is only recently that significant and rapid advances have been made, particularly in the area of deep learning, where artificial neural networks are able to learn complex relationships. Back in the early 1990s, I experimented with forecasting share prices using neural networks. Performance was not much better than the linear models we were using at the time, so we never managed money this way, though I did publish a paper on the topic.

I am currently following an amazing course offered by fast.ai that explains how to programme and implement state-of-the-art techniques in deep learning. Image recognition is one of the most interesting applications. Convolutional neural networks are able to recognise the content and style of images. It is possible to explore what the network has “learnt” by examining the content of the intermediate layers, between the input and the output.

Over the last week I have been playing around with some Python code, provided for the course, that uses a package called Keras to build and run networks on a GPU using Google’s TensorFlow infrastructure. Starting with a modified version of the publicly available network called VGG16, which has been trained to recognise images, the idea is to combine the content of a photograph with the style of an artist.

An image is presented to the network as an array of pixel values. These are passed through successive layers, where a series of transformations is performed. These allow the network to recognise increasingly complex features of the original image. The content of the image is captured by refining an initially random set of pixels, until it generates similar higher level features.
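The refinement step can be illustrated with a toy example in plain Python. Everything here is a stand-in: a small fixed matrix plays the role of a pre-trained network layer, three numbers play the role of an image, and the function names, learning rate and step count are assumptions chosen for the sketch.

```python
import random


def features(pixels, layer):
    """Fixed linear map standing in for one layer of a pre-trained network."""
    return [sum(w * p for w, p in zip(row, pixels)) for row in layer]


def content_loss(pixels, target_features, layer):
    """Mean squared difference between the image's features and the target's."""
    diffs = [f - t for f, t in zip(features(pixels, layer), target_features)]
    return sum(d * d for d in diffs) / len(diffs)


def refine(pixels, target_features, layer, learning_rate=0.1, steps=200):
    """Gradient descent on the pixels; the layer itself is never changed."""
    pixels = list(pixels)
    n = len(target_features)
    for _ in range(steps):
        diffs = [f - t for f, t in zip(features(pixels, layer), target_features)]
        # Gradient of the mean squared difference with respect to each pixel.
        grads = [
            (2.0 / n) * sum(diffs[i] * layer[i][j] for i in range(n))
            for j in range(len(pixels))
        ]
        pixels = [p - learning_rate * g for p, g in zip(pixels, grads)]
    return pixels


rng = random.Random(0)
layer = [[1.0, 0.5, 0.0], [0.0, 1.0, 0.5]]  # hypothetical two-feature "layer"
photo = [0.2, 0.8, 0.4]                      # toy "photograph" of three pixels
target = features(photo, layer)
start = [rng.random() for _ in photo]        # initially random pixels
result = refine(start, target, layer)
```

The refined pixels reproduce the photograph's features without necessarily reproducing its exact pixel values, which is the sense in which only the content is captured.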

The style of an artist is represented in a slightly different way. This time an initially random set of pixels is modified until it matches the overall mixture of colours and textures, in the absence of positional information.
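In the paper by Gatys et al. listed in the references, this position-free summary is a Gram matrix: the correlations between feature channels, summed over all positions. Below is a minimal plain-Python sketch with made-up numbers; the helper names are hypothetical, the division by the number of positions is one possible normalisation, and whether the course code computes it in exactly this way is an assumption.

```python
def gram_matrix(feature_maps):
    """Channel-by-channel correlations, averaged over positions.

    `feature_maps` is a list of channels, each a flat list of activations
    measured at the same positions in the image.
    """
    n_positions = len(feature_maps[0])
    return [
        [sum(a * b for a, b in zip(ch_i, ch_j)) / n_positions for ch_j in feature_maps]
        for ch_i in feature_maps
    ]


def shuffle_positions(feature_maps, order):
    """Apply the same reordering of positions to every channel."""
    return [[channel[k] for k in order] for channel in feature_maps]


# Two toy channels with four positions each.
maps = [[1.0, 0.0, 2.0, 1.0], [0.0, 3.0, 1.0, 1.0]]
moved = shuffle_positions(maps, [2, 0, 3, 1])

style = gram_matrix(maps)
style_moved = gram_matrix(moved)
```

Shuffling the positions leaves the matrix unchanged, so it records which textures and colours occur together but not where they sit in the picture.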

Finally, a new image is created, again starting from random pixels, but this time matching both the content of the photograph and the style of the artist. The whole process takes about half an hour on my MacBook Pro, though I also have access to a high-spec GPU on Amazon Web Services to run things faster.

Here are some examples of a cyclist in the styles of Cézanne, Braque, Monet and Dali. The Cézanne image worked pretty well. I scaled up the content versus style for Braque. The Monet picture confuses the sky and trees. And the Dali result is just weird.

References

Trained to Forecast – Risk Magazine, January 1993

Deep Learning for Coders

A Neural Algorithm of Artistic Style, Leon A. Gatys, Alexander S. Ecker, Matthias Bethge