Style Transfer iOS Application Using Convolutional Neural Networks
Training a Style Transfer Model using Turi Create to Create Artistic Images
Neural style transfer allows you to extract the “style” of an image and apply it to the content of another. This allows developers, with very little effort, to copy the style of a great master and apply it to a picture of their cat (as just one example). A very interesting prospect!
Neural style transfer, or simply style transfer, has recently become quite popular, especially with the notoriety of applications such as Prisma. It emerged from a period of rapid development in neural networks for various applications, and especially for art. A few months ago, Deep Dream appeared: a program that highlights non-existent patterns in images, creating what could be considered an artistic style in its own right.
This article will cover a bit of theory, then walk step by step through the creation of an iOS application and the training of a simple style transfer model.
Why a mobile application?
Let’s imagine you’ve built an application centered on user-generated content (mainly images), and you want to give your users the ability to add some style to their photos. The idea is to always give the user an incentive to be creative with their content and drive more engagement, and on-device style transfer makes this kind of continued engagement possible.
I can also picture offering different styles for each market: for example, “Claude Monet” for the French market, or “Edward Hopper” and “Andy Warhol” for the U.S. market. The use cases are endless. You could even give users the power to create their own styles. That won’t be covered in this tutorial, but it’s an enticing possibility.
Why Turi Create?
Throughout this tutorial, I’ll be using Turi Create, which is a high-level Python library that simplifies the development of custom machine learning models. Most importantly, you can export your model into a .mlmodel file that can be parsed by Xcode and used on-device. There are, of course, other methods for building mobile-ready models, which are mostly server-side solutions and use advanced techniques.
What is style transfer?
In a paper published in September 2015, researchers from Tübingen, Germany, and Houston, TX, introduced an algorithm that uses deep learning to create “artistic images of high perceptual quality.” Their paper introduced the idea that the representations of style and content can be separated within a certain type of neural network. This opened the pathway to neural style transfer, which was then quickly extended and improved by other papers: optimizations, applications to sound, and even applying styles to video.
To achieve this feat, the authors of the original paper built a network capturing the texture information of an image, but not the elemental organization of it. Once this texture information is stored in a network, it’s possible to apply it to a different image.
Style transfer is an optimization problem: we try to apply a pre-calculated model (the style) to an input image. For this, we define an objective function (loss function) that we want to minimize. It’s a weighted sum of two errors (losses): the error between the original image and the produced image, and the error between the original style and the applied one. By adjusting the weighting parameters, we can give more importance to the original image (the content image) or to the style used (extracted from the style image).
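As a toy illustration of that weighted sum (not Turi Create’s actual internals), the objective can be written as a tiny function, where `alpha` and `beta` are the weighting parameters:

```python
def total_loss(content_loss: float, style_loss: float,
               alpha: float = 1.0, beta: float = 100.0) -> float:
    """Weighted objective minimized in style transfer.

    alpha weighs fidelity to the content image,
    beta weighs fidelity to the style image.
    The numeric values here are illustrative defaults, not the paper's.
    """
    return alpha * content_loss + beta * style_loss

# A larger beta/alpha ratio pushes the output toward the style image;
# a larger alpha/beta ratio keeps it closer to the content image.
style_heavy = total_loss(0.5, 0.02)                      # beta dominates
content_heavy = total_loss(0.5, 0.02, alpha=10.0, beta=1.0)
```

Minimizing this sum with respect to the output image’s pixels is what the optimization actually does; the weights only decide which of the two errors matters more.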
That’s where deep learning comes in handy. The core idea behind deep learning is to structure a task as layers connected to each other, each performing operations at a different level of abstraction. For example, an image recognition network could consist of a layer working on raw pixels, connected to a layer recognizing simple edges, which is itself connected to a pattern-recognizing layer, then layers for parts of objects, then whole objects, and so on.
As a multi-layer system, deep learning admits a wide diversity of possible topologies. For style transfer, the main network used is VGG (named after Oxford’s Visual Geometry Group), a 16-layer network known for strong results in image recognition. You can check out the architecture of the network used in Turi Create.
Train the model
We’re going to use the built-in method offered by Turi Create.
Here are all the steps:
- GPU usage: I’m using all the GPUs on my computer (-1), but you can set this to 0 to use the CPU only, or to n for the number of GPUs you want to use.
- Load the styles: Loading the folder that contains all the style images.
- Load the training images: Loading the folder that contains all the content images that the model will train on to perfect the style transfer.
- Create the model: Turi Create will do this work for us — we just need the training data, the style images, and the number of iterations (10,000 by default). I chose to try 10,000, 20,000, and 30,000 iterations. You’ll see shortly that this makes a huge difference. I also chose to use data augmentation, which resizes, crops, and rotates images to help diversify the dataset.
- Test the model: Loading images to test the model with all the styles.
- Save the model: Saving the model so that we can use it later if we want to export it in another format.
- Export a .mlmodel file: That’s the file format that can be parsed by Xcode for our iOS application.
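Put together, the steps above can be sketched with Turi Create’s Python API. The folder names, iteration count, and output filenames below are my own choices, so adapt them to your setup:

```python
import turicreate as tc

# GPU use: every available GPU (-1); set to 0 for CPU only, or n for n GPUs.
tc.config.set_num_gpus(-1)

# Load the style images and the content (training) images.
styles = tc.load_images('style/')
training_data = tc.load_images('content/')

# Create and train the style transfer model.
model = tc.style_transfer.create(styles, training_data, max_iterations=20000)

# Test the model: stylize a few held-out images with all the styles.
test_images = tc.load_images('test/')
stylized = model.stylize(test_images)

# Save the model for later reuse, then export it for Xcode.
model.save('StyleTransfer.model')
model.export_coreml('StyleTransfer.mlmodel')
```

The exported .mlmodel file is the one we’ll drag into the Xcode project later on.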
Evaluate different models
I used two styles:
- Style number 1: A Moroccan zellige, a technique typical of Maghreb architecture that consists of assembling pieces of enameled terracotta tiles of different colors to achieve a geometric decoration. The shards of faience are sometimes so fine that the result is a true ceramic marquetry.
- Style number 2: An art piece made by a young Moroccan artist. You can check out his Instagram page.
- Style number 1:
- Style number 2:
We can clearly see that the number of iterations affects how well textures transfer from the style image to the input image. Some would also argue that we could increase the number of content images; in my case I used 19 images, but try to include as many as you can.
I’ve been training with the free Tesla K80 GPU offered by Google, and it’s still a lot of computation. What’s interesting is that training time increases linearly with the number of iterations, which at least makes the cost predictable. It took some time to force Turi Create to use the GPU on Colab, but it’s working perfectly now.
Here’s the Colab Notebook
Build the iOS application
Create a new project
To begin, we need to create an iOS project using the single view app template, making sure to choose Storyboard in the “User Interface” dropdown menu (Xcode 11 only):
Now we have our project ready to go. I don’t like using storyboards myself, so the app in this tutorial is built programmatically, which means no buttons or switches to toggle — just pure code 🤗.
To follow this method, you’ll have to delete the main.storyboard and set your SceneDelegate.swift file (Xcode 11 only) like so:
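Here’s a minimal sketch of what that SceneDelegate.swift looks like, assuming our root view controller is the ViewController we build next:

```swift
import UIKit

class SceneDelegate: UIResponder, UIWindowSceneDelegate {

    var window: UIWindow?

    func scene(_ scene: UIScene, willConnectTo session: UISceneSession,
               options connectionOptions: UIScene.ConnectionOptions) {
        // With the storyboard deleted, we create the window ourselves
        // and install our ViewController as the root.
        guard let windowScene = (scene as? UIWindowScene) else { return }
        window = UIWindow(windowScene: windowScene)
        window?.rootViewController = ViewController()
        window?.makeKeyAndVisible()
    }
}
```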
With Xcode 11 you’ll have to change the Info.plist file like so:
You need to delete the “Storyboard Name” in the file, and that’s about it.
Now let’s set our ViewController with the buttons and a logo. I used some custom buttons in the application — you can obviously use the system button.
First, you need to subclass UIButton to create your own custom button. We inherit from UIButton because the custom button “is” a UIButton: we want to keep all of its behavior and override only its look:
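A minimal version of such a subclass might look like this; the colors, font, and corner radius are placeholders, so adapt them to your own design:

```swift
import UIKit

// Custom button: inherits everything from UIButton, changes only the look.
class BtnPleinLarge: UIButton {

    override init(frame: CGRect) {
        super.init(frame: frame)
        setupButton()
    }

    required init?(coder: NSCoder) {
        super.init(coder: coder)
        setupButton()
    }

    private func setupButton() {
        backgroundColor = .systemBlue
        setTitleColor(.white, for: .normal)
        titleLabel?.font = UIFont.boldSystemFont(ofSize: 18)
        layer.cornerRadius = 10
        // Required for programmatic Auto Layout.
        translatesAutoresizingMaskIntoConstraints = false
    }
}
```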
BtnPleinLarge is our new button, and we use it to create the two main buttons for ViewController.swift, our main view.
I have two styles in my model, so I’ll make one button for each style.
Now, set the layout and buttons with some logic as well:
We now need to set up some logic. It’s important to change the Info.plist file and add a property so that an explanation of why we need access to the camera and the library is given to the user. Add some text to the “Privacy — Photo Library Usage Description”:
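In source form, that Info.plist entry is the NSPhotoLibraryUsageDescription key; the description string below is just an example:

```xml
<key>NSPhotoLibraryUsageDescription</key>
<string>We need access to your photo library so you can choose a picture to stylize.</string>
```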
Of course, you need to set up the layout and add the subviews to the view, too. I’ve added a logo on top of the view as well:
Output ViewController: Where We Show Our Result
Here, we need two things:
1. Our transformed image:
2. A button to dismiss the view:
Don’t forget to add the subviews to the main view and set up the layout, too.
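Putting those pieces together, the output view controller might look something like this; the class name, layout constants, and button title are all illustrative:

```swift
import UIKit

class OutputViewController: UIViewController {

    // The transformed image, shown full width.
    let imageView: UIImageView = {
        let iv = UIImageView()
        iv.contentMode = .scaleAspectFit
        iv.translatesAutoresizingMaskIntoConstraints = false
        return iv
    }()

    // A button to dismiss the view.
    let dismissButton: UIButton = {
        let button = UIButton(type: .system)
        button.setTitle("Done", for: .normal)
        button.translatesAutoresizingMaskIntoConstraints = false
        return button
    }()

    override func viewDidLoad() {
        super.viewDidLoad()
        view.backgroundColor = .white
        view.addSubview(imageView)
        view.addSubview(dismissButton)
        dismissButton.addTarget(self, action: #selector(close), for: .touchUpInside)

        NSLayoutConstraint.activate([
            imageView.topAnchor.constraint(equalTo: view.safeAreaLayoutGuide.topAnchor),
            imageView.leadingAnchor.constraint(equalTo: view.leadingAnchor),
            imageView.trailingAnchor.constraint(equalTo: view.trailingAnchor),
            imageView.bottomAnchor.constraint(equalTo: dismissButton.topAnchor, constant: -16),
            dismissButton.centerXAnchor.constraint(equalTo: view.centerXAnchor),
            dismissButton.bottomAnchor.constraint(equalTo: view.safeAreaLayoutGuide.bottomAnchor,
                                                  constant: -16)
        ])
    }

    @objc private func close() {
        dismiss(animated: true)
    }
}
```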
Set up the delegate
Before we can pass the image through the model, we need to resize the original image to a square, which is the shape the model expects (256x256 by default). Using a square means we don’t lose much quality, and I also noticed that the model can support input sizes up to 1024x1024, which is the size I chose. That way the image keeps decent quality, unlike a pixelated 256x256 image.
Here’s our helper function:
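A helper along these lines does the job; this is my reconstruction using a UIImage extension, with the target side length passed in by the caller:

```swift
import UIKit

extension UIImage {
    /// Redraws the image into a square of the given side length.
    /// The side length (1024 in my case) matches the model's input size.
    func resizedSquare(side: CGFloat) -> UIImage? {
        let size = CGSize(width: side, height: side)
        // Scale of 1.0 keeps the pixel dimensions exactly side x side.
        UIGraphicsBeginImageContextWithOptions(size, false, 1.0)
        defer { UIGraphicsEndImageContext() }
        draw(in: CGRect(origin: .zero, size: size))
        return UIGraphicsGetImageFromCurrentImageContext()
    }
}
```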
Now that we have our helper, we can access the image from the library with the help of UIImagePickerController:
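A sketch of that delegate code follows; it assumes a square-resizing helper and an output view controller named as in this tutorial, both of which are my own naming choices:

```swift
import UIKit

extension ViewController: UIImagePickerControllerDelegate,
                          UINavigationControllerDelegate {

    // Wired to one of the style buttons via addTarget.
    @objc func openLibrary() {
        let picker = UIImagePickerController()
        picker.sourceType = .photoLibrary
        picker.delegate = self
        present(picker, animated: true)
    }

    func imagePickerController(_ picker: UIImagePickerController,
                               didFinishPickingMediaWithInfo info: [UIImagePickerController.InfoKey: Any]) {
        picker.dismiss(animated: true)
        guard let image = info[.originalImage] as? UIImage,
              let squared = image.resizedSquare(side: 1024) else { return }

        // Run `squared` through the Core ML model here, then show the result.
        let output = OutputViewController()
        output.imageView.image = squared
        present(output, animated: true)
    }
}
```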
I would say that any application that does any kind of image processing should have some kind of filters or style transfer, because users are steadily coming to expect it right inside the application.
With the help of Turi Create, developers have no excuse not to implement it. One thing to remember: the style images should be interesting enough to transfer compelling textures to the input image.
The biggest problem with style transfer is the amount of processing power needed to perfect the style. You can still use the free Tesla K80 GPU offered by Google, but it still comes up short when fine-tuning the model for the best results. I would say that if styling is a core feature in your application, you can definitely invest in a GPU tower.
If you liked this piece, please clap and share it with your friends. If you have any questions don’t hesitate to send me an email at firstname.lastname@example.org.
This project is available to download from my GitHub account.
And the Colab notebook to use the free GPU offered by Google:
Editor’s Note: Heartbeat is a contributor-driven online publication and community dedicated to exploring the emerging intersection of mobile app development and machine learning. We’re committed to supporting and inspiring developers and engineers from all walks of life.
Editorially independent, Heartbeat is sponsored and published by Fritz AI, the machine learning platform that helps developers teach devices to see, hear, sense, and think. We pay our contributors, and we don’t sell ads.
If you’d like to contribute, head on over to our call for contributors. You can also sign up to receive our weekly newsletters (Deep Learning Weekly and Heartbeat), join us on Slack, and follow Fritz AI on Twitter for all the latest in mobile machine learning.