iOS Cat and Dog Image Classifier With CoreML and Keras

Deep learning is a popular and interesting subset of machine learning that brings neural networks into the limelight. Many complex tasks, such as image classification and speech recognition, can be achieved with the help of deep learning. We’ll be focusing only on image classification in this post.

I’m no data scientist, so I won’t be digging deep into deep learning and neural networks.

The goal of this post is to create an image classifier model to differentiate between cats and dogs.

Technologies Used

  • Keras – A high-level neural network API in Python, with TensorFlow as its backend.
  • CNN Model – Convolutional neural networks (CNNs) are the preferred models for image classification.

    A CNN starts by learning low-level features and goes on to learn more specific, complex features in the deeper layers. It takes its input as a matrix of shape [IMAGE_WIDTH, IMAGE_HEIGHT, IMAGE_CHANNEL].

  • Google Colab – For training our model. It provides a free GPU!
  • coremltools – For converting the Keras .h5 model to a Core ML .mlmodel.
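
To make the input matrix from the CNN bullet concrete, here is a minimal sketch in plain Python that tracks how the shape of a 150 x 150 RGB image changes as it passes through a small stack of convolution and max-pooling layers. The 150 x 150 size matches the input size used by the iOS code later in this post. The layer stack itself (3 x 3 convolutions with 32, 64, and 128 filters, each followed by 2 x 2 pooling) is an assumption chosen for illustration and is not necessarily the architecture of the trained model.

```python
def conv_output_shape(shape, filters, kernel=3):
    """Shape after an unpadded convolution with stride 1."""
    width, height, _ = shape
    return (width - kernel + 1, height - kernel + 1, filters)


def pool_output_shape(shape, pool=2):
    """Shape after non-overlapping max pooling; channels are unchanged."""
    width, height, channels = shape
    return (width // pool, height // pool, channels)


def trace_shapes(input_shape, filter_counts):
    """Return the input shape followed by the shape after each conv + pool pair."""
    shapes = [input_shape]
    shape = input_shape
    for filters in filter_counts:
        shape = pool_output_shape(conv_output_shape(shape, filters))
        shapes.append(shape)
    return shapes


# Hypothetical layer stack: 32, 64, then 128 filters.
shapes = trace_shapes((150, 150, 3), [32, 64, 128])
for shape in shapes:
    print(shape)
# (150, 150, 3)
# (74, 74, 32)
# (36, 36, 64)
# (17, 17, 128)
```

The spatial size shrinks while the channel count grows, which mirrors the move from low-level features in the early layers to more specific features in the deeper ones.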

Approach

We already have a small training data set of cat and dog images.
We trained our model on this data using Google Colab, and it reached an accuracy of 70%. This isn’t that bad, since the data set is small and we trained for only 5 epochs. You can try increasing the number of epochs for better accuracy.

  • Note: One epoch is one complete pass of the entire data set forward and backward through the neural network. The data is typically fed in batches.
  • You can find the data set, the model, and the training and prediction Python scripts in the source code linked at the end of this article.
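
The epoch note above can be sketched in plain Python. The data set size (2,000 images) and the batch size (32) are hypothetical numbers chosen for illustration; only the 5 epochs come from this post. No real training happens here, the loop just counts how many batches would be processed.

```python
def batches_per_epoch(num_samples, batch_size):
    """Number of batches needed to pass every sample through the network once."""
    return (num_samples + batch_size - 1) // batch_size


def iterate_batches(samples, batch_size):
    """Yield consecutive slices of `samples`; the last batch may be smaller."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]


NUM_SAMPLES = 2000  # hypothetical data set size
BATCH_SIZE = 32     # hypothetical batch size
EPOCHS = 5          # the number of epochs mentioned in this post

samples = list(range(NUM_SAMPLES))  # stand-ins for images
total_updates = 0
for epoch in range(EPOCHS):
    for batch in iterate_batches(samples, BATCH_SIZE):
        # A real training loop would run a forward and backward pass here.
        total_updates += 1

print(batches_per_epoch(NUM_SAMPLES, BATCH_SIZE))  # 63
print(total_updates)                               # 315
```

Raising the number of epochs increases the number of passes over the same data, which is why it can improve accuracy on a small data set, up to the point where the model starts to overfit.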

    Converting Keras model to Core ML

    We can easily convert the Keras model from above into .mlmodel using coremltools.
    In order to install coremltools, run the following command from your terminal.

    pip install coremltools

    The following python script contains the code for model conversion.

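    import coremltools
    
    # Convert the trained Keras model. image_input_names tells Core ML to treat
    # the 'image' input as an image rather than a generic multi-array.
    coreml_model = coremltools.converters.keras.convert(
        'model.h5',
        input_names=['image'],
        output_names=['output'],
        image_input_names='image')
    
    # Metadata shown in Xcode when the model is inspected.
    coreml_model.author = 'Anupam Chugh'
    coreml_model.short_description = 'Cat Dog Classifier converted from a Keras model'
    coreml_model.input_description['image'] = 'Takes as input an image'
    coreml_model.output_description['output'] = 'Prediction as cat or dog'
    
    # Write the converted model to the current working directory.
    coreml_model.save('catdogcoreml.mlmodel')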
    
    

    Run the above Python script, making sure the .h5 file is in the same folder as the script.
    Note: At the time of writing this article, coremltools doesn’t work with Python 3.

    Once our Core ML model is created, it’s time to build the iOS Application!

    Storyboard

    Code

    Drag and drop the .mlmodel file we created earlier into your Project Navigator.
    The Swift class files for the Core ML model are generated automatically.

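    Ensure that you’ve set the Privacy – Camera Usage Description field in the Info.plist file if you switch the picker’s source type to the camera. The code below picks images from the photo library.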

    The code for ViewController.swift is given below:

    import UIKit
    import CoreML
    
    enum Animal {
        case cat
        case dog
    }
    
    
    class ViewController: UIViewController, UIImagePickerControllerDelegate, UINavigationControllerDelegate {
    
        @IBOutlet weak var modelOutputLabel: UILabel!
        private let model = catdogcoreml()
        @IBOutlet weak var imageView: UIImageView!
        private let trainedImageSize = CGSize(width: 150, height: 150)
    
        override func viewDidLoad() {
            super.viewDidLoad()
            // Do any additional setup after loading the view.
        }
    
        @IBAction func takePhotoClicked(_ sender: Any) {
        
            let imagePicker = UIImagePickerController()
            imagePicker.sourceType = .photoLibrary
            imagePicker.delegate = self
            present(imagePicker, animated: true, completion: nil)
        }
        
        func predict(image: UIImage) -> Animal? {
            do {
                if let resizedImage = resize(image: image, newSize: trainedImageSize), let pixelBuffer = resizedImage.toCVPixelBuffer() {
                    let prediction = try model.prediction(image: pixelBuffer)
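                    // Assumes the model emits a single score between 0 and 1;
                    // intValue would truncate a score such as 0.9 down to 0.
                    let value = prediction.output[0].doubleValue
                    print(value)
                    
                    // Scores closer to 1 denote dog, scores closer to 0 denote cat.
                    if value >= 0.5 {
                        return .dog
                    } else {
                        return .cat
                    }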
                }
            } catch {
                print("Error while doing predictions: \(error)")
            }
    
            return nil
        }
    
        func resize(image: UIImage, newSize: CGSize) -> UIImage? {
            UIGraphicsBeginImageContextWithOptions(newSize, false, 0.0)
            image.draw(in: CGRect(x: 0, y: 0, width: newSize.width, height: newSize.height))
            let newImage = UIGraphicsGetImageFromCurrentImageContext()
            UIGraphicsEndImageContext()
            return newImage
        }
    
        func imagePickerControllerDidCancel(_ picker: UIImagePickerController) {
            dismiss(animated: true, completion: nil)
        }
    
    
        func imagePickerController(_ picker: UIImagePickerController, didFinishPickingMediaWithInfo info: [UIImagePickerController.InfoKey : Any]) {
            dismiss(animated: true) {
                if let image = info[UIImagePickerController.InfoKey.originalImage] as? UIImage {
                    let animal = self.predict(image: image)
                    
                    self.imageView.image = image
                    
    
                    if let animal = animal{
                        if animal == .dog{
                            self.modelOutputLabel.text = "Dog"
                        }
                        else if animal == .cat{
                            self.modelOutputLabel.text = "Cat"
                        }
                    }
                    else{
                        self.modelOutputLabel.text = "Neither dog nor cat."
                    }
                }
            }
        }
    }
    
    extension UIImage {
        func toCVPixelBuffer() -> CVPixelBuffer? {
            let attrs = [kCVPixelBufferCGImageCompatibilityKey: kCFBooleanTrue, kCVPixelBufferCGBitmapContextCompatibilityKey: kCFBooleanTrue] as CFDictionary
            var pixelBuffer : CVPixelBuffer?
            let status = CVPixelBufferCreate(kCFAllocatorDefault, Int(self.size.width), Int(self.size.height), kCVPixelFormatType_32ARGB, attrs, &pixelBuffer)
            guard (status == kCVReturnSuccess) else {
                return nil
            }
    
            CVPixelBufferLockBaseAddress(pixelBuffer!, CVPixelBufferLockFlags(rawValue: 0))
            let pixelData = CVPixelBufferGetBaseAddress(pixelBuffer!)
    
            let rgbColorSpace = CGColorSpaceCreateDeviceRGB()
            let context = CGContext(data: pixelData, width: Int(self.size.width), height: Int(self.size.height), bitsPerComponent: 8, bytesPerRow: CVPixelBufferGetBytesPerRow(pixelBuffer!), space: rgbColorSpace, bitmapInfo: CGImageAlphaInfo.noneSkipFirst.rawValue)
    
            context?.translateBy(x: 0, y: self.size.height)
            context?.scaleBy(x: 1.0, y: -1.0)
    
            UIGraphicsPushContext(context!)
            self.draw(in: CGRect(x: 0, y: 0, width: self.size.width, height: self.size.height))
            UIGraphicsPopContext()
            CVPixelBufferUnlockBaseAddress(pixelBuffer!, CVPixelBufferLockFlags(rawValue: 0))
    
            return pixelBuffer
        }
    }
    
    
    

    In the above code, we resize the image to the input size our model expects (150 x 150) before running the prediction.
    0 denotes cat and 1 denotes dog, because the class indices are assigned in alphabetical order of the class names.
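
The label mapping described above can be sketched in plain Python: sorting the class names alphabetically gives the indices, and a score from the model is turned into a label by thresholding. The 0.5 cutoff and the assumption that the model emits a single score between 0 and 1 are mine for illustration; they are not taken from the training script.

```python
def class_indices(class_names):
    """Assign indices to class names in alphabetical order."""
    return {name: index for index, name in enumerate(sorted(class_names))}


def label_for_score(score, indices, threshold=0.5):
    """Map a single score in [0, 1] to a class name (binary case only)."""
    predicted_index = 1 if score >= threshold else 0
    index_to_name = {index: name for name, index in indices.items()}
    return index_to_name[predicted_index]


# The order in which the names are listed does not matter; sorting decides.
indices = class_indices(["dog", "cat"])

print(label_for_score(0.93, indices))  # dog
print(label_for_score(0.08, indices))  # cat
```

If a third class were added, the indices would shift according to the new alphabetical order, and a single-score threshold would no longer be enough to pick a label.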

    The output of the above application in action is given below:

    [Image: ios-coreml-dog-cat-classifier]

    The full source of the iOS application, along with the Keras model and data sets, is available here.
