Introduction

This article describes how to use Microsoft Azure’s Cognitive Services Face API and python to identify, count and classify people in a picture. In addition, it will show how to use the service to compare two face images and tell if they are the same person. We will try it out with several celebrity look-alikes to see if the algorithm can tell the difference between two similar Hollywood actors. By the end of the article, you should be able to use these examples to further explore Azure’s Cognitive Services with python and incorporate them in your own projects.

What are Cognitive Services?

The basic idea behind Azure’s Cognitive Services is that Microsoft has done a lot of the heavy lifting to build and deploy AI models for specific tasks. There is no need to understand what technology is used behind the scenes because the Cognitive Services APIs provide a relatively simple way to use this already trained AI framework for your own problems. All that is required is setting up an account and using the REST API to process your data. Since I have not done much work with python’s native vision libraries, I thought I would explore using the Face API to get a sense for what types of tasks it might be suited for.

At a high level, we can use the Face API to determine many elements of a person’s face in a picture, including:

  • Number of faces and where they are in the picture
  • Traits of the faces, such as whether or not the person is wearing glasses, has makeup or facial hair
  • The emotion each face conveys (such as anger, contempt, disgust, fear, happiness, neutral, sadness or surprise)
  • The identity of individuals, and whether two different pictures are of the same person

In other words, there is a lot of power in this API and it can be easily accessed with python.

Setting up your account

In order to get started, you do need to have an active Azure account and enable Cognitive Services for the account.

If you do not already have one, create an Azure account or log in to your existing one. This is a paid service but new users can get a free trial. In addition, your company or educational institution might already be using Azure so be sure to check what options are available.

Once your Azure account is active, create a Cognitive Services account following the steps in the Microsoft documentation.

Once you are done, you need two key pieces of information:

  • the API endpoint
  • your key

The API endpoint will be based on the location you choose. For me, the endpoint is: https://northcentralus.api.cognitive.microsoft.com/ and keys will look something like this: 9a1111e22294eb1bb9999a4a66e07b41 (not my actual key)

Here is where to find it in the Azure portal:

Azure Credentials

Now that everything is set up with Azure, we can try to run a quick test to see if it works.

Testing the process

The Cognitive Services documentation is really good, so much of this article is based off the examples in the Python API quickstart.

Before going too much further, I want to cover one topic about determining how to access these services. Microsoft has exposed these services through a REST API which can be used by pretty much any language. They have also created a python SDK which provides a handy wrapper around the REST API and also includes some convenience functions for dealing with images and handling errors more gracefully. My recommendation is to experiment with the REST API to understand how the process works. If you do build production code, you should evaluate using the SDK because of the convenience and more robust error handling.

I have created a streamlined notebook that you can download and follow along with. The step by step directions below are meant to augment the notebook.

Fire up your own jupyter notebook and get the following imports in place:

from pathlib import Path
from urllib.parse import urlparse
import requests
import json
from PIL import Image
from io import BytesIO
from matplotlib import patches
import matplotlib.pyplot as plt

%matplotlib inline

You don’t strictly need all of these imports but I am going to make some helper functions to make it easier to display and work with the images. That’s the main reason I’m including all the extra imports.

Next, make sure to assign your API key and appropriate endpoint API url. You must use your own key and endpoint. These values will not work if you just copy and paste:

subscription_key = '9a1111e22294eb1bb9999a4a66e07b41'
face_api_url = 'https://northcentralus.api.cognitive.microsoft.com/face/v1.0/detect'
face_api_url_verify = 'https://northcentralus.api.cognitive.microsoft.com/face/v1.0/verify'

One point to note about the url: the endpoint is https://northcentralus.api.cognitive.microsoft.com/ but the actual url needs to include the API information, in this case /face/v1.0/detect

I am also defining the verify url endpoint which we will use a little bit later.
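If you would rather not paste the key into the notebook, a minimal sketch like the one below builds both urls from the base endpoint and reads the credentials from environment variables. The variable names AZURE_FACE_ENDPOINT and AZURE_FACE_KEY and the build_face_url() helper are my own assumptions for illustration, not something Azure defines.

```python
import os


def build_face_url(endpoint, operation, version="v1.0"):
    """Join a regional endpoint with a Face API operation such as 'detect'."""
    return f"{endpoint.rstrip('/')}/face/{version}/{operation}"


# Hypothetical environment variable names; use whatever your own setup defines.
endpoint = os.environ.get(
    "AZURE_FACE_ENDPOINT",
    "https://northcentralus.api.cognitive.microsoft.com/")
subscription_key = os.environ.get("AZURE_FACE_KEY", "")

face_api_url = build_face_url(endpoint, "detect")
face_api_url_verify = build_face_url(endpoint, "verify")

print(face_api_url)
print(face_api_url_verify)
```

Keeping the key out of the notebook also makes it safer to share the notebook with others.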

Now that everything is set up, we can use the requests module to post some information to our endpoint and see what the API responds with:

image_url = 'https://amp.insider.com/images/5a9878b3267894f3058b4676-640-480.jpg'
headers = {'Ocp-Apim-Subscription-Key': subscription_key}

params = {
    'returnFaceId': 'true',
    'returnFaceLandmarks': 'false',
    'returnFaceAttributes': 'age,gender,headPose,smile,facialHair,glasses,emotion,hair,makeup,occlusion,accessories,blur,exposure,noise',
}

response = requests.post(face_api_url,
                        params=params,
                        headers=headers,
                        json={"url": image_url})

The key function of this code is to pass:

  • a valid url of an image
  • our credentials (key + endpoint)
  • parameters to control the output

In return, we get a nested json response back. If we call response.json() we get something that looks like this:

[{'faceId': '6e750a8f-9a55-4b03-a9ce-b79d5cb93740',
'faceRectangle': {'top': 99, 'left': 410, 'width': 125, 'height': 125},
'faceAttributes': {'smile': 0.012,
'headPose': {'pitch': -5.1, 'roll': 3.4, 'yaw': -3.5},
'gender': 'male',
'age': 30.0,
'facialHair': {'moustache': 0.1, 'beard': 0.1, 'sideburns': 0.1},
'glasses': 'NoGlasses',
'emotion': {'anger': 0.0,
    'contempt': 0.075,
    'disgust': 0.0,
    'fear': 0.0,
    'happiness': 0.012,
    'neutral': 0.913,
    'sadness': 0.0,
    'surprise': 0.0},
'blur': {'blurLevel': 'medium', 'value': 0.58},
'exposure': {'exposureLevel': 'goodExposure', 'value': 0.7},
'noise': {'noiseLevel': 'medium', 'value': 0.48},
'makeup': {'eyeMakeup': True, 'lipMakeup': False},
'accessories': [],
'occlusion': {'foreheadOccluded': False,
    'eyeOccluded': False,
    'mouthOccluded': False},
'hair': {'bald': 0.02,
    'invisible': False,
    'hairColor': [{'color': 'brown', 'confidence': 1.0},
    {'color': 'red', 'confidence': 0.59},
    {'color': 'blond', 'confidence': 0.27},
    {'color': 'black', 'confidence': 0.17},
    {'color': 'gray', 'confidence': 0.05},
    {'color': 'other', 'confidence': 0.01}]}}},
{'faceId': '9bdb3a49-1c79-459c-ba11-79ac12517739',
'faceRectangle': {'top': 179, 'left': 105, 'width': 112, 'height': 112},
'faceAttributes': {'smile': 0.823,
'headPose': {'pitch': -5.8, 'roll': 0.2, 'yaw': -3.2},
'gender': 'female',
'age': 32.0,
'facialHair': {'moustache': 0.0, 'beard': 0.0, 'sideburns': 0.0},
'glasses': 'NoGlasses',
'emotion': {'anger': 0.0,
    'contempt': 0.0,
    'disgust': 0.0,
    'fear': 0.0,
    'happiness': 0.823,
    'neutral': 0.176,
    'sadness': 0.0,
    'surprise': 0.0},
'blur': {'blurLevel': 'medium', 'value': 0.34},
'exposure': {'exposureLevel': 'goodExposure', 'value': 0.63},
'noise': {'noiseLevel': 'low', 'value': 0.1},
'makeup': {'eyeMakeup': True, 'lipMakeup': True},
'accessories': [],
'occlusion': {'foreheadOccluded': False,
    'eyeOccluded': False,
    'mouthOccluded': False},
'hair': {'bald': 0.01,
    'invisible': False,
    'hairColor': [{'color': 'brown', 'confidence': 1.0},
    {'color': 'blond', 'confidence': 0.66},
    {'color': 'red', 'confidence': 0.61},
    {'color': 'black', 'confidence': 0.09},
    {'color': 'gray', 'confidence': 0.07},
    {'color': 'other', 'confidence': 0.01}]}}}]

In this case, the image contained two people so there are two faceID attributes.

The faceIDs are important because they are uniquely generated, tied only to our account and stored for 24 hours. We can use this ID to determine if two faces are equivalent. A little later in this article, I will show an example.
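The list shown above is what a successful call returns. If the key or endpoint is wrong, the service answers with a json object instead of a list, so it can be worth checking before using the result. The sketch below assumes the error payload is a dictionary with an 'error' entry holding 'code' and 'message'; that shape and the extract_faces() helper name are assumptions for illustration, so check them against what your own failed calls return.

```python
def extract_faces(payload):
    """Return the list of faces, or raise if the payload looks like an error."""
    # Assumed error shape: {'error': {'code': ..., 'message': ...}}
    if isinstance(payload, dict) and "error" in payload:
        err = payload["error"]
        raise RuntimeError(
            f"Face API error {err.get('code')}: {err.get('message')}")
    if not isinstance(payload, list):
        raise ValueError(f"Unexpected response type: {type(payload).__name__}")
    return payload


# In the notebook this would be: faces = extract_faces(response.json())
faces = extract_faces([{"faceId": "6e750a8f-9a55-4b03-a9ce-b79d5cb93740"}])
print(len(faces))
```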

If you want to know the number of people detected in the image, look at the length of the result:

print(len(response.json()))

In addition, you can see that the analysis thinks there is 1 male aged 30 and 1 female aged 32. The male has a “neutral” emotion and the female has a “happiness” emotion. Interestingly, the algorithm “thinks” there is eye makeup on both faces.
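Reading those values out of the nested json by eye gets tedious, so here is a small sketch that pulls out the same facts for each face. The summarize_faces() name is my own, and the sample data is a trimmed copy of the response shown above.

```python
def summarize_faces(faces):
    """Reduce each face in a detect response to a few headline attributes."""
    summary = []
    for face in faces:
        attrs = face["faceAttributes"]
        emotions = attrs["emotion"]
        # The dominant emotion is the one with the highest score
        top_emotion = max(emotions, key=emotions.get)
        summary.append({
            "face_id": face["faceId"],
            "gender": attrs["gender"],
            "age": attrs["age"],
            "emotion": top_emotion,
            "emotion_score": emotions[top_emotion],
            "eye_makeup": attrs["makeup"]["eyeMakeup"],
        })
    return summary


# Trimmed copy of the two faces in the response shown above
sample_faces = [
    {"faceId": "6e750a8f-9a55-4b03-a9ce-b79d5cb93740",
     "faceAttributes": {
         "gender": "male", "age": 30.0,
         "emotion": {"anger": 0.0, "contempt": 0.075, "disgust": 0.0,
                     "fear": 0.0, "happiness": 0.012, "neutral": 0.913,
                     "sadness": 0.0, "surprise": 0.0},
         "makeup": {"eyeMakeup": True, "lipMakeup": False}}},
    {"faceId": "9bdb3a49-1c79-459c-ba11-79ac12517739",
     "faceAttributes": {
         "gender": "female", "age": 32.0,
         "emotion": {"anger": 0.0, "contempt": 0.0, "disgust": 0.0,
                     "fear": 0.0, "happiness": 0.823, "neutral": 0.176,
                     "sadness": 0.0, "surprise": 0.0},
         "makeup": {"eyeMakeup": True, "lipMakeup": True}}},
]

for row in summarize_faces(sample_faces):
    print(row)
```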

This is all very interesting but there are two challenges. First, it would be nice to see an image marked up with the faces. Second, it would be nice to run this on local images as well as remote urls.

Fortunately the demo jupyter notebook gives us a really good head start. I am going to leverage that code to build an improved image display function that will:

  • Work on local files or remote urls
  • Return the json data
  • Give us the option to display a portion of the faceID on the image to make it easier for future analysis

In order to get this code to work on a local file, we need to change our function call in two ways. First, the header must have a content type of 'application/octet-stream'. Second, we must pass the image_data via the data parameter.

Here is what the call will look like for a sample image on the local computer:

headers = {'Ocp-Apim-Subscription-Key': subscription_key,
           'Content-Type': 'application/octet-stream'}

# Read the raw bytes and close the file when done
with open('Sample_local_image.jpg', 'rb') as image_file:
    image_data = image_file.read()
response = requests.post(face_api_url, headers=headers, params=params, data=image_data)

In order to streamline this process and annotate images, I’ve created an updated annotate_image() function that can parse a local file or pass a remote URL, then show where the algorithm thinks the faces are.

Here is the full function:

def annotate_image(image_url, subscription_key, api_url, show_face_id=False):
    """ Helper function for Microsoft Azure face detector.

    Args:
        image_url: Can be a remote http:// url or a local file:// url pointing to an image less than 10MB
        subscription_key: Cognitive services generated key
        api_url: API endpoint from Cognitive services
        show_face_id: If True, display the first 5 characters of the faceID

    Returns:
        figure: matplotlib figure that contains the image and boxes around the faces with their age and gender
        json response: Full json data returned from the API call

    """

    # The default header must include the subscription key
    headers = {'Ocp-Apim-Subscription-Key': subscription_key}

    params = {
        'returnFaceId': 'true',
        'returnFaceLandmarks': 'false',
        'returnFaceAttributes': 'age,gender,headPose,smile,facialHair,glasses,emotion,hair,makeup,occlusion,accessories,blur,exposure,noise',
    }

    # Figure out if this is a local file or url
    parsed_url = urlparse(image_url)
    if parsed_url.scheme == 'file':
        image_data = open(parsed_url.path, "rb").read()

        # When making the request, we need to add a Content-Type Header
        # and pass data instead of a url
        headers['Content-Type']='application/octet-stream'
        response = requests.post(api_url, params=params, headers=headers, data=image_data)

        # Open up the image for plotting
        image = Image.open(parsed_url.path)
    else:
        # Pass in the URL to the API
        response = requests.post(api_url, params=params, headers=headers, json={"url": image_url})
        image_file = BytesIO(requests.get(image_url).content)
        image = Image.open(image_file)

    faces = response.json()

    fig, ax = plt.subplots(figsize=(10,10))

    ax.imshow(image, alpha=0.6)
    for face in faces:
        fr = face["faceRectangle"]
        fa = face["faceAttributes"]
        origin = (fr["left"], fr["top"])
        p = patches.Rectangle(origin, fr["width"],
                            fr["height"], fill=False, linewidth=2, color='b')
        ax.axes.add_patch(p)
        ax.text(origin[0], origin[1], "%s, %d"%(fa["gender"].capitalize(), fa["age"]),
                fontsize=16, weight="bold", va="bottom")

        if show_face_id:
            ax.text(origin[0], origin[1]+fr["height"], "%s"%(face["faceId"][:5]),
            fontsize=12, va="bottom")
    ax.axis("off")

    # Explicitly closing image so it does not show in the notebook
    plt.close()
    return fig, faces

Here’s how it works:

labeled_image, response_1 = annotate_image(
    'https://amp.insider.com/images/5a9878b3267894f3058b4676-640-480.jpg',
    subscription_key,
    face_api_url,
    show_face_id=True)

labeled_image
Pam and Jim

If you want to call on a local file, use a file url that looks like this:

labeled_image, response_data = annotate_image(
    "file:///home/chris/Pictures/P1120573.JPG", subscription_key,
    face_api_url)
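Rather than typing the file url by hand, you can let pathlib build it. This is a small sketch using the Path import from the top of the article; the to_file_url() and file_url_to_path() helper names are my own. One thing to be aware of is that as_uri() percent-encodes characters such as spaces, while annotate_image() opens parsed_url.path as-is, so a path containing spaces would need to be decoded first. The decoding helper assumes a Linux or macOS style path.

```python
from pathlib import Path
from urllib.parse import unquote, urlparse


def to_file_url(path):
    """Convert a local path to an absolute file:// url."""
    return Path(path).expanduser().resolve().as_uri()


def file_url_to_path(file_url):
    """Recover the local path from a file:// url, undoing percent-encoding."""
    return unquote(urlparse(file_url).path)


local_url = to_file_url("~/Pictures/P1120573.JPG")
print(local_url)
print(file_url_to_path(local_url))
```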

Going back to the Pam and Jim example, you can view the json response like this:

print(response_1[0]['faceId'], response_1[0]['faceAttributes']['emotion'])
6e750a8f-9a55-4b03-a9ce-b79d5cb93740 {'anger': 0.0, 'contempt': 0.075, 'disgust': 0.0, 'fear': 0.0, 'happiness': 0.012, 'neutral': 0.913, 'sadness': 0.0, 'surprise': 0.0}

You’ll notice that the prefix for the faceId is shown in the image, which makes the entire analysis process a little bit easier when developing your own solution.
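With more than a couple of faces in an image, it helps to look up a face by the short prefix drawn on the picture instead of guessing its position in the list. Here is a minimal sketch; the find_face_by_prefix() name is my own and the sample data reuses the two faceIDs from the response shown earlier.

```python
def find_face_by_prefix(faces, prefix):
    """Return the single face whose faceId starts with the given prefix."""
    matches = [face for face in faces if face["faceId"].startswith(prefix)]
    if len(matches) != 1:
        raise LookupError(
            f"Expected exactly one face for prefix {prefix!r}, "
            f"found {len(matches)}")
    return matches[0]


# The two faceIDs from the Pam and Jim response shown earlier
sample_faces = [
    {"faceId": "6e750a8f-9a55-4b03-a9ce-b79d5cb93740"},
    {"faceId": "9bdb3a49-1c79-459c-ba11-79ac12517739"},
]

face = find_face_by_prefix(sample_faces, "6e750")
print(face["faceId"])
```

In the notebook you would pass response_1 instead of the sample list.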

Celebrity Look-Alikes

In addition to showing the actual face information, we can use the Verify Face API to check if two faces are of the same person. This should work regardless of age, facial hair, makeup, glasses or other superficial changes. In my opinion, this shows the significant advances that have been made in image processing over the past few years. We now have the power to quickly and easily analyze images with a simple API call. Pretty impressive.

In order to simplify the process, I created a small function to take two faceIDs and see if they are the same:

def face_compare(id_1, id_2, api_url):
    """ Determine if two faceIDs are for the same person
    Args:
        id_1: faceID for person 1
        id_2: faceID for person 2
        api_url: API endpoint from Cognitive services

    Note: the key is read from the subscription_key variable defined earlier in the notebook

    Returns:
        json response: Full json data returned from the API call

 """
    headers = {
        'Content-Type': 'application/json',
        'Ocp-Apim-Subscription-Key': subscription_key
    }

    body = {"faceId1": id_1, "faceId2": id_2}

    params = {}
    response = requests.post(api_url,
                            params=params,
                            headers=headers,
                            json=body)
    return response.json()

Since we have a picture of a young Jim, let’s see if it’s the same Jim (aka John Krasinski) with a beard. We can annotate this new image and inspect the json results to get the faceID of the second image:

john_k_2 = 'https://img.webmd.com/dtmcms/live/webmd/consumer_assets/site_images/article_thumbnails/magazine/2018/05_2018/john_krasinski_magazine/650x350_john_krasinski_magazine.jpg'
labeled_image, response_2 = annotate_image(john_k_2,
                                           subscription_key,
                                           face_api_url,
                                           show_face_id=True)
Jim with beard

Now we can compare the two faceIDs to see if they are truly the same person:

face_compare(response_2[0]['faceId'], response_1[0]['faceId'], face_api_url_verify)
{'isIdentical': True, 'confidence': 0.63733}

Very cool. The API identified that this was the same person with a 63.7% confidence.

We can have a little fun with this and use this to see if the computer can tell two people apart that look very similar. For instance, can we tell Zooey Deschanel apart from Katy Perry?

zooey_katy = 'https://www.nydailynews.com/resizer/vboKUbzNIwhFRFfr-jGqZlmx0Ws=/800x597/top/arc-anglerfish-arc2-prod-tronc.s3.amazonaws.com/public/VE7PI5PUDWW2BTS7NYR5OWEL3A.jpg'
labeled_image_z_k, response_3 = annotate_image(
                                zooey_katy, subscription_key, face_api_url)
Zooey and Katy

They are very similar. Let’s see what Cognitive Services thinks:

face_compare(response_3[0]['faceId'], response_3[1]['faceId'],
            face_api_url_verify)
{'isIdentical': False, 'confidence': 0.09186}

Ok. They look alike, but according to the algorithm they are not the same person.

Let’s try one more that is even more difficult. Rob Lowe and Ian Somerhalder are another pair that frequently shows up on celebrity look-alike lists.

rob_lowe = 'http://cdn.ppcorn.com/wp-content/uploads/sites/14/2015/08/rob-ian-ppcorn-760x500.jpg'
labeled_image_rob, response_4 = annotate_image(rob_lowe, subscription_key,
                                               face_api_url)
Rob Lowe
face_compare(response_4[0]['faceId'], response_4[1]['faceId'],
                face_api_url_verify)
{'isIdentical': True, 'confidence': 0.50762}

Woah! I guess Rob Lowe and Ian Somerhalder even confuse the AI!
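The confidence for this pair (0.50762) is noticeably lower than for the real Jim match (0.63733), so in your own project you may want to apply a stricter cutoff than the isIdentical flag alone. The sketch below does that with the three results from this section. The is_confident_match() helper and the 0.6 cutoff are assumptions picked for illustration, not values recommended by Microsoft.

```python
def is_confident_match(verify_result, min_confidence=0.6):
    """Treat two faces as the same person only above our own cutoff."""
    return (bool(verify_result.get("isIdentical"))
            and verify_result.get("confidence", 0.0) >= min_confidence)


# The three verify results shown in this section
results = {
    "Jim vs. bearded Jim": {"isIdentical": True, "confidence": 0.63733},
    "Zooey vs. Katy": {"isIdentical": False, "confidence": 0.09186},
    "Rob vs. Ian": {"isIdentical": True, "confidence": 0.50762},
}

for label, result in results.items():
    print(label, is_confident_match(result))
```

With this cutoff only the two pictures of John Krasinski count as a match. The right value depends on how costly a false match is for your use case.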

Limitations

In my limited testing, the algorithm works pretty well. The processing works best when the faces are looking directly at the camera and there is good lighting and contrast. In addition, the files must be less than 10MB in size and the maximum number of faces it can identify is 100.
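For local files, a quick size check before uploading can save a wasted API call. This is a minimal sketch; the check_image_size() name is my own, and the default limit simply reuses the 10MB figure quoted above, so confirm the current limit in the Microsoft documentation before relying on it.

```python
from pathlib import Path

# Limit quoted in this article; an assumption to verify against current docs
MAX_IMAGE_BYTES = 10 * 1024 * 1024


def check_image_size(path, max_bytes=MAX_IMAGE_BYTES):
    """Return the file size in bytes, or raise if it is too large to send."""
    size = Path(path).stat().st_size
    if size >= max_bytes:
        raise ValueError(
            f"{path} is {size} bytes, which is not below the "
            f"{max_bytes} byte limit")
    return size


# In the notebook: check_image_size('Sample_local_image.jpg')
```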

Here’s a group example:

friends_url = 'https://pmctvline2.files.wordpress.com/2019/03/friends-revival.jpg'
labeled_image, response_5 = annotate_image(friends_url, subscription_key,
                                        face_api_url)
print(f'{len(response_5)} People in this picture')
6 People in this picture
Friends image

This works pretty well.

However, this attempt only found two faces:

Office Images

There are additional detection models available which might perform better in this scenario. If you are interested in pursuing this further, I would recommend comparing their results on images like this one.
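As far as I can tell from the Microsoft documentation, the detect call accepts a detectionModel query parameter with values such as detection_01 and detection_02, and the newer models may not return the full set of face attributes. Treat the parameter name, the values and that restriction as assumptions to verify. The build_detect_params() helper below is my own sketch of how the params dictionary could be assembled for such an experiment.

```python
def build_detect_params(detection_model=None, attributes=None):
    """Assemble query parameters for the detect call."""
    params = {
        'returnFaceId': 'true',
        'returnFaceLandmarks': 'false',
    }
    if attributes:
        params['returnFaceAttributes'] = ','.join(attributes)
    if detection_model:
        # Assumed parameter name and values; confirm in the API reference
        params['detectionModel'] = detection_model
    return params


# Default model with a couple of attributes, as used earlier in the article
print(build_detect_params(attributes=['age', 'gender']))

# Alternate model, requesting only face rectangles and IDs
print(build_detect_params(detection_model='detection_02'))
```

You could then pass the result as params to requests.post() and compare how many faces each model finds in the same image.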

Despite these types of challenges, it is very impressive how far the computer vision field has come and how much capability is made available through these solutions.

Summary

Despite the somewhat click bait headline, I do think this is a really useful capability. We have gotten used to Google and Facebook being able to identify people in pictures, so this is a feature we need to understand more. While there are security and privacy concerns with this technology, I think there are still valid use cases where it can be very beneficial in a business context.

The Cognitive Services API provides additional features that I did not have time to cover in the article but this should give you a good start for future analysis. In addition, the capabilities are continually being refined so it is worth keeping an eye on it and seeing how these services change over time.

This article was a bit of a departure from my standard articles but I will admit it was a really fun topic to explore. Please comment below if you find this helpful and are interested in other similar topics.