Data Science

As Data Scientists, we have been tasked with teaching a robot what a red hat (a fedora) looks like, so that it can correctly identify one and act on the detection. You will be fine-tuning an object detection model from the YOLO family (You Only Look Once) on a set of publicly available fedora images. To achieve this we will use OpenShift AI, an extension to OpenShift that provides a platform for Data Scientists to work with their favourite tools and libraries.

Instantiate Workbench

Let’s log in to OpenShift AI

  • Log in to the OpenShift Console with:

    • username:

      team-1
    • password:

      secret-password
  • Click Skip tour

    skip tour
  • Click on the top right menu with the squares and select OpenShift AI to access the OpenShift AI Dashboard

    rhoai menu
  • Click on Login with OpenShift

  • Login with the same credentials as before

This is the main Console where all Data Science related projects and assets are managed.

  • In the left menu click on Data Science Projects to see all available Data Science Projects

  • A project has already been prepared for you to work in

  • Click on the project team-1-ai

As Data Scientists, we need an interactive development environment (IDE) to do our programming. One of the most common ones (and one provided by OpenShift AI) is JupyterLab. Let’s create a new JupyterLab Workbench in our Data Science project and get started:

  • Click on Create Workbench

  • Enter these fields:

    • Name:

      fedora-detection
    • Notebook image

      • Image selection: Object Detection

    • Deployment size

      • Container size: 'Small'

    • Connections

      • Click on Attach existing connections

        • Select workbench-bucket connection

          attach existing connection
        • Click on Attach

  • Click on Create Workbench

Wait until the status of the Workbench changes from "Starting" to "Running"; this can take 2-3 minutes.

    workbench running

Create your Model in JupyterLab

  • Click on the Open link, next to the Stop link in the fedora-detection Workbench line, to open the JupyterLab Workbench

  • Log in with your OpenShift credentials and allow the selected permissions

The first thing we notice when logging into our workbench is that it’s quite empty. If this is your first time seeing JupyterLab, take a couple of minutes to look around! But don’t worry, we are experienced Data Scientists who know the importance of documentation, version handling and reproducibility. Therefore, in JupyterLab we are going to clone a complete Jupyter Notebook to get started with our object detection project:

  • In the menu on the left, click on the Git icon

  • Click on Clone a Repository

  • Enter the Git URL of the notebooks repo that has been prepared for you:

    https://gitea.apps.example.com/team-1/object-detection-notebooks.git
  • Click on Clone

    workbench clone
  • On the left side in the menu click on the Folder icon.

Next, we will fine-tune a YOLOv5 image detection model to identify fedoras by providing sample images. The whole training process will run in a Pipeline leveraging OpenShift resource scaling. Sample images will be downloaded automatically, and after the training the model will be exported in ONNX format to your S3 ODF (OpenShift Data Foundation) bucket. In this format we can deploy it to an inferencing service on OpenShift AI and serve it to our application.

Model Training

When training the model, confidence score and accuracy will be limiting factors going forward. To stay within hardware restrictions, such as having no GPUs in the data center or on the physical robot, a compromise has been made regarding the size of the base model as well as the training epochs, the batch size, the number of images to train on, and how much time is left in the hackathon for other tasks. Of course, you may change these variables; however, changes such as more epochs or a larger sample count might yield little to no improvement while taking longer to train.

This is a limitation that edge use cases face: finding the balance between accuracy and the sizing requirements for the model to run.

Therefore, have realistic expectations of the model and don’t get hung up on fine-tuning it.
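For reference, the variables mentioned above (base model size, epochs, batch size) map to parameters of the pipeline’s training step. A hypothetical invocation might look roughly like this; the concrete values live in the pipeline’s training script and may differ:

    # Hypothetical YOLOv5 training call illustrating the trade-off knobs;
    # the actual values are set in the pipeline's training script.
    import train  # train.py from a YOLOv5 repository checkout

    train.run(
        data="fedora.yaml",    # dataset definition (assumed file name)
        weights="yolov5n.pt",  # a small base model keeps CPU-only training feasible
        imgsz=640,             # input resolution
        epochs=20,             # more epochs take longer and may gain little here
        batch_size=16,         # limited by available memory, there is no GPU
    )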

If we accept this, let’s go ahead and train the model!

  • In JupyterLab navigate to the directory object-detection-notebooks/model-training.

Notice that we now have a couple of Python scripts containing code to execute the individual steps of the pipeline, a configuration.yaml file, as well as the pipeline definition itself. By clicking on the different scripts, you can view and edit them in your IDE. However, we are specifically interested in the pipeline definition, so let’s open it:

  • Double click on model-training-cpu.pipeline

Have a look at the pipeline steps:

  • Step 1: Downloading a set of sample images, with labels and coordinates for our fedora class, from the OpenImages website.

    open images
If you want to see the collection the model training pulls images from, go to the website above, click "Explore", set the Type to Detection and the Class to Fedora.
  • Step 2: Preparing the class labels and training sets

  • Step 3: Running the actual training on a YOLOv5 model

  • Step 4: Converting the model format to ONNX (see the sketch after this list)

  • Step 5: Uploading it to an ODF S3 bucket
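To give you an idea of what the conversion step does, here is a minimal sketch of exporting a trained YOLOv5 checkpoint to ONNX. The file names and the 640x640 input shape are assumptions for illustration; the pipeline’s own conversion script is what actually runs:

    # Minimal sketch of the ONNX conversion (step 4); file names and the
    # 640x640 input shape are assumptions for illustration.
    import torch

    # Load the fine-tuned checkpoint via the public YOLOv5 torch hub entry point
    model = torch.hub.load("ultralytics/yolov5", "custom", path="best.pt",
                           autoshape=False)
    model.eval()

    # Trace the model with a dummy batch and write it out in ONNX format
    dummy_input = torch.zeros(1, 3, 640, 640)
    torch.onnx.export(
        model,
        dummy_input,
        "model-latest.onnx",
        input_names=["images"],
        output_names=["output"],
        opset_version=12,
    )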

The pipeline can be configured to download and train on a specific image object class (e.g. Fedora).

  • To configure a class, open the file called configuration.yaml.

You will see that an image class is already defined ('Laptop',). Looking for a new laptop is great, but we want to find red hats today.

  • Change the names array to look like this

names: ['Fedora',]
  • Save the file by pressing Ctrl+S on Linux/Windows or Cmd+S on Mac

This image class is now mapped to the class number 0.
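If you want to double-check that mapping from inside JupyterLab, a small sketch like this (assuming the configuration.yaml layout shown above) prints each class name with its class number:

    # Sketch: show how the names in configuration.yaml map to class numbers.
    # Assumes a top-level 'names' list as edited above.
    import yaml

    with open("configuration.yaml") as f:
        config = yaml.safe_load(f)

    for index, name in enumerate(config["names"]):
        print(index, name)  # prints: 0 Fedora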

  • Now, back in the model-training-cpu.pipeline, click on the play icon on the left of the top menu

    start pipeline
  • Keep the default settings and click on OK

  • Click on OK in the Job submission to Data Science Pipelines succeeded dialog

This will submit the pipeline to OpenShift to run the training.

  • Switch to the OpenShift AI tab in your browser

  • Select your Data Science Project team-1-ai

  • Select Pipelines tab

  • Expand the model-training-cpu Pipeline by clicking on the >

  • Click on View runs

    view runs
  • Click on model-training-cpu-xxxxx in the Run column

    view runs2
  • Click on the currently running pipeline

This will show the running steps of the pipeline.

running pipeline

With the default settings, the Pipeline will run for around 15 minutes. Let’s use the time to deploy another Workbench that we can use to inspect our S3 bucket and see our model when it is ready.

Deploy S3 Browser

  • In the left menu click on Data Science Projects to see all available Data Science Projects

  • A project has already been prepared for you to work in

  • Click on the project team-1-ai

  • In your project, go to the Workbenches tab

  • Click on Create workbench and enter these values

    • Name:

      s3-browser
    • Notebook image

      • Image Selection: S3 Browser

    • Connections

      • Click on Attach existing connections

        • Select workbench-bucket connection

          attach existing connection
        • Click on Attach

  • Click on Create Workbench

Wait until the status of the Workbench changes from "Starting" to "Running"; this can take 2-3 minutes.

    workbench running s3
  • Click on the Open link, next to the Stop link in the s3-browser Workbench line, to open the S3 Browser Workbench

  • Log in with your OpenShift credentials

  • Allow the selected permissions

  • Accept the disclaimer

The browser will show you the contents of your bucket. Except for a folder called backup, it should be pretty empty at the moment.
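If you prefer code over a browser UI, the attached workbench-bucket connection also injects S3 credentials into the workbench as environment variables, so a short boto3 sketch (the variable names assume the standard OpenShift AI data connection) can list the same contents:

    # Sketch: list the bucket contents from a notebook instead of the S3 Browser.
    # Assumes the standard env vars injected by the workbench-bucket connection.
    import os
    import boto3

    s3 = boto3.client(
        "s3",
        endpoint_url=os.environ["AWS_S3_ENDPOINT"],
        aws_access_key_id=os.environ["AWS_ACCESS_KEY_ID"],
        aws_secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
    )

    response = s3.list_objects_v2(Bucket=os.environ["AWS_S3_BUCKET"])
    for obj in response.get("Contents", []):
        print(obj["Key"])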

Now is a good time to grab some coffee, or, if you are curious, to read up on the architecture and requirements of the YOLOv5 model family. YOLOv5 comes in several size variants with different compute requirements. In the pipeline start form you could actually change the model version, and while the pipeline is at the model-training step you can watch the loss functions in the logs.
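As a rough orientation, these are the YOLOv5 size variants with their approximate parameter counts (ballpark figures from the upstream YOLOv5 documentation):

    # Approximate parameter counts of the YOLOv5 size variants (ballpark figures).
    yolov5_variants = {
        "yolov5n": "~1.9M parameters",   # nano: cheapest to run, CPU-friendly
        "yolov5s": "~7.2M parameters",
        "yolov5m": "~21.2M parameters",
        "yolov5l": "~46.5M parameters",
        "yolov5x": "~86.7M parameters",  # largest: needs the most compute
    }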

Once the pipeline has run successfully (check the run), the final model, named model-latest.onnx, will be saved in your S3 bucket. Have a look in your S3 Browser: you should see a models folder containing your models.

  • Click on models and you will see:

    s3 browser

Model Serving

You now have a trained model for object recognition. To use the model we will deploy it into OpenShift AI Model Serving, which will make it available via an API.

Model Runtime

First we need to configure a model server:

  • Click on Data Science Projects in the main menu on the left and make sure you have selected your project team-1-ai again

  • Under the section Models click on Add model server

  • Model server name:

    ovms
  • Serving runtime: OpenVINO Model Server

  • Make deployed models available …: Check

  • Require token authentication: Check

    • Service account name: default-name

  • Keep the rest of the settings as is

  • Click Add

    serving runtime

Deploy Model

  • Click Go to Models next to the model server you just created

  • Click Deploy model

  • In the form enter

    • Model deployment name:

      fedora-detection-service
    • Model framework (name-version): onnx-1

    • Existing data connection: workbench-bucket

    • Path:

      models/model-latest.onnx
    • Click Deploy

Wait for the server to start. It may take a bit before the model server is able to answer requests. If you get an error in the following calls, just wait a few seconds and try again.

Model Testing

Now it’s time to finally test the model. As mentioned before, our model training was limited by several factors, so when testing the model the confidence score will vary between 0.3 and 0.9 depending on factors such as distance, lighting, and others.

  • Copy the inference endpoint URL that is published through an OpenShift Route (and save it somewhere)

    copy inference url
    copy inference url2
  • Copy the token of the endpoint

    copy token
  • Back in your JupyterLab Workbench in the object-detection-notebooks directory, open the online-scoring.ipynb notebook

  • Look for the variables prediction_url and token

    prediction_url = 'REPLACE_ME'
    token = 'REPLACE_ME'
  • Paste the inference endpoint URL and the token you copied before into the placeholders

  • Run the full notebook (the button with the two play icons in the top menu)

    run full notebook
  • Confirm to Restart the Kernel

You will see any identified classes with bounding boxes and confidence scores at the end of the notebook.
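Under the hood, the notebook’s scoring call is essentially an authenticated HTTP request to the model server. A rough sketch of the idea follows; the exact payload and response schema depend on the serving runtime, so treat this as illustrative only, the notebook handles these details for you:

    # Rough sketch of calling the deployed model's REST endpoint.
    # The exact request/response schema depends on the serving runtime.
    import requests

    prediction_url = "REPLACE_ME"  # inference endpoint copied from OpenShift AI
    token = "REPLACE_ME"           # token of the model server's service account

    payload = {
        "inputs": [{
            "name": "images",          # model input tensor name (assumed)
            "shape": [1, 3, 640, 640],
            "datatype": "FP32",
            "data": [],                # flattened, preprocessed image pixels go here
        }]
    }

    response = requests.post(
        prediction_url,
        json=payload,
        headers={"Authorization": f"Bearer {token}"},
    )
    print(response.status_code)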

You can test your model with different images in the sample-images folder. But even better, you can upload your own images. Take some pictures of a fedora on the floor with your laptop or smartphone and upload them into the sample-images folder. Note: make sure you adjust the image name in the image_path variable before running the notebook again, AND that the images you take have a square resolution. This can be set on most smartphones today. If you don’t use a square resolution, the printed bounding boxes may be painted in the wrong areas of the pictures.
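If your camera cannot produce square pictures, a small sketch like this (a hypothetical helper using Pillow, which most notebook images ship with) pads a photo onto a square canvas before you upload it:

    # Sketch: pad an arbitrary photo onto a square canvas so the printed
    # bounding boxes line up. 'my-fedora.jpg' is a placeholder file name.
    from PIL import Image

    def pad_to_square(path, out_path, fill=(114, 114, 114)):
        img = Image.open(path).convert("RGB")
        side = max(img.size)
        canvas = Image.new("RGB", (side, side), fill)
        # paste the original image centered on the square canvas
        canvas.paste(img, ((side - img.width) // 2, (side - img.height) // 2))
        canvas.save(out_path)

    pad_to_square("my-fedora.jpg", "sample-images/my-fedora-square.jpg")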

If you see multiple bounding boxes over your fedoras, you may need to filter out object detections with a lower score. By default the code filters out anything with a confidence score lower than 0.3. Search for the code conf_thres=0.3, increase the threshold by changing the value, and rerun your notebook.
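The filtering itself is straightforward. Here is a minimal sketch of the idea, assuming detections as (box, score, class) tuples rather than the notebook’s exact data structures:

    # Minimal sketch of confidence filtering; the notebook's own data
    # structures differ, but the idea behind conf_thres is the same.
    detections = [
        ((12, 40, 200, 220), 0.85, "Fedora"),  # (bounding box, score, class)
        ((15, 42, 198, 225), 0.32, "Fedora"),  # overlapping low-confidence box
    ]

    conf_thres = 0.5  # raise the threshold to drop weak duplicate boxes
    kept = [d for d in detections if d[1] >= conf_thres]
    print(kept)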

That’s it. It is finally time to hand off your amazing AI Fedora Detection service to the dev team. Make a note of the two values prediction_url and token and use them in your app in the next chapter.

Expected outcome of this chapter

After this chapter you should know:

  • how to train and test an AI model in OpenShift AI

  • how to make your model available for inferencing using the model server

If anything is unclear about these takeaways, please talk to your friendly facilitators.