Train a Custom Object Detection Model using Mask RCNN
By Samden Lepcha in Data Science Computer Vision
July 1, 2019
A complete guide from installation and training to deploying a custom trained object detection model in a webapp.
Background
According to Wikipedia, “A pothole is a depression in a road surface, usually asphalt pavement, where traffic has removed broken pieces of the pavement”. Edmonton, the “self proclaimed pothole capital” in Alberta, Canada, reportedly spent $4.8 million on 450,000 potholes annually as of 2015. In India, approximately 1100 lives are lost every year to accidents caused by potholes (source). Ordinary citizens have no means of reporting bad roads to the concerned authorities, while the authorities remain unaware of the situation.
Therefore, several organizations have been trying to develop tools (like web apps) where citizens can report potholes to the concerned authorities. Several hackathons have included this as one of their objectives. To address this growing concern, this project aims to develop a simple interface that uses state-of-the-art object detection to detect potholes in real time and report them using Google Maps. This article will take you through the steps required to build your very own pothole detection system. The deployment target for this project is smartphones, which are used by 500 million+ people in India according to Newzoo’s 2019 Global Mobile Market Report.
Tools Used:
- Python 3.6+
- Tensorflow Object Detection API
- Pixel Annotation Tool
- Anaconda Package Manager
- Flask
The workflow of the Project will be as follows:
- Environment Setup
- Dataset Gathering
- Model Training
- Deployment with Flask
- Results
Anaconda Environment Setup
First, we will set up a new Anaconda environment and install all the packages required for this project. Anaconda is a popular Python distribution; its package manager, conda, can be used alongside “pip”. If you have not installed it yet, please install Anaconda or Miniconda before continuing.
It is a fairly straightforward installation and should not take long. You can install Miniconda if you have some experience using the command line, but if you want a GUI you can install Anaconda Navigator with all the additional packages (this will take longer to install).
After this open “Anaconda Prompt” from your start menu and follow the rest of the installation instructions:
- Create the conda environment.
(base) C:\Users>conda create --name pothole python=3.6
- Activate the environment and upgrade pip.
(base) C:\Users>activate pothole
(pothole) C:\Users>python -m pip install --upgrade pip
- Install the other necessary packages by issuing the following commands:
(pothole) C:\Users>conda install -c anaconda protobuf
(pothole) C:\Users>pip install pillow
(pothole) C:\Users>pip install lxml
(pothole) C:\Users>pip install Cython
(pothole) C:\Users>pip install contextlib2
(pothole) C:\Users>pip install jupyter
(pothole) C:\Users>pip install matplotlib
(pothole) C:\Users>pip install opencv-python
(pothole) C:\Users>pip install labelme
(pothole) C:\Users>pip install tensorflow-gpu==1.15.2
- Clone or download the TensorFlow Object Detection API repository from GitHub. For this project, we are using TensorFlow version 1.15.2. Note: always make sure the installed TensorFlow version and the version of the TensorFlow Object Detection API repository match. Run the following command or download the repository manually.
(pothole) C:\Users>git clone https://github.com/tensorflow/models.git
If you downloaded the repository manually, place its contents in a folder called “models”. You can place this “models” folder in a directory of your choice; adjust the paths in the commands below to match.
- Configure the PYTHONPATH environment variable and install the COCO api:
(pothole) C:\Users>set PYTHONPATH=C:\models;C:\models\research;C:\models\research\slim
(pothole) C:\Users>pip install git+https://github.com/philferriere/cocoapi.git#subdirectory=PythonAPI
- Compile Protobufs and run setup.py
In the Anaconda Prompt, change to the models\research directory:
(pothole) C:\Users>cd C:\models\research
Run the following lines of code:
protoc --python_out=. .\object_detection\protos\anchor_generator.proto .\object_detection\protos\argmax_matcher.proto .\object_detection\protos\bipartite_matcher.proto .\object_detection\protos\box_coder.proto .\object_detection\protos\box_predictor.proto .\object_detection\protos\eval.proto .\object_detection\protos\faster_rcnn.proto .\object_detection\protos\faster_rcnn_box_coder.proto .\object_detection\protos\grid_anchor_generator.proto .\object_detection\protos\hyperparams.proto .\object_detection\protos\image_resizer.proto .\object_detection\protos\input_reader.proto .\object_detection\protos\losses.proto .\object_detection\protos\matcher.proto .\object_detection\protos\mean_stddev_box_coder.proto .\object_detection\protos\model.proto .\object_detection\protos\optimizer.proto .\object_detection\protos\pipeline.proto .\object_detection\protos\post_processing.proto .\object_detection\protos\preprocessor.proto .\object_detection\protos\region_similarity_calculator.proto .\object_detection\protos\square_box_coder.proto .\object_detection\protos\ssd.proto .\object_detection\protos\ssd_anchor_generator.proto .\object_detection\protos\string_int_label_map.proto .\object_detection\protos\train.proto .\object_detection\protos\keypoint_box_coder.proto .\object_detection\protos\multiscale_anchor_generator.proto .\object_detection\protos\graph_rewriter.proto .\object_detection\protos\calibration.proto .\object_detection\protos\flexible_grid_anchor_generator.proto
If it reports that a protobuf file could not be found, run this afterwards:
protoc object_detection/protos/*.proto --python_out=.
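The hand-written list of .proto files above can drift out of date as the repository changes. As a rough sketch, the same command can be built from whatever .proto files are actually on disk. It assumes `protoc` is on your PATH; the function names `build_protoc_command` and `compile_protos` are made up for this illustration and are not part of the Object Detection API.

```python
import subprocess
from pathlib import Path


def build_protoc_command(research_dir):
    """Return the protoc command that compiles every .proto file in
    object_detection/protos, with paths relative to research_dir."""
    research = Path(research_dir)
    protos = sorted((research / "object_detection" / "protos").glob("*.proto"))
    # Relative paths are used because --python_out=. writes the generated
    # *_pb2.py files relative to the directory protoc is run from.
    relative = [p.relative_to(research).as_posix() for p in protos]
    return ["protoc", "--python_out=."] + relative


def compile_protos(research_dir):
    """Run protoc from inside research_dir; raises if protoc fails."""
    command = build_protoc_command(research_dir)
    subprocess.run(command, cwd=research_dir, check=True)
```

Calling `compile_protos(r"C:\models\research")` would then play the role of the long command, under the assumptions above.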
Finally, we need to run the following commands:
(pothole) C:\models\research> python setup.py build
(pothole) C:\models\research> python setup.py install
- You can test if everything is working out by running the IPython Notebook present in the object_detection folder called “object_detection_tutorial.ipynb”.
(pothole) C:\models\research>cd object_detection
(pothole) C:\models\research\object_detection>jupyter notebook object_detection_tutorial.ipynb
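Before opening the notebook, a quick way to spot a missed install step is to ask Python which modules it can find. The sketch below uses only the standard library; the `REQUIRED` list is an assumption based on the packages installed above (note that import names differ from pip names in a few cases), and `missing_modules` is a hypothetical helper name.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that Python cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names for the packages installed above. Some differ from their
# pip names: pillow -> PIL, opencv-python -> cv2, cocoapi -> pycocotools.
REQUIRED = ["tensorflow", "PIL", "lxml", "Cython", "contextlib2",
            "matplotlib", "cv2", "pycocotools", "object_detection"]
```

Running `print(missing_modules(REQUIRED))` inside the pothole environment should print an empty list if every step above succeeded. A missing `object_detection` entry usually points at the PYTHONPATH or setup.py step.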
Dataset Gathering
As always, at the beginning of any Data Science or AI Project after the problem statement has been identified we move on to gathering the data or in this case images for training.
To train a robust model we need a lot of pictures, and they need to vary. The potholes should appear at various angles and in various shapes so that the model does not latch on to one variation, in other words overfit, and fail to generalize to other images.
You can use the images that you have taken personally or download them from the Internet like me. For this project, the idea is to detect potholes so we would not be segmenting them out based on severity but that does leave something for the future scope as well. The following data sources were used for building this project:
We need to resize the images to a manageable size before training, 800 x 600 in this project (unless you have unlimited GPU compute power). Use the command prompt, Anaconda Prompt, or any IDE to run the resizing script. For example, in Anaconda Prompt:
(pothole) C:\Users> python DatasetCreation.py
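The DatasetCreation.py script itself is not reproduced in this article. As a rough stand-in, a resize step could look like the sketch below. It assumes Pillow (installed earlier) and that all images sit in one folder; the function names and the extension list are hypothetical, and this is not necessarily how the original script works.

```python
from pathlib import Path

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def list_images(folder):
    """Return the image files in `folder`, sorted by name."""
    return sorted(p for p in Path(folder).iterdir()
                  if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS)


def resize_images(src_dir, dst_dir, size=(800, 600)):
    """Resize every image in src_dir to `size` and save it under dst_dir.
    Returns the number of images written."""
    # Imported here so list_images() can be used without Pillow installed.
    from PIL import Image

    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    count = 0
    for path in list_images(src_dir):
        with Image.open(path) as img:
            # resize() ignores aspect ratio, so every output is exactly `size`.
            img.resize(size).save(dst / path.name)
        count += 1
    return count
```

Writing to a separate output folder keeps the original images untouched, which is handy if you later want to try a different size.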
Data Labeling
Now that we have gathered the dataset, we need to label the images so that the model learns what a pothole is. To label the images we need labeling software.
For the purpose of the project, I have used labelme as it is fairly simple to use. In your anaconda environment type “labelme” and the software should open up like below.
(pothole) C:\Users>labelme
Open an image from your directory, click on Create Polygon, and start labeling. Labelme saves your labels as json files with the same name as the image. Place each json file in the same directory as its image. An example of Labelme (right) along with Pixel Annotation Tool (left) is shown below. For this project I have labeled 400 images.
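With a few hundred images it is easy to skip one by accident. Since each label file shares its image's name, a small check can list images that have no matching json file. This is a minimal sketch using only the standard library; `find_unlabeled` and the extension list are assumptions for illustration.

```python
from pathlib import Path


def find_unlabeled(folder, extensions=(".jpg", ".jpeg", ".png")):
    """Return names of images in `folder` that have no same-named .json file."""
    unlabeled = []
    for path in sorted(Path(folder).iterdir()):
        is_image = path.suffix.lower() in extensions
        if is_image and not path.with_suffix(".json").exists():
            unlabeled.append(path.name)
    return unlabeled
```

An empty list means every image in the folder has a label file next to it.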
Model Training
- Create TFRecords:
After labeling our entire dataset, we now have to generate TFRecords, which serve as input for our model training. But before that we need to convert the labelme json labels into COCO format. I have used the script provided by Gilbert Tanner in his tutorial to do this. You can also find it in my Github repository as “labelme2coco.py”. Download it and place it in the directory where your train and test image folders are located. Now run the following commands:
(pothole) C:\Users\models\research\object_detection\images>python labelme2coco.py train --output train.json
(pothole) C:\Users\models\research\object_detection\images>python labelme2coco.py test --output test.json
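Before moving on, it can be worth confirming that the conversion picked up all your labels. The sketch below counts images and annotations in a COCO-style file. It assumes the output follows the standard COCO layout with top-level `images`, `annotations`, and `categories` keys; the function names are hypothetical.

```python
import json
from collections import Counter


def summarize_coco(coco):
    """Count images, annotations, and annotations per category name."""
    names = {c["id"]: c["name"] for c in coco["categories"]}
    per_category = Counter(names[a["category_id"]] for a in coco["annotations"])
    return {
        "images": len(coco["images"]),
        "annotations": len(coco["annotations"]),
        "per_category": dict(per_category),
    }


def summarize_coco_file(path):
    """Load a COCO json file from disk and summarize it."""
    with open(path) as f:
        return summarize_coco(json.load(f))
```

For example, `summarize_coco_file("train.json")` should report as many images as there are labeled files in the train folder.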
Now that the train/test data is in COCO format, we can create the TFRecords using create_coco_tf_record.py, also created by Gilbert Tanner. This script needs to be placed in and run from the object_detection folder.
python create_coco_tf_record.py --logtostderr --train_image_dir=images/train --test_image_dir=images/test --train_annotations_file=images/train.json --test_annotations_file=images/test.json --include_masks=True --output_dir=./
You should find train.record and test.record in your object_detection folder.
- Creating Label Map
The label map links class names to ID numbers. Use a text editor like Sublime Text to create a “labelmap.pbtxt” file and store it inside the object_detection/training folder. Inside the file write the following:
item {
id: 1
name: 'Pothole'
}
You can define as many items as you want depending on the classes you want to detect, but for this project we are only interested in detecting potholes.
This id must correspond to the category id in your train.json and test.json files. Notice that the label map id is one greater: the JSON below has "id": 0, but we write id: 1 in the labelmap file, because the Object Detection API reserves id 0 for the background class.
"categories": [
    {
        "supercategory": "Pothole",
        "id": 0,
        "name": "Pothole"
    }
],
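With more than one class, keeping the two id schemes in step by hand gets error-prone. As an illustration only, the label map text can be generated from the `categories` list with the one-id shift applied. The helper name `label_map_from_coco` and the `offset` parameter are assumptions for this sketch, not part of any library.

```python
def label_map_from_coco(categories, offset=1):
    """Build labelmap.pbtxt text from COCO categories, shifting ids by offset."""
    items = []
    for category in sorted(categories, key=lambda c: c["id"]):
        items.append(
            "item {\n"
            f"  id: {category['id'] + offset}\n"
            f"  name: '{category['name']}'\n"
            "}\n"
        )
    return "\n".join(items)
```

Passing the `categories` list from train.json and writing the returned string to training/labelmap.pbtxt would reproduce the label map shown earlier.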
- Creating Training Configuration File:
Now we need to create a training configuration file. The TensorFlow model zoo offers a variety of Mask RCNN models, but for this project we are going to use [mask_rcnn_inception_v2_coco](download.tensorflow.org/models/object_detection/mask_rcnn_inception_v2_coco_2018_01_28.tar.gz) because of its speed. Download it, extract it, and place the extracted folder in the object_detection folder. You can find the mask_rcnn_inception_v2_coco.config file inside the samples/configs folder. Copy this file into the object_detection/training folder. Now we have to make the following changes to this config file:
- Line 10: Change num_classes to the number of different objects you want the classifier to detect.(1 in this project’s case)
- Line 126: Change fine_tune_checkpoint to:
fine_tune_checkpoint: "<path>/models/research/object_detection/mask_rcnn_inception_v2_coco_2018_01_28/model.ckpt"
- Line 142: Change input_path to the path of the train.record file:
input_path: "<path>/models/research/object_detection/train.record"
- Line 158: Change input_path to the path of the test.record file:
input_path: "<path>/models/research/object_detection/test.record"
- Line 144 and 160: Change label_map_path to the path of the label map:
label_map_path: "<path>/models/research/object_detection/training/labelmap.pbtxt"
- Line 150: Change num_examples to the number of images in your test folder.
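A mistyped path in the config usually only surfaces once training starts. As a hedged sketch, the edited file can be scanned for the path fields listed above and each one checked against the disk. It treats the config as plain text rather than parsing it with the API's own protobuf classes, and the function names are made up for this example.

```python
import glob
import re

PATH_FIELDS = re.compile(
    r'\b(fine_tune_checkpoint|input_path|label_map_path)\s*:\s*"([^"]*)"')


def referenced_paths(config_text):
    """Return (field, path) pairs for the path fields in a pipeline config."""
    return PATH_FIELDS.findall(config_text)


def missing_paths(config_text):
    """Return the (field, path) pairs that match nothing on disk."""
    # fine_tune_checkpoint is a prefix (model.ckpt.index, model.ckpt.meta, ...),
    # so a trailing wildcard is used instead of an exact-file check.
    return [(field, path) for field, path in referenced_paths(config_text)
            if not glob.glob(glob.escape(path) + "*")]
```

Reading training/mask_rcnn_inception_v2_coco.config into a string and passing it to `missing_paths` should return an empty list once all the edits are in place.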
- Training the Model:
Run the following command to start the training of the model from the object_detection folder:
python legacy/train.py --train_dir=training --pipeline_config_path=training/mask_rcnn_inception_v2_coco.config
The training script periodically saves checkpoints in the training folder. It is a good idea to let it train until the loss stays below 0.05. The time taken will depend on how powerful your GPU is.
You can view the progress of your model by opening another Anaconda Prompt Window and changing the directory to the object_detection folder and typing the following command:
(pothole) C:\models\research\object_detection>tensorboard --logdir=training
This will create a webpage on your local machine at YourPCName:6006, which can be viewed through a web browser. The TensorBoard page provides information and graphs that show how the training is progressing.
You can stop the training by pressing Ctrl+C in the command prompt window. I recommend stopping just after a checkpoint has been written to your folder; this usually happens every 5–10 minutes, depending on your compute power. The checkpoint with the highest step number will be used to generate the frozen inference graph.
- Exporting Inference Graph
Create a folder called “inference_graph” inside the object_detection folder. Now we can create the frozen inference graph (.pb file) inside this folder. To do this, issue the following command, replacing 2194 with the step number of your latest checkpoint:
python export_inference_graph.py --input_type=image_tensor --pipeline_config_path=training/mask_rcnn_inception_v2_coco.config --trained_checkpoint_prefix=training/model.ckpt-2194 --output_directory=inference_graph
This frozen inference graph is the object detection classifier.
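The step number in `--trained_checkpoint_prefix` (2194 in the command above) differs from run to run. A small sketch for finding the newest one is shown below. It assumes checkpoints are saved as `model.ckpt-<step>.index` files in the training folder, which is how TensorFlow 1.x names them; `latest_checkpoint_prefix` is a hypothetical helper name.

```python
import re
from pathlib import Path


def latest_checkpoint_prefix(training_dir):
    """Return the model.ckpt-N prefix with the largest N, or None if absent."""
    steps = []
    for path in Path(training_dir).glob("model.ckpt-*.index"):
        match = re.fullmatch(r"model\.ckpt-(\d+)\.index", path.name)
        if match:
            # Compare as integers so that 2194 ranks above 999.
            steps.append(int(match.group(1)))
    if not steps:
        return None
    return (Path(training_dir) / f"model.ckpt-{max(steps)}").as_posix()
```

The returned string can be passed directly as the value of `--trained_checkpoint_prefix`.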
- Testing the newly trained classifier
To test the newly trained classifier, you can adapt the existing object_detection.ipynb file in my Github repo.
Change the directory location for the labelmap, inference_graph, .config file and the test_images directory based on your location. You should get the following output:
Deploying with Flask
Flask is a micro web framework written in Python developed by Armin Ronacher. We are going to use Flask to deploy our custom trained object detection model. You can find the beginner tutorial on their official documentation.
We are going to use the code from the object_detection.ipynb file in our Flask app. The app is called “app.py” and is also present in my Github repository. At the beginning of app.py we import our libraries and append the path where the Object Detection API is located to the Python path. Change this according to where you have placed the API.
The simple architecture of the Flask App can be described using the image below.
We pass the image to the custom trained Mask RCNN model, and the app then decides, based on the detection score, whether to offer the location coordinates or not. You can run “app.py” with the command below.
python app.py
After running the above command we should get the output below. Copy the URL it shows into your browser so the web application can render the HTML pages. I have done a terrible job of the UI. You can create a better interface for this project by experimenting with the HTML and CSS files. You can find all the output images below in the results section.
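The decision step described above comes down to a threshold on the detection scores. The sketch below shows one way to express it. It is not the logic from app.py, which is not reproduced here; the function name is hypothetical, and the 0.5 default is an assumption matching the 50% cutoff mentioned in the results.

```python
def pothole_detected(scores, threshold=0.5):
    """Return True if any detection score is strictly above the threshold."""
    return any(score > threshold for score in scores)
```

In a Flask view, the result of this check could be passed to the template to decide whether the location button is rendered.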
Results
This section just contains the various output images of the project.
- This is the first page after copying the URL from Anaconda Prompt into the browser of your choice.
- This is the page after selecting and uploading an image of your choice.
- This is the page after clicking on the submit button. Notice how the button below appears only when the score is above 50%.
Clicking the button shows the current position on a map. I have zoomed out the map quite a bit so as not to reveal my location, but you can get precise, zoomed-in coordinates. You could set up an architecture that maintains an online location database so that the page can display stored coordinates, but for this project we just display the current location from which the image was uploaded. So the image has to be taken and uploaded at the same spot.
Thank you for reading to the end of this article. That is it for this tutorial. I hope you liked this article and that it helps in your Data Science journey. You can find more such articles on the blog section of my website.
References
- Over 9300 deaths, 25000 injured in 3 years due to potholes — India Today
- Nienaber, S & Booysen, M.J. (Thinus) & Kroon, RS. (2015). Dataset of images used for pothole detection. 10.13140/RG.2.1.3646.1520
- How To Train an Object Detection Classifier for Multiple Objects Using TensorFlow (GPU) on Windows 10 — Github
- Custom Mask RCNN using Tensorflow Object Detection API — Medium
- Train a Mask R-CNN model with the Tensorflow Object Detection API — Gilbert Tanner
- HTML Geolocation API — w3schools