Brief on YOLO
You Only Look Once (YOLO) is a state-of-the-art, real-time object detection algorithm. On a Pascal Titan X it processes images at 30 FPS and achieves a mAP of 57.9% on COCO test-dev. It is both fast and accurate, and the trade-off between speed and accuracy can easily be tuned by changing the size of the model.
For feature extraction, YOLOv3 uses a 53-layer convolutional neural network, called Darknet-53, built from successive 3 x 3 and 1 x 1 convolutional layers. For detection, 53 more layers are added on top of it, giving the full 106-layer convolutional architecture of YOLOv3.
How YOLO works
Most previous detection algorithms apply a model to an image at multiple locations and treat the high-scoring regions as detections. YOLO takes a completely different approach.
It applies a single neural network to the entire image: the network divides the image into regions and predicts bounding boxes and probabilities for each region. Since YOLO evaluates the whole image with a single network, it is faster than most other algorithms such as R-CNN and Fast R-CNN.
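To make this concrete, here is a minimal sketch (not the actual Darknet code) of how one grid of YOLO-style predictions can be decoded: each cell predicts a few boxes, and a box is kept when its objectness score times its best class probability clears a threshold.

import numpy as np

S, B, C = 13, 3, 1  # grid size, boxes per cell, number of classes (here: fire)
pred = np.random.rand(S, S, B, 5 + C)  # dummy network output: x, y, w, h, objectness, class scores
for row in range(S):
    for col in range(S):
        for b in range(B):
            x, y, w, h, obj = pred[row, col, b, :5]
            score = obj * pred[row, col, b, 5:].max()  # objectness * best class probability
            if score > 0.5:
                cx, cy = (col + x) / S, (row + y) / S  # cell offsets -> normalized image coordinates
                print(f"cell({row},{col}) box {b}: center=({cx:.2f},{cy:.2f}), score={score:.2f}")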
So, let's jump right into building our fire detection model.
Data Collection
We will start with image collection to build the dataset on which our model will be trained. Provided below are some links to datasets containing fire images and videos.
The dataset I used for training consisted of images from the above-mentioned links, along with images collected from Google and frames extracted from videos of fire mishaps. I will be releasing it soon on Kaggle. For now, you can contact me at the email provided below for the Google Drive link to the dataset along with its annotations.
Data Annotation
If you have chosen to train the model on your own custom dataset, you need to annotate the collected images, which basically means labeling the images with their class names so that they can be fed into the model for training. You can use available tools like LabelImg, Labelbox, VGG Image Annotator (VIA), or BBox-Label-Tool to annotate the images in the dataset.
Each image annotation is a text file with the same name as its image, in the following format: the class number, the coordinates of the object's box center, and the box width and height. All values are normalized between 0 and 1, i.e., they denote the position and size of the bounded object relative to the image width and height.
<object-class> <x> <y> <width> <height>
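For example, a single fire region might be annotated like this (values are illustrative): class 0, i.e., fire, with a box centered at 65% of the image width and 43% of its height, spanning 32% of the width and 27% of the height.

0 0.650 0.434 0.320 0.270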
Preprocessing
We need to create four files which shall be used by our model, namely Train.txt, Test.txt, Classes.names, and Trainer.data. Along with those we need yolov3.cfg and darknet53.conv.74, the pretrained YOLO weights file upon which we shall perform transfer learning to train our model to detect custom classes.
The train and test files should contain the addresses (use fully qualified paths to avoid errors) of all the images in the dataset, one address per line. You can use an 80:20 ratio to divide the dataset into train and test images and build Train.txt and Test.txt accordingly, as in the sketch below. Note: keep in mind that we shall be using Google Colab for training and you will need to upload all your files to Google Drive, so set the addresses accordingly.
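A minimal sketch of how the two files can be generated (the image folder path here is an assumption; point it at your own Drive directory):

import glob, random

images = glob.glob('/content/gdrive/My Drive/Fire_detection/images/*.jpg')  # assumed dataset location
random.shuffle(images)
split = int(0.8 * len(images))  # 80:20 train/test split
with open('Train.txt', 'w') as f:
    f.write('\n'.join(images[:split]))
with open('Test.txt', 'w') as f:
    f.write('\n'.join(images[split:]))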
The Classes.names file contains the names of the classes that our model shall detect. In our case the file needs to contain only one class, i.e., fire. Note that if you are using Windows, please make sure that the file is saved with a .names extension.
The Trainer.data file is a DATA file containing the following information:
classes= 1
train = address/to/directory/containing/Train.txt
valid = address/to/directory/containing/Test.txt
names = address/to/directory/containing/Classes.names
backup = backup
We need to make a few changes in the config file in order to use it with our model. You need to change the number of classes to 1 (lines 610, 696, 783) and the number of filters to 18 (lines 603, 689, 776):
(filters = (classes + 5)*3)
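If you start from the stock yolov3.cfg, where classes=80 and filters=255 occur only in and just before the three [yolo] layers, the same edits can be made with sed instead of a text editor (assuming the cfg sits in the current directory):

!sed -i 's/classes=80/classes=1/' yolov3.cfg
!sed -i 's/filters=255/filters=18/' yolov3.cfg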
Setting up the Colab Workspace and Training
We'll be using Google Colab to build and train our model, since we require GPU support for training. Using a local machine might take days to complete the training, whereas Colab can complete the same in hours.
Having created and uploaded to Google Drive all the files required to train the model, let's begin with the training.
Open a new notebook on Google Colab and connect to a runtime. Make sure that you have changed the notebook settings to allocate a GPU for training the model. Having done that, begin by mounting the drive in Colab.
from google.colab import drive
drive.mount('/content/gdrive')
Check the installed CUDA version
!/usr/local/cuda/bin/nvcc --version
Download the appropriate cuDNN files and unzip them from the Drive folder directly into the VM's CUDA folders (needs to be done only once)
!tar -xzvf gdrive/My\ Drive/Fire_detection/cuDNN/cudnn-10.1-linux-x64-v7.6.5.32.tgz -C /usr/local/
!chmod a+r /usr/local/cuda/include/cudnn.h
Check for proper installation. Can be commented out on future runs
!cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
Clone Darknet
!git clone https://github.com/kriyeng/darknet/
!ls
!git -C darknet checkout feature/google-colab
Compile Darknet
%cd /content/darknet
!sed -i 's/OPENCV=0/OPENCV=1/' Makefile
!sed -i 's/GPU=0/GPU=1/' Makefile
!sed -i 's/CUDNN=0/CUDNN=1/' Makefile
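The flags above enable OpenCV, GPU, and cuDNN support in the Makefile; the build itself, implied by this section's heading, is then just:

!make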
YOLO weights file download (optional)
#!wget https://pjreddie.com/media/files/yolov3.weights
Copy the yolov3 weights from Drive to the VM folder (optional, just for checking the Darknet installation)
!cp /content/gdrive/My\ Drive/Fire_detection/darknet/yolov3.weights /content/darknet
Here are some useful functions that you will need later on to display, download, and upload images.
def imShow(path):
  import cv2
  import matplotlib.pyplot as plt
  %matplotlib inline
  # Read the image and upscale it 3x for easier viewing
  image = cv2.imread(path)
  height, width = image.shape[:2]
  resized_image = cv2.resize(image, (3*width, 3*height), interpolation=cv2.INTER_CUBIC)
  fig = plt.gcf()
  fig.set_size_inches(18, 10)
  plt.axis("off")
  # OpenCV reads images as BGR; matplotlib expects RGB
  plt.imshow(cv2.cvtColor(resized_image, cv2.COLOR_BGR2RGB))
  plt.show()

def upload():
  from google.colab import files
  uploaded = files.upload()
  # Save each uploaded file to the VM's working directory
  for name, data in uploaded.items():
    with open(name, 'wb') as f:
      f.write(data)
      print('saved file', name)

def download(path):
  from google.colab import files
  files.download(path)
Test Darknet Installation (Optional)
!./darknet detect cfg/yolov3.cfg yolov3.weights data/person.jpg -dont_show
imShow('predictions.jpg')
If Darknet was installed successfully, you should see something like this:
Note that if you are using Windows as your default operating system, you need to convert the Train.txt, Test.txt, Classes.names, and Trainer.data files into Unix format.
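In Colab, the conversion can be done with dos2unix (the paths are placeholders; point them at your Drive copies):

!apt-get install -y dos2unix
!dos2unix path/to/Train.txt path/to/Test.txt path/to/Classes.names path/to/Trainer.data

With all files in place, training itself is launched with Darknet's detector train mode. A typical invocation against the files prepared above (paths assumed to match your own setup) looks like:

!./darknet detector train path/to/Trainer.data path/to/yolov3.cfg path/to/darknet53.conv.74 -dont_show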
View the entire code and notebook by visiting my GitHub repo, and star it if you find it helpful.
Results
Deployment
I have created a web application based on the model using Streamlit, which takes an image or video as input and processes it to detect fire. The model has been deployed on a Heroku server. Visit https://fireapp-aicoe.herokuapp.com/ to view and test the working model (kindly be patient, as it may take some time to load).
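A minimal sketch of such a Streamlit front end (not the deployed app's actual code; detect_fire is a hypothetical placeholder for the YOLOv3 inference step):

import streamlit as st
from PIL import Image

st.title('Fire Detection')
uploaded = st.file_uploader('Upload an image', type=['jpg', 'jpeg', 'png'])
if uploaded is not None:
    image = Image.open(uploaded)
    st.image(image, caption='Input image', use_column_width=True)
    # result = detect_fire(image)  # hypothetical: run YOLOv3 inference and draw detections
    # st.image(result, caption='Detections')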
Challenges faced
■ The data collection and annotation process was time-consuming.
■ Fire doesn't have any specific shape or size, and at times it even varies in color; hence achieving high accuracy in detecting it is a difficult task.
■ We first tried training on our local machines, which took a lot of time and did not give much accuracy over a small number of iterations, so we finally shifted to Colab.
■ Streamlit is a new framework and currently lacks support for many Python libraries.
Resources
Brought to you by COE-AI (CET-BBSR): an initiative by CET-BBSR, Tech Mahindra, and BPUT to provide solutions to real-world problems through ML and IoT.
Contact
email: b117020@iiit-bh.ac.in