AR+DeepLearning Implementation in Smart Glasses
The augmented reality sector has drawn considerable attention during the Covid pandemic, with most things becoming “GET FROM HOME”. The shift to work from home has pushed the adoption of AR/VR/MR technologies further.
The global augmented reality (AR) and virtual reality (VR) market is expected to grow by USD 125.19 billion during 2020–2024. AR Android applications are enabling retailers to offer customers new experiences. Virtual fitting rooms help buyers choose the right size, and for businesses they reduce purchase returns. A similar benefit applies when choosing the colour of a new car or of furniture. Here, we tried a similar experiment: combining AR with deep learning models trained for computer vision to identify the parts/components of a car.
Data Preprocessing Stage
Initially, I recorded videos of the car from various angles and in various colours, and extracted frames from them. (Each second of video contains around 24–30 frames, depending on the frame rate of the device.) These frames are annotated with the car parts, the annotated data is split into training and testing sets, and the model is then trained.
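The split step can be sketched roughly as follows. The 80/20 ratio, the fixed seed, and the file names are assumptions made for illustration; they are not taken from the original project.

```python
import random


def split_frames(frame_paths, test_fraction=0.2, seed=42):
    """Shuffle annotated frames and split them into (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = sorted(frame_paths)            # sort first so the result is reproducible
    random.Random(seed).shuffle(shuffled)     # seeded shuffle, independent of global state
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical file names: 10 seconds of video at 30 fps gives about 300 frames.
frames = [f"car_frame_{i:04d}.jpg" for i in range(300)]
train, test = split_frames(frames)
print(len(train), len(test))  # 240 60
```

Neighbouring frames of the same video are nearly identical, so when several videos are available, holding out whole videos for testing is likely to give a more honest score than a per-frame split.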
Here, I started from the COCO-trained SSD MobileNet model, trained it on my data, and generated a .pb file. The .pb file is generally large, so this model can only be deployed on devices with a powerful GPU; mobile devices cannot afford to run it. The model is therefore converted to a TensorFlow Lite model (.tflite format), which mobile devices can read. The regular model we built is 40 MB, which the device GPU cannot handle, whereas the lite model is smaller than 5 MB.
Refer to this link for converting the .pb file to .tflite:
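As a rough idea of what the conversion involves, here is a minimal sketch using the TensorFlow 1.x-style converter. It assumes the frozen graph was exported with the Object Detection API's export_tflite_ssd_graph.py script, which is what produces the tensor names used below; the 300x300 input size, the file names, and the use of default post-training quantization are also assumptions, not details from the original project.

```python
# Tensor names produced by export_tflite_ssd_graph.py (assumed export path).
INPUT_ARRAYS = ["normalized_input_image_tensor"]
OUTPUT_ARRAYS = [
    "TFLite_Detection_PostProcess",     # bounding boxes
    "TFLite_Detection_PostProcess:1",   # class indices
    "TFLite_Detection_PostProcess:2",   # scores
    "TFLite_Detection_PostProcess:3",   # number of detections
]


def convert_frozen_graph_to_tflite(pb_path, tflite_path, input_size=300, quantize=True):
    """Convert a frozen SSD graph (.pb) to .tflite and return the size in bytes."""
    import tensorflow as tf  # imported lazily so this module loads without TensorFlow

    converter = tf.compat.v1.lite.TFLiteConverter.from_frozen_graph(
        graph_def_file=pb_path,
        input_arrays=INPUT_ARRAYS,
        output_arrays=OUTPUT_ARRAYS,
        input_shapes={INPUT_ARRAYS[0]: [1, input_size, input_size, 3]},
    )
    converter.allow_custom_ops = True  # the detection post-process op is a custom op
    if quantize:
        # Post-training quantization; this is what shrinks the file noticeably.
        converter.optimizations = [tf.lite.Optimize.DEFAULT]
    tflite_model = converter.convert()
    with open(tflite_path, "wb") as f:
        f.write(tflite_model)
    return len(tflite_model)


# Example call with hypothetical file names:
# convert_frozen_graph_to_tflite("tflite_graph.pb", "detect.tflite")
```

The returned byte count makes it easy to check whether the converted file lands in the size range you need for the device.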
I am using Vuzix M400 smart glasses, which run entirely on Android OS. Deployment is therefore easier, since the glasses behave like any Android device: we create an Android application that supports TensorFlow. The application is built in Android Studio, and the generated APK can be used on any Android-based device.
Follow these steps to get the required results:
1) To create a new project that runs on TensorFlow, click File -> New -> Other -> TensorFlow Lite Model. Then add the prerequisite libraries to Gradle.
2) Add the required dependencies to the app-level build.gradle file
implementation 'org.tensorflow:tensorflow-lite:2.3.0'
androidTestImplementation 'androidx.test.ext:junit:1.1.1'
androidTestImplementation 'com.google.truth:truth:1.0.1'
androidTestImplementation 'androidx.test:rules:1.1.0'
3) Declare the camera permission and camera features in the Android manifest file
<uses-permission android:name="android.permission.CAMERA" />
<uses-feature android:name="android.hardware.camera" />
<uses-feature android:name="android.hardware.camera.autofocus" />
4) Place the .tflite model in the assets folder of the project, along with the labels.txt file. The labels.txt file lists the classes the model was trained on.
The model and labels loaded earlier are accessed from code, and, using Android's basic graphics classes, we fetch bitmap images of the camera preview from the video stream.
Using the file TFLiteObjectDetectionAPIModel, the generated bitmaps are pre-processed and then sent for recognition. The bitmaps are run through TensorFlow Lite object detection, which detects the objects. Then, using the multi-box tracker, bounding boxes are drawn on top of the detected results.
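To make the recognition step more concrete, here is a small sketch of what happens to the raw model outputs before boxes are drawn. It is written in Python purely to illustrate the logic and is not the code of TFLiteObjectDetectionAPIModel. It assumes the usual SSD output layout (normalized [ymin, xmin, ymax, xmax] boxes, class indices, and scores); the 0.5 score threshold, the function names, and the label names are hypothetical.

```python
def load_labels(text):
    """Parse the contents of labels.txt: one class name per non-empty line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def to_recognitions(boxes, classes, scores, labels, frame_w, frame_h, min_score=0.5):
    """Keep confident detections and scale their boxes to pixel coordinates."""
    results = []
    for (ymin, xmin, ymax, xmax), cls, score in zip(boxes, classes, scores):
        idx = int(cls)
        if score < min_score or not 0 <= idx < len(labels):
            continue  # drop weak detections and unknown class indices
        results.append({
            "label": labels[idx],
            "score": score,
            # (left, top, right, bottom) in pixels of the camera frame
            "box": (round(xmin * frame_w), round(ymin * frame_h),
                    round(xmax * frame_w), round(ymax * frame_h)),
        })
    return results


# Hypothetical labels and model outputs for a 640x480 frame.
labels = load_labels("alloy_wheel\nnormal_wheel\nheadlamp\n")
detections = to_recognitions(
    boxes=[[0.5, 0.25, 0.75, 0.5], [0.1, 0.1, 0.2, 0.2]],
    classes=[0, 2],
    scores=[0.9, 0.3],
    labels=labels,
    frame_w=640,
    frame_h=480,
)
print(detections)  # only the alloy_wheel detection passes the threshold
```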
As I was implementing this for an automobile company, the general use case was to identify the range of the vehicle, i.e. whether that particular vehicle is the base variant or the top-end variant, and to perform the detection based on it. If we detect a wrong part, say a normal wheel instead of an alloy wheel, the bounding box is drawn with a cross mark, and if we detect the part expected for the vehicle range, the bounding box is drawn with a tick mark.
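The tick-or-wrong-sign decision can be sketched as a simple lookup. The variant names, part categories, and label names below are hypothetical placeholders; the real mapping depends on the vehicle range and on the classes in labels.txt.

```python
# Hypothetical mapping: which label is expected for each part category per variant.
EXPECTED_PARTS = {
    "base": {"wheel": "normal_wheel"},
    "top_end": {"wheel": "alloy_wheel"},
}

# Hypothetical mapping from a detected label to its part category.
PART_CATEGORY = {
    "normal_wheel": "wheel",
    "alloy_wheel": "wheel",
}


def marker_for(detected_label, variant):
    """Return 'tick', 'cross', or None when the label is not variant-specific."""
    category = PART_CATEGORY.get(detected_label)
    if category is None:
        return None  # part is not checked against the variant
    expected = EXPECTED_PARTS[variant].get(category)
    return "tick" if detected_label == expected else "cross"


print(marker_for("alloy_wheel", "top_end"))   # tick
print(marker_for("normal_wheel", "top_end"))  # cross
```

Keeping the expectations in plain dictionaries means a new variant or part can be added without touching the drawing code.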
Refer to the implementation video:
Credits to Khush Patel for his article, which helped me build my deep learning model:
https://towardsdatascience.com/custom-object-detection-using-tensorflow-from-scratch-e61da2e10087
Credits to Mohanraj V for his support and guidance in helping me build the model file.