A Lightweight Wayfinding Assistance System for IoT Applications

Abstract: In this paper, we propose an indoor sign detection system for industry 4.0. To implement it, we design a lightweight deep learning architecture based on MobileNet that can run on embedded devices and that detects and recognizes indoor landmark signs in order to assist blind and sighted persons during indoor navigation. We apply various operations to minimize the network size as well as the computational complexity. The internet of things (IoT) connects the internet with surrounding objects. It is characterized by linking physical objects with their digital identities and enabling them to communicate with each other, which creates a kind of bridge between the physical world and the virtual world. The paper provides a comprehensive overview of a new method for recognizing a set of indoor landmark signs based on a deep convolutional neural network (DCNN) for internet of things applications.


Introduction
New digital information is created every split second. To cope with the rapid growth of visual information, new applications are needed to analyze it. Indoor object and sign detection is a very challenging problem in various computer vision applications, such as internet of things (IoT) [1] and industry 4.0 [2] applications. Digital transformation supports new aspects of daily life and facilitates daily activities. According to the World Health Organization (WHO) [3], the number of elderly persons is increasing from year to year. Also, according to the latest WHO statistics, 285 million persons around the world suffer from visual impairments: 246 million have moderate to severe impairments and 39 million are totally blind. The use of IoT has become widely popular in the last few years and has expanded into new fields. In particular, it is attracting growing attention from researchers in artificial intelligence, who combine it with new smart engineering applications.
Indoor sign detection and recognition serve as a basic component for various vital capabilities. The task is to decide whether an indoor sign is present in the image. Smart sensors such as cameras are the principal devices for indoor object detection and for other applications such as object detection [4,5], indoor scene recognition [6], object classification [7,8] and object segmentation [9]. The internet of things for object detection can incorporate all forms of artificial intelligence in order to make life easier and better for people. IoT is a trending component in which various classical and new approaches can be included. Indoor sign detection is a very challenging computer vision task: a detector should be able to recognize and locate objects whatever their color, shape, texture and viewpoint.
Nowadays, computer vision is an emerging research field that touches various applications. Detecting and recognizing objects in input images and video sequences is essential. There is an increasing need for new indoor object detection applications that help blind and sighted persons explore their indoor surroundings and navigate indoors more safely.
In industry 4.0 environments, new applications are introduced to better understand the surrounding world. The main contribution of industry 4.0 is to fully automate computer vision applications and to make them more effective at meeting human requirements. Industry 4.0 aims to integrate a new set of technical solutions by combining smart machines and systems into better application processes. The aim of using the innovative technologies of industry 4.0 is to showcase the newest technological advances and to implement new assistive technologies that help blind and sighted persons explore their surroundings.
Developing new applications for industry 4.0 requires fast and reliable communication in order to ensure real-time processing. Nowadays, we live at the dawn of an industrial revolution driven by the internet, wireless sensors, artificial intelligence, the internet of things (IoT) and cloud computing. The effect of developing this type of application is usually referred to as industry 4.0. New technologies that combine object localization and recognition with wireless mobile technologies can be used by blind and sighted persons to make their lives easier. The internet of things is also called the internet of everything or the industrial internet. It is a part of ubiquitous computing. Various applications have adopted this technique, such as smart homes [10]. Indoor signage detection and recognition is an essential challenge in computer vision, as it requires matching the shape, color, illumination and orientation of the desired object in the image. Object detection is currently being implemented in industry 4.0 technologies. A signage detection system can be defined by the following process: feature extraction, localization and classification. Feature extraction is the fundamental part of pattern recognition and object detection. Its primary objective is to extract the most important features from the input data in order to better detect and localize the objects present in images and videos. Industry 4.0 is a major target for academia and industry alike. It is the fourth industrial revolution, which is based on digitization and smart manufacturing, and it is able to connect physical-world objects with the industrial internet of things (IoT).
Building new assistive systems that ensure autonomy for robots and for blind or sighted persons is an important factor for industry 4.0. Indoor sign detection systems allow sighted persons or robots to find their place in the indoor environment based on the information provided by the camera, which captures the object pose, color, texture and distance from the camera. This gives the user an idea of the surrounding objects, which can then be applied to more complex tasks.
Nowadays, a huge amount of data must be communicated in order to provide the user with useful information that can help during indoor navigation. However, because manufacturing techniques evolve quickly, it is necessary to develop new object detection systems that rely on company servers, the cloud or workstations.
In order to build a robust indoor wayfinding assistance system for industry 4.0, we have to provide the infrastructure needed to carry inputs and outputs between devices over the internet. Streaming the images captured by the mobile phone camera requires a reliable internet service in order to ensure good communication between the user's mobile phone and the cloud or the workstation. New artificial intelligence techniques have become an outstanding trend in the industry 4.0 sector.
Our aim in this work is to develop a new assistive system for blind and impaired persons to be applied to industry 4.0 (shown in Figure 1). We propose an industry-ready approach for indoor sign detection. We trained and tested the proposed indoor sign classification system using our own dataset. The system designed for the industry 4.0 domain will be built on the camera of the user's mobile phone, an internet connection and the cloud, which will process the captured images and send the resulting information back to the user.

Related work
For blind and sighted persons, one of the most challenging tasks to perform independently is indoor object detection for assisted indoor navigation. It is not simple to precisely detect the specific indoor objects that matter most for such navigation.
The internet of things (IoT) acts as a basic technique for building new assistance systems that contribute to smart cities [11], navigation assistance [12] and autonomous vehicles [13]. The industry 4.0 vision describes factory systems in which various components are connected to each other. Nowadays, the internet of things is one of the most widely used technologies. It allows the physical world to be connected to the virtual world. Gartner researchers [14] estimated that about 20 billion IoT devices would be connected by the end of 2020. To process such huge amounts of data, traditional systems transmit the data to cloud platforms. However, processing data in cloud platforms has three major limitations: processing latency, transmission bandwidth and power consumption. Building new assistive systems requires high transmission bandwidth. To tackle these cloud computing problems, edge and fog computing have been introduced.
Nowadays, everything is becoming smart with the help of artificial intelligence and IoT techniques. In [15], the authors proposed edge-based street object detection. This system can be included in smart city systems. It detects 14 objects with an average detection precision of 25 on an NVIDIA Jetson TX2.
Real-time visual object tracking is a very promising prospect in various computer vision applications, such as autonomous vehicles, robotic vision and smart cities. When building an object tracking system for IoT edge systems, it is very important to consider the energy efficiency of the algorithm. In this way, the resulting system can make full use of the computation resources of the IoT edge. In [16], the authors proposed a review of sensor localization in IoT infrastructure. Object location information is a very important task for Wireless Sensor Networks (WSNs). The internet of things and edge computing are major processes used for data communication, data transfer and data collection. This technique can be widely used for smart city applications. In [17], the authors proposed an object tracking system using fog computing. This system is lightweight, as it targets limited computation capacity and memory storage.
Mouna Afif et al.
In the last few years, many well-known companies such as IBM, Samsung and Amazon have converged on the strategies of IoT paradigms, hardware modules and services [18]. In [19], the authors proposed an interactive web-based IoT management system. The proposed system can automatically recognize devices in an input video stream. With this system, the user can choose a device by touching it in the video. IoT devices offer little memory and limited storage. In [20], the authors proposed a new framework for the detection and tracking of moving objects in an edge computing architecture. The developed system was built using the YOLO architecture. Its tracking performance is around 96%, which is a very interesting result.
The remainder of this paper is organized as follows: Section 3 provides an overview of the proposed architecture for indoor sign recognition, Section 4 outlines the conducted experiments and results, and Section 5 concludes the paper. The next section introduces our proposed indoor signage recognition system, which aims to classify the sign present in the input image or video. The problem that we address is one of the most common problems in computer vision and has a large variety of practical applications. The proposed indoor sign recognition system is based on the mobile phone camera and a cloud; in this way, the user can interact with objects or "things" in a simpler manner.

Proposed Approach for Indoor Sign Recognition
The internet of things makes it possible to extend the internet to the physical objects of everyday life. We present in this section a new design of an internet of things-based deep learning system for indoor wayfinding assistance. This technique is based on a network of connected devices that exchange data in order to take part in a broad set of applications. The proposed approach takes as input an image from the camera of the user's mobile phone, which is sent to the cloud over the internet infrastructure. The cloud is in charge of applying the proposed deep convolutional neural network architecture in order to recognize and classify the indoor sign image. After that, the cloud sends the identified sign back to the user's mobile phone over the internet connection. We focused our efforts on indoor sign recognition because it plays an important role in indoor wayfinding assistance for a large category of persons. The proposed indoor signage detection system (shown in Figure 2) can help users recognize signs and explore their surrounding environments. It enables users to find their desired destinations by recognizing a set of indoor landmark signs. Indoor wayfinding is the task of determining the user's position and guiding the user through indoor routes to reach the desired destination. Indoor navigation assistance has many applications and is an important problem in computer vision. The proposed indoor wayfinding assistance system is developed using the lightweight convolutional neural networks MobileNet v1 [21] and v2 [22]. It is able to identify and classify a set of four indoor landmark signs (wc, exit, disabled exit and confidence zone). Thanks to breakthroughs in hardware components, applications of deep learning-based techniques in IoT are developing rapidly.
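The cloud-side step of the pipeline described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`softmax`, `classify_sign`, `cloud_handler`), the order of the class labels and the shape of the reply message are hypothetical and are not taken from the paper; `network` stands for any callable that returns four raw scores for an image.

```python
import math

# The four sign classes named in the paper; their order here is an assumption.
SIGN_CLASSES = ("wc", "exit", "disabled exit", "confidence zone")


def softmax(scores):
    """Turn raw network scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_sign(scores):
    """Return the most probable sign label and its probability."""
    if len(scores) != len(SIGN_CLASSES):
        raise ValueError("expected one score per sign class")
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return SIGN_CLASSES[best], probs[best]


def cloud_handler(image, network):
    """Hypothetical cloud-side handler: classify the image sent by the phone
    and build the reply that is sent back to the user."""
    label, confidence = classify_sign(network(image))
    return {"sign": label, "confidence": confidence}
```

On the phone side, the reply would typically be turned into a spoken or displayed message; that part is outside the scope of this sketch.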
To develop our proposed indoor sign recognition system, we used MobileNet v1 and v2 in order to ensure a mobile and lightweight implementation of the indoor wayfinding assistance system. Generally, the most computationally complex part of a neural network architecture is the convolution layers. The main idea behind the MobileNet family is the use of depthwise separable convolution blocks instead of classic convolutions. A depthwise separable convolution is a combination of a depthwise convolution and a 1x1 convolution, called a pointwise convolution. It performs approximately the same operation as a traditional convolution but much faster. The architecture of MobileNet v1 consists of a regular convolution layer with a 3x3 filter kernel followed by 13 depthwise separable convolution blocks, as presented in Figure 3. In the depthwise separable convolution, the convolution layers are followed by batch normalization. The activation layers used are ReLU6. At the end of the MobileNet v1 architecture, there is an average pooling layer, a fully connected layer and finally a softmax layer. In order to obtain more precision for the proposed indoor wayfinding assistance system, we used MobileNet v2 in our second set of experiments. MobileNet v2 is the revised version of the MobileNet v1 architecture. MobileNet v2 introduces a new powerful block named the bottleneck residual block. This block (shown in Figure 4) also introduces a residual connection that connects the input to the output in order to reduce the computational complexity.
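To make the saving from depthwise separable convolutions concrete, the following sketch counts multiply-accumulate operations for one layer computed both ways. The layer dimensions are chosen for illustration only and are not taken from the paper; the function names are hypothetical. The cost expressions follow the standard analysis of depthwise separable convolutions, under which the cost ratio is 1/N + 1/K^2 for N output channels and a KxK kernel.

```python
def standard_conv_macs(kernel, in_ch, out_ch, out_size):
    """Multiply-accumulates of a standard KxK convolution producing an
    out_size x out_size feature map."""
    return kernel * kernel * in_ch * out_ch * out_size * out_size


def depthwise_separable_macs(kernel, in_ch, out_ch, out_size):
    """Multiply-accumulates of a depthwise KxK convolution followed by
    a 1x1 pointwise convolution."""
    depthwise = kernel * kernel * in_ch * out_size * out_size
    pointwise = in_ch * out_ch * out_size * out_size
    return depthwise + pointwise


# Illustrative layer: 3x3 kernel, 512 -> 512 channels, 14x14 output map.
kernel, in_ch, out_ch, out_size = 3, 512, 512, 14
standard = standard_conv_macs(kernel, in_ch, out_ch, out_size)
separable = depthwise_separable_macs(kernel, in_ch, out_ch, out_size)
ratio = separable / standard

print(f"standard:  {standard:,} MACs")
print(f"separable: {separable:,} MACs")
print(f"ratio:     {ratio:.4f}")
```

For a 3x3 kernel the ratio is dominated by the 1/9 term, so the separable form needs roughly eight to nine times fewer operations than the standard convolution when the number of output channels is large.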

Experiments and Results
We trained and tested the proposed indoor sign detection system on our collected indoor sign dataset, which provides 800 indoor sign images (wc, exit, disabled exit and confidence zone), with 200 images for each class. The dataset covers various conditions such as occlusion, different lighting conditions and different viewpoints of the objects.
For the experiments, we divided the dataset into two main parts: a training part and a test part. We believe that the four classes are highly relevant and can considerably improve the quality of life of blind and visually impaired persons, allowing them to participate fully in daily life. The proposed work aims to help these persons reach their desired destinations by providing them with an efficient indoor wayfinding system. To develop the proposed work, we used Python and the TensorFlow deep learning framework.
We used two architectures to develop the proposed system, MobileNet v1 and v2. In order to build a robust system, we made sure to use images with challenging conditions during the training process. The experiments were conducted on an HP workstation with a Quadro M4000 GPU with 8 GB of graphics memory. In addition, to further verify the robustness of the proposed indoor wayfinding assistance system, we evaluated the proposed work using two versions of the neural network.
Deep learning-based applications are developing quickly and achieve good detection accuracy, especially in object recognition. Based on this fact, we used a well-known technique named "transfer learning" [23] in order to transfer knowledge from one task to a second one. It is extremely simple for the human visual system to recognize surrounding objects, but it is difficult and challenging for computers or impaired persons to recognize these objects with the same level of perception. As training a deep neural network from scratch is very time consuming and requires a huge amount of memory, we used transfer learning to reduce the training time and the computational complexity.
Table 1 provides the experimental settings used for the two implementations in our experiments. We note that the images in our dataset were isolated from their surrounding environments in order to be suitable for classification and recognition tasks. We address in this paper the problem of indoor navigation assistance for blind and sighted persons, in order to ensure safer navigation by detecting objects and obstacles. When training the two neural networks, we made sure to use various challenging images in order to make the developed system more robust.
In the first experiment, we trained and tested our proposed detection system using MobileNet v1. The following table presents the per-class sign recognition rates. We note that we obtained very good results for indoor sign detection. We obtained an identification precision of 99.7% for the exit, disabled exit and confidence zone classes, and 99.8% for the wc class. This gives an average precision of 99.72% over the four indoor sign classes.
In the second experiment, we trained and tested the indoor sign detection system using the second version of the MobileNet family, MobileNet v2. Table 3 reports the recognition rates obtained when using MobileNet v2 as the neural network. As the table reports, we obtained very encouraging results in terms of recognition rate for all the sign classes. We obtained an average recognition rate of 98.82% over the four classes.
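A per-class (stratified) split is one straightforward way to divide a balanced dataset such as this one into training and test parts. The sketch below is an assumption-laden illustration: the paper does not state the split ratio, so the 80/20 fraction, the random seed, the file names and the function name `stratified_split` are all hypothetical.

```python
import random

# The four sign classes named in the paper.
SIGN_CLASSES = ("wc", "exit", "disabled exit", "confidence zone")


def stratified_split(items_by_class, train_fraction=0.8, seed=0):
    """Split each class separately so both parts keep the class balance.

    train_fraction=0.8 is an assumed value, not taken from the paper.
    """
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, test = [], []
    for label, items in items_by_class.items():
        shuffled = list(items)
        rng.shuffle(shuffled)
        cut = int(round(len(shuffled) * train_fraction))
        train += [(item, label) for item in shuffled[:cut]]
        test += [(item, label) for item in shuffled[cut:]]
    return train, test


# Placeholder file names standing in for the 200 images per class.
dataset = {
    label: [f"{label.replace(' ', '_')}_{i:03d}.jpg" for i in range(200)]
    for label in SIGN_CLASSES
}
train, test = stratified_split(dataset)
print(len(train), "training images,", len(test), "test images")
```

Splitting per class rather than over the pooled 800 images guarantees that every sign class appears in the test part in the same proportion as in the training part.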
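Since the experiments rely on TensorFlow, transfer learning with a MobileNet backbone could look roughly like the sketch below. It is not the authors' training code: the frozen backbone, the 224x224 input size, the Adam optimizer, the loss function and the name `build_sign_classifier` are assumptions made for illustration, and only the four-class output comes from the paper.

```python
import tensorflow as tf

NUM_SIGN_CLASSES = 4  # wc, exit, disabled exit, confidence zone


def build_sign_classifier(weights="imagenet", input_shape=(224, 224, 3)):
    """Reuse a pretrained MobileNet v2 backbone and train only a new head.

    Freezing the whole backbone is an assumed choice; the paper does not
    say which layers were retrained.
    """
    base = tf.keras.applications.MobileNetV2(
        input_shape=input_shape,
        include_top=False,  # drop the original 1000-class classifier
        weights=weights,
        pooling="avg",      # global average pooling after the last block
    )
    base.trainable = False  # keep the transferred features fixed

    inputs = tf.keras.Input(shape=input_shape)
    features = base(inputs, training=False)
    outputs = tf.keras.layers.Dense(
        NUM_SIGN_CLASSES, activation="softmax")(features)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Input images would need to be scaled as the backbone expects, for example with `tf.keras.applications.mobilenet_v2.preprocess_input`. With the backbone frozen, only the final dense layer is trained, which is what keeps the training time and memory requirements low compared with training from scratch.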
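The average over the four MobileNet v1 per-class rates quoted above can be checked with a short calculation. The dictionary name is hypothetical; decimal arithmetic is used to avoid binary floating-point rounding.

```python
from decimal import Decimal

# Per-class rates for MobileNet v1 as quoted in the text above (percent).
per_class_rate = {
    "wc": Decimal("99.8"),
    "exit": Decimal("99.7"),
    "disabled exit": Decimal("99.7"),
    "confidence zone": Decimal("99.7"),
}

mean_rate = sum(per_class_rate.values()) / len(per_class_rate)
print(f"mean over four classes: {mean_rate}%")
```

The unweighted mean is 99.725%, which the text reports as 99.72%. Because each class has the same number of images, the unweighted mean equals the overall rate only if the test part is also balanced across classes.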

Conclusions
In this paper, we propose a computer vision method based on deep learning algorithms that addresses indoor signage detection and indoor wayfinding assistance for blind and visually impaired persons. The designed system can be applied in the industry 4.0 and internet of things (IoT) areas. The experiments were conducted using two lightweight convolutional neural networks, MobileNet v1 and v2. We obtained very encouraging results in terms of recognition rate. The system can be integrated into the internet of things in order to contribute to a new industry 4.0 application that helps blind and sighted persons participate more fully in daily life.
As future work, we plan to evaluate the proposed internet of things-based indoor sign recognition system, in order to help blind and sighted persons improve their quality of life by exploring and better understanding their surrounding environments.