Artificial intelligence (AI) is a field of computer science concerned with building computer systems that mimic human intelligence. Machine learning (ML) is a subset of AI in which a machine learns automatically from past data in order to make predictions. ML uses algorithmic models that are trained on past data. For example, an ML model can detect whether a person is wearing a mask if we first train it by showing it pictures of people wearing masks. With sufficient data, we can train ML models for a variety of tasks, such as gender classification, weapon detection and facial recognition, all of which play an integral part in video surveillance.
How does it work?
To understand how machine learning works, let us take the example of an ATM where we want to identify a person who enters the premises wearing a helmet.
Data collection: The very first step in machine learning is to collect the data we are going to use to train our model. This means we need to find hundreds or even thousands of images of people wearing helmets, along with images of people without helmets so that the model can learn the difference. It is important that the data we collect is of good quality, because this determines our model’s accuracy down the line. The more pictures of people in helmets we can collect, the better it is for our model. Pictures taken from different angles and distances help the model learn better and identify helmets more accurately.
Data preparation: This step consists of any cleaning the data may require, for example removing duplicate images and converting all images into the same format, such as JPEG. Then we need to randomize the data, which removes the effect of the order in which it was collected. Finally, we need to split the data into a training set and an evaluation set. The training set is the set of images we use to train the model, and the evaluation set is the set of images we use to check how well the model performs.
Training the model: Here our goal is to make the right predictions as often as possible using our algorithm. Two of the most widely used software libraries for this are TensorFlow and Keras. Both are open source and have simplified the way we train models: what used to take over 1000 lines of hand-written code can now be done in about 5 lines. During training we run many iterations, aiming to improve the accuracy of the model with each one.
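The preparation steps described here (removing duplicates, randomizing, splitting) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name prepare_dataset, the 20% evaluation fraction and the tiny in-memory "images" are hypothetical, and converting files to a common format such as JPEG is left out.

```python
import hashlib
import random

def prepare_dataset(images, eval_fraction=0.2, seed=42):
    """Deduplicate, shuffle and split (name, bytes) pairs into two lists."""
    seen = set()
    unique = []
    for name, data in images:
        digest = hashlib.sha256(data).hexdigest()  # identical bytes give identical digests
        if digest not in seen:
            seen.add(digest)
            unique.append((name, data))

    rng = random.Random(seed)  # fixed seed keeps the shuffle reproducible
    rng.shuffle(unique)        # removes the effect of collection order

    n_eval = max(1, int(len(unique) * eval_fraction))
    return unique[n_eval:], unique[:n_eval]  # (training set, evaluation set)

# Hypothetical stand-ins for image files; real data would be read from disk.
images = [(f"img_{i}.jpg", bytes([i])) for i in range(10)]
images.append(("img_0_copy.jpg", bytes([0])))  # exact duplicate of img_0.jpg

train_set, eval_set = prepare_dataset(images)
print(len(train_set), len(eval_set))  # prints: 8 2
```

Hashing the file contents only catches exact duplicates; near-duplicates (the same scene re-encoded or resized) would need a different technique.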
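To make the idea of improving over iterations concrete, here is a minimal training loop written in plain Python. It is not TensorFlow or Keras code and not a real helmet detector: it trains a small logistic regression model by gradient descent on two made-up numeric features per image. The feature values, learning rate and iteration count are all assumptions chosen for illustration.

```python
import math

def predict(weights, bias, features):
    """Probability (0 to 1) that the sample shows a helmet."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def accuracy(weights, bias, samples):
    correct = sum((predict(weights, bias, x) >= 0.5) == bool(y) for x, y in samples)
    return correct / len(samples)

def train(samples, iterations=200, learning_rate=0.5):
    """Batch gradient descent; records the accuracy after every iteration."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    history = []
    n = len(samples)
    for _ in range(iterations):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in samples:
            error = predict(weights, bias, x) - y
            for i, xi in enumerate(x):
                grad_w[i] += error * xi
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
        history.append(accuracy(weights, bias, samples))
    return weights, bias, history

# Hypothetical features per image; label 1 = helmet, 0 = no helmet.
samples = [
    ((0.9, 0.8), 1), ((0.8, 0.9), 1), ((0.7, 0.7), 1), ((0.95, 0.6), 1),
    ((0.1, 0.2), 0), ((0.2, 0.1), 0), ((0.3, 0.3), 0), ((0.15, 0.4), 0),
]

weights, bias, history = train(samples)
print("accuracy after first iteration:", history[0])
print("accuracy after last iteration:", history[-1])
```

Libraries such as Keras hide this loop behind a single training call, which is where the saving in lines of code comes from.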
Evaluating the model: Here we measure the performance of the model by testing it against previously unseen images. This gives us an idea of how the model is going to perform in the real world.
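As a rough sketch of what this measurement can look like, the snippet below compares a model's predictions on unseen images with the true labels. The function name evaluate, the report fields and the sample labels are hypothetical; a real evaluation would use the evaluation set prepared earlier.

```python
def evaluate(predicted, actual):
    """Compare predicted labels with true labels (True = helmet present)."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    pairs = list(zip(predicted, actual))
    correct = sum(p == a for p, a in pairs)
    false_alarms = sum(p and not a for p, a in pairs)  # helmet reported, none present
    missed = sum(a and not p for p, a in pairs)        # helmet present, not reported
    return {
        "accuracy": correct / len(pairs),
        "false_alarms": false_alarms,
        "missed": missed,
    }

# Made-up labels for eight unseen images.
actual    = [True, True, True,  False, False, False, True, False]
predicted = [True, True, False, False, True,  False, True, False]

report = evaluate(predicted, actual)
print(report)  # accuracy 0.75, one false alarm, one missed helmet
```

Separating false alarms from missed detections matters in surveillance, because the two kinds of error usually carry very different costs.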
If the model’s accuracy is acceptable, it can then be deployed for commercial use. Even after deployment, the model can continue to improve as it is retrained on more and more data.
AI & ML Applications in Video Surveillance
Over the last few years, more and more companies have started investing heavily to incorporate ML and AI technologies into their everyday operations. Investment trends also highlight the importance and potential of these fields. According to brookings.edu [A1], India saw a 361% growth in investment in AI and ML between 2015 and 2019. AI is booming not only in India but across the globe, and is expected to grow rapidly over the next few years.
Recent advances in data gathering, along with the increased efficiency of ML algorithms, have led to a surge of interest in ML across domains. ML is making a positive impact across industries like healthcare, agriculture and transportation. But one sector where AI is causing a major transformation is security and surveillance. It is not only proving useful, but redefining the way we look at security and surveillance. With ML in video analytics, we can monitor things like footfall, suspicious movement, gender and SOP (standard operating procedure) adherence in real time. This means we no longer need people to monitor screens all day long, and even complex tasks can be automated. The savings in both manpower and money are significant. For example, over 300 ATM sites can be monitored effectively by a team of only 5 people.
Some of the key functionalities video analytics provides in video surveillance are briefly explained below:
Object recognition: A wide variety of objects can be detected using ML algorithms. Depending on the needs of a company, the algorithms can be tuned to identify different objects. For example, in a retail store we can identify unwanted objects lying around to assess cleanliness, whereas at an ATM we can use object recognition to identify weapons and trigger an alarm as a security measure.
Motion detection: Motion detection is one of the most basic and widely used features of video analytics. Be it a warehouse, a restaurant or a manufacturing unit, we can set up motion detection as a night-guarding measure to flag movement in an area at a time when there should not be any movement at all.
People counting: Before AI-backed analytics, this was considered a tough task. It had to be done manually and was often inaccurate, as continuously monitoring people entering and exiting an area is tedious. With the help of analytics, footfall is easily measured. Restaurants and retail outlets can use it to gain key insights into their popularity. Retail can go one step further: not only footfall but also a customer’s entire journey within a store can be mapped to gain more insights.
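One simple way to detect motion is frame differencing: compare each frame with the previous one and raise a flag when enough pixels have changed. The sketch below assumes grayscale frames stored as lists of rows with pixel values from 0 to 255; the function name and both thresholds are hypothetical, and production systems typically use more robust background models.

```python
def motion_detected(previous, current, pixel_threshold=25, changed_fraction=0.05):
    """Return True if enough pixels differ between two grayscale frames."""
    total = 0
    changed = 0
    for prev_row, cur_row in zip(previous, current):
        for prev_px, cur_px in zip(prev_row, cur_row):
            total += 1
            if abs(cur_px - prev_px) > pixel_threshold:
                changed += 1
    return total > 0 and changed / total >= changed_fraction

# An 8x8 frame of an empty, dark room.
empty = [[10] * 8 for _ in range(8)]

# The same room with slight sensor noise on every pixel.
noisy = [[12] * 8 for _ in range(8)]

# The same room with a bright 4x2 region, standing in for an intruder.
intruder = [row[:] for row in empty]
for r in range(2, 6):
    for c in range(3, 5):
        intruder[r][c] = 200

print(motion_detected(empty, noisy))     # False: changes stay below the pixel threshold
print(motion_detected(empty, intruder))  # True: 8 of 64 pixels changed
```

The two thresholds trade sensitivity against false alarms, so they would need tuning for each camera and scene.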
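A common way to count footfall is to draw a virtual line across the entrance and count how often a tracked person crosses it in each direction. The sketch below assumes that an upstream detector and tracker already provide, for each person, a sequence of vertical positions over time; the track data, the line position and the function name are hypothetical.

```python
def count_crossings(tracks, line_y):
    """Count entries and exits across a horizontal line.

    tracks maps a person ID to that person's y positions in time order.
    Moving from above the line to on or below it counts as an entry;
    the reverse movement counts as an exit.
    """
    entries = 0
    exits = 0
    for positions in tracks.values():
        for before, after in zip(positions, positions[1:]):
            if before < line_y <= after:
                entries += 1
            elif after < line_y <= before:
                exits += 1
    return entries, exits

tracks = {
    "person_1": [10, 40, 70, 110],   # walks in across the line
    "person_2": [150, 120, 90, 60],  # walks out across the line
    "person_3": [20, 50, 80],        # approaches but never crosses
}

entries, exits = count_crossings(tracks, line_y=100)
print(entries, exits)  # prints: 1 1
```

The same track data, kept in full rather than reduced to crossings, is what makes it possible to map a customer's journey through a store.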
AI & ML are also an important part of sensor-based surveillance. Sensors that detect smoke, body heat, vibrations or toxic gases in the air monitor parameters beyond the scope of video surveillance.
Smoke detection: Smoke and fire can be detected by video analytics in both indoor and outdoor environments. Traditional smoke sensors based on thermal or chemical detection can take several minutes to react and need a large amount of smoke or fire to be triggered. Moreover, they cannot provide insights like the location and size of a fire, whereas video analytics can. On detecting smoke or flames, these systems can be set up to send real-time alerts to the escalation contacts and, after a specified period of time, trigger the fire safety alarms.
Thermal detection: Thermal cameras improve the accuracy of motion detection by eliminating much of the visual detail and focussing only on the colours that represent different temperatures. When video analytics is tuned for use with thermal cameras, alerts are sent only if there is motion with colour signatures within a specific temperature range. This filters out motion from trees in the wind, birds and other irrelevant sources.
Vibration detection: Cameras combined with a set of vibration sensors can be used to safeguard any site, be it an ATM, a bank vault or a warehouse holding high-value items. Any unwanted activity, such as someone trying to physically break into an ATM, triggers the vibration sensor, and the camera’s recording of that moment can be used to verify the threat.
Toxic emission detection: Sensors backed by AI can detect and monitor the levels of different gases and issue a warning whenever needed. For example, smart smoke detectors in restaurants can continuously monitor the levels of gases like carbon monoxide and signal the restaurant manager to start the exhausts if the level of any of these gases crosses a threshold.
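The two-stage response described here, an immediate alert followed by an alarm if smoke persists, can be sketched as a small piece of logic that runs over the detector's output. The smoke detection itself is assumed to happen elsewhere; the function name, the action labels and the 30-second delay are hypothetical.

```python
def smoke_response(detections, alarm_after_seconds=30):
    """Turn per-frame smoke detections into alert and alarm actions.

    detections is a time-ordered list of (timestamp_seconds, smoke_seen).
    An alert is sent when smoke first appears; the alarm is triggered once
    smoke has been seen continuously for alarm_after_seconds.
    """
    actions = []
    smoke_since = None
    alarm_raised = False
    for timestamp, smoke_seen in detections:
        if not smoke_seen:
            smoke_since = None      # smoke cleared: reset the episode
            alarm_raised = False
            continue
        if smoke_since is None:
            smoke_since = timestamp
            actions.append((timestamp, "alert"))
        elif not alarm_raised and timestamp - smoke_since >= alarm_after_seconds:
            alarm_raised = True
            actions.append((timestamp, "alarm"))
    return actions

detections = [(0, False), (10, True), (20, True), (30, True), (40, True), (50, True)]
print(smoke_response(detections))  # [(10, 'alert'), (40, 'alarm')]
```

The delay gives the people receiving the alert a window to verify the footage before the alarm sounds.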
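The temperature-range filter can be illustrated with a small sketch in which each thermal frame is a grid of temperatures in degrees Celsius. A pixel only counts towards an alert if it changed between frames and its new temperature lies inside the range of interest. The range of 30 to 40 degrees, the other thresholds and the function name are assumptions made for this example.

```python
def thermal_motion_alert(previous, current, temp_range=(30.0, 40.0),
                         min_change=2.0, min_pixels=3):
    """Alert only on motion whose temperature falls inside temp_range."""
    low, high = temp_range
    hits = 0
    for prev_row, cur_row in zip(previous, current):
        for prev_t, cur_t in zip(prev_row, cur_row):
            moved = abs(cur_t - prev_t) >= min_change
            if moved and low <= cur_t <= high:
                hits += 1
    return hits >= min_pixels

# A 5x5 scene at a uniform 15 degrees.
background = [[15.0] * 5 for _ in range(5)]

# Branches swaying: pixels change, but stay far below the range of interest.
tree = [row[:] for row in background]
for c in range(4):
    tree[0][c] = 18.0

# A person entering: a 2x2 patch at roughly body-surface temperature.
person = [row[:] for row in background]
for r in (2, 3):
    for c in (1, 2):
        person[r][c] = 36.0

print(thermal_motion_alert(background, tree))    # False
print(thermal_motion_alert(background, person))  # True
```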
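The threshold check at the heart of this can be sketched as follows. Averaging the last few readings keeps a single noisy spike from triggering the exhausts. The function name, the window of three readings and the 35 ppm threshold are purely illustrative assumptions and are not safety guidance.

```python
def exhaust_needed(readings_ppm, threshold_ppm=35.0, window=3):
    """Return True if the average of the latest readings exceeds the threshold."""
    if len(readings_ppm) < window:
        return False  # not enough data yet to decide
    recent = readings_ppm[-window:]
    return sum(recent) / window > threshold_ppm

print(exhaust_needed([5.0, 6.0, 8.0]))     # False: normal levels
print(exhaust_needed([5.0, 5.0, 90.0]))    # False: one spike, average still below 35
print(exhaust_needed([20.0, 40.0, 50.0]))  # True: sustained rise above 35
```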
Conclusion
We can see that AI and ML technologies are already here, and are here to stay. Video analytics gives businesses of all kinds the opportunity to leverage these technologies to improve their current operations. These technologies are already becoming a differentiating factor between companies, giving those that use them a significant competitive advantage by providing better results at a much lower cost. At this point, every executive should be looking at how to adopt these technologies for the benefit of their own company.