The customer is an American start-up that provides automated security solutions for businesses and government institutions. The company has actively adopted machine learning for real-time object detection and action recognition in video streams from its customers’ security cameras.
The customer was developing a brand-new security solution for outdoor surveillance. Orion was asked to create a proof of concept (PoC) for action recognition using machine learning techniques. Our experts had to deliver the solution at short notice and with limited resources: both the engineering resources available for implementation and the hardware resources of the selected platform were constrained.
Action recognition remains an unsolved problem, and even state-of-the-art solutions achieve limited accuracy. One reason is the lack of good datasets for such tasks. In this project, our specialists had less than 30 minutes of video available for labeling. A second problem was that the system had to recognize actions from different camera angles. The team had to cope with both issues.
Bearing in mind the limited hardware capabilities and the small amount of data provided by the customer, our engineers conducted a brief study to identify approaches that could deliver acceptable results with a limited dataset. Two families of neural network architectures were selected as candidates for the PoC: activity detection from a single frame using a CNN, and activity detection from multiple frames. The multi-frame approach showed more promising results in experiments, so the team went on to evaluate two multi-frame options: a 3D CNN and an LSTM.
In the end, the most encouraging results came from an approach that combined transfer learning (using a pre-trained Inception v3 model) with an LSTM. This model was selected as the final solution for the PoC.
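To illustrate how such a combination typically fits together, here is a minimal sketch rather than the project's actual code. It assumes TensorFlow/Keras as the framework (the case study does not name one), and every name and value in it is hypothetical: `sample_frame_indices`, `build_model`, the clip length of 16 frames, the 5 action classes, the 128 LSTM units, and the dropout rate. The idea is that a frozen, ImageNet-pre-trained Inception v3 turns each frame into a feature vector, and only a small LSTM head is trained on the sequence of those vectors, which is one common way to work with a small labeled dataset.

```python
from typing import List


def sample_frame_indices(num_frames: int, seq_len: int) -> List[int]:
    """Pick seq_len evenly spaced frame indices from a clip of num_frames frames.

    Hypothetical helper: the source does not describe how frames were sampled.
    Indices repeat when the clip has fewer frames than seq_len.
    """
    if num_frames <= 0 or seq_len <= 0:
        raise ValueError("num_frames and seq_len must be positive")
    if seq_len == 1:
        return [num_frames // 2]  # a single frame: take the middle one
    # Integer arithmetic keeps the first and last frame and spaces the rest evenly.
    return [i * (num_frames - 1) // (seq_len - 1) for i in range(seq_len)]


def build_model(seq_len: int = 16, num_classes: int = 5, lstm_units: int = 128):
    """Frozen Inception v3 per frame, followed by a trainable LSTM classifier.

    All hyperparameters are illustrative assumptions, not values from the project.
    """
    # Imported here so the frame-sampling helper works without TensorFlow installed.
    import tensorflow as tf
    from tensorflow.keras import layers

    # Pre-trained backbone used as a fixed feature extractor (transfer learning).
    # With pooling="avg" it outputs one 2048-dimensional vector per image.
    backbone = tf.keras.applications.InceptionV3(
        include_top=False, weights="imagenet", pooling="avg"
    )
    backbone.trainable = False

    # Input: a clip of seq_len RGB frames at Inception v3's default 299x299 size,
    # already scaled with tf.keras.applications.inception_v3.preprocess_input.
    frames = layers.Input(shape=(seq_len, 299, 299, 3))
    features = layers.TimeDistributed(backbone)(frames)  # (batch, seq_len, 2048)

    # Only this temporal head is trained on the labeled clips.
    x = layers.LSTM(lstm_units)(features)
    x = layers.Dropout(0.5)(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)

    model = tf.keras.Model(frames, outputs)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Because the backbone is frozen, the number of trainable weights stays small compared with a 3D CNN trained from scratch, which is one plausible reason this kind of design suits a dataset of under 30 minutes of video.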
We provided the customer with the reliable proof of concept they needed for further solution development. Our experts completed the task in just three months, despite the resource constraints and the limited dataset. The models were integrated into a web server, allowing the customer to give live demonstrations to their clients.