The customer is an American company that offers a smartphone-based solution for small and mid-size truck fleets. The solution was designed to help fleets meet federal electronic logging device requirements while minimizing the financial impact of traffic offenses. Commercial truck carriers and drivers are bound by the Hours of Service (HOS) rules of the Federal Motor Carrier Safety Administration (FMCSA). A driver who fails to follow HOS regulations can potentially cause a trucking accident. The major feature of the customer’s product is an automatic tool for tracking, logging and reporting the driver’s service hours.
The customer asked us to develop a fleet management portal with real-time equipment tracking, which required both front-end and back-end development, along with Android and iOS apps. Since the customer’s product was not the only solution available on the market, it needed to stand out among competitors. After one year of collaboration, our team developed a system that can predict the probability of different types of HOS violations for a driver within a 24-hour timeframe.
We needed to develop a prototype from scratch within an extremely tight deadline. In as little as two months, our designers had to prove that the task could be solved using a machine learning approach.
About 190 GB of drivers’ work history data had been collected while the product was in operation. The data included hours logged, location updates and more, which made it a crucial input for developing machine learning (ML) algorithms. ML is a branch of artificial intelligence based on the idea that systems can learn from data, identify patterns and make predictions. With this in mind, we provided a solution composed of three independent data processing layers. The first is the data access module, which processes raw data from MySQL and DynamoDB to produce consolidated data samples. The output of this component is the input for the feature engineering layer, which in turn reduces the data to no more than 5 MB of valuable features. Then, the core layer separates the extracted features into two data sets: one for training the ML model and one for validating it.
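The three layers described above can be illustrated with a minimal sketch. It is not the project's actual code: the in-memory records stand in for MySQL rows and DynamoDB items, and every field name, function name and the 25% validation share are assumptions made for illustration.

```python
import random
from collections import defaultdict

# Hypothetical records; in the real system these would be read from MySQL and DynamoDB.
mysql_rows = [
    {"driver_id": 1, "day": "d1", "hours_driven": 10.5},
    {"driver_id": 1, "day": "d2", "hours_driven": 11.5},
    {"driver_id": 2, "day": "d1", "hours_driven": 6.0},
    {"driver_id": 3, "day": "d1", "hours_driven": 8.0},
    {"driver_id": 4, "day": "d1", "hours_driven": 9.0},
]
dynamo_items = [
    {"driver_id": 1, "day": "d1", "location_updates": 40},
    {"driver_id": 1, "day": "d2", "location_updates": 55},
    {"driver_id": 2, "day": "d1", "location_updates": 20},
    {"driver_id": 3, "day": "d1", "location_updates": 31},
    {"driver_id": 4, "day": "d1", "location_updates": 28},
]


def consolidate(rows, items):
    """Data access layer: merge both sources on (driver_id, day)."""
    merged = defaultdict(dict)
    for record in list(rows) + list(items):
        merged[(record["driver_id"], record["day"])].update(record)
    return [merged[key] for key in sorted(merged)]


def engineer_features(samples):
    """Feature engineering layer: reduce daily samples to one compact row per driver."""
    per_driver = defaultdict(list)
    for sample in samples:
        per_driver[sample["driver_id"]].append(sample)
    features = []
    for driver_id, days in sorted(per_driver.items()):
        hours = [d.get("hours_driven", 0.0) for d in days]
        updates = [d.get("location_updates", 0) for d in days]
        features.append({
            "driver_id": driver_id,
            "mean_hours": sum(hours) / len(hours),
            "max_hours": max(hours),
            "total_updates": sum(updates),
        })
    return features


def split(features, validation_share=0.25, seed=0):
    """Core layer: shuffle, then split into training and validation sets."""
    shuffled = features[:]
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * validation_share))
    return shuffled[n_val:], shuffled[:n_val]


samples = consolidate(mysql_rows, dynamo_items)
features = engineer_features(samples)
train_set, validation_set = split(features)
```

Keeping the layers as separate functions mirrors the independence of the layers: each one consumes only the output of the previous one, so a layer can be rerun or replaced without touching the others.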
During implementation, the development team faced and overcame two main challenges. First, it was not possible to extract data for training the ML model on the fly, since the data was stored separately in several databases, MySQL and DynamoDB in particular. Second, there are four types of HOS violations, and there is no common solution for prediction problems of this kind.
The system predicts HOS violations with an accuracy of roughly 80%. The prototype makes it possible to complete one full data processing iteration within one day. Different ML models suit different HOS violation types, so a model is chosen per violation type to achieve the highest accuracy. The team showed that the result can be improved by analyzing more data points, tuning the chosen ML models and performing statistical analysis to identify more of the critical factors behind HOS violations.
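The idea of choosing a model per violation type can be sketched as follows. This is an illustration under stated assumptions, not the team's implementation: the simple rules stand in for real models such as SVM or Random Forest, the labels `type_a` and `type_b` are placeholders for violation types the text does not name, and the validation rows are invented.

```python
def accuracy(predict, samples, label):
    """Share of validation samples whose predicted label matches the true one."""
    hits = sum(1 for s in samples if predict(s) == s[label])
    return hits / len(samples)


# Hypothetical stand-ins for trained models; each maps a feature row to 0 or 1.
candidates = {
    "hours_rule": lambda s: int(s["mean_hours"] > 10),
    "rest_rule": lambda s: int(s["rest_hours"] < 8),
    "always_no": lambda s: 0,
}

# Invented validation rows with one 0/1 label per (placeholder) violation type.
validation = [
    {"mean_hours": 11.0, "rest_hours": 9.0, "type_a": 1, "type_b": 0},
    {"mean_hours": 9.0, "rest_hours": 7.0, "type_a": 0, "type_b": 1},
    {"mean_hours": 10.5, "rest_hours": 6.5, "type_a": 1, "type_b": 1},
    {"mean_hours": 8.0, "rest_hours": 10.0, "type_a": 0, "type_b": 0},
    {"mean_hours": 9.5, "rest_hours": 7.5, "type_a": 0, "type_b": 0},
]


def select_models(candidates, samples, labels):
    """For each violation type, keep the candidate with the best validation accuracy."""
    best = {}
    for label in labels:
        scores = {name: accuracy(model, samples, label)
                  for name, model in candidates.items()}
        winner = max(scores, key=scores.get)
        best[label] = (winner, scores[winner])
    return best


best = select_models(candidates, validation, ["type_a", "type_b"])
for label, (name, score) in best.items():
    print(f"{label}: {name} ({score:.0%})")
```

On this toy data the two violation types end up with different winners, which is the point of evaluating each type separately instead of forcing one model onto all of them.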
- Builds graphics for statistical analysis
- Compiles MySQL and DynamoDB (DDB) data into JSON
- Processes data to generate features
- Finds optimal parameters for ML models
- Reduces execution time
- Amazon DynamoDB
- SVM, Random Forest, Gradient Boosting, etc.