Monitoring machine learning models is critical

Because assumptions shift and data changes constantly, the work does not end once a machine learning model is deployed to production. These best practices help keep complex models dependable.

Agile development teams must ensure that microservices, applications, and databases can be observed; that there is monitoring in place to detect operational issues; and that AIops is used to correlate alerts into manageable incidents. When users and business stakeholders ask for changes, many devops teams use agile methods to process feedback and roll out new versions.

Even if there are few requests, devops teams understand the importance of upgrading apps and patching underlying components; otherwise, today's software will become tomorrow's technical debt.

Machine learning model life-cycle management is more complex than software life-cycle management. "The model development life cycle resembles the software development life cycle from a high level, but with much more complexity," explains Andy Dang, cofounder and head of engineering at WhyLabs. We think of software as code, but data, the foundation of an ML model, is complex, multidimensional, and unpredictable.

Models are built with algorithms, configuration, and training data sets in addition to code, components, and infrastructure. During the design process, these are chosen and optimized, but they must be changed as assumptions and data change over time.

Why should machine learning models be monitored?

Machine learning model monitoring, like monitoring applications for performance, reliability, and error conditions, gives data scientists visibility into model performance. ML monitoring is especially important when models are used to make predictions or when they run on volatile data sets.

"The main goals around model monitoring focus on performance and troubleshooting, as ML teams want to be able to improve on their models and ensure everything is running as intended," says Dmitry Petrov, cofounder and CEO of Iterative.

Rahul Kayala, Moveworks' principal product manager, provides this explanation of ML model monitoring. "Monitoring can assist businesses in balancing the benefits of AI predictions with their requirement for predictable outcomes," he says. "Automated alerts can help machine learning operations teams find outliers in real time, giving them time to act before any harm is done."

"Combining robust monitoring with automated remediation speeds up time to resolution," says Stu Bailey, cofounder of ModelOp. This is important for maximizing business value and reducing risk.

Data scientists, in particular, must be notified of unexpected outliers. "AI models are frequently probabilistic, which means they can produce a wide range of results," says Kayala. "Models can occasionally produce an outlier, a result that is significantly outside the normal range." Outliers can have a significant negative impact on business outcomes if they go unnoticed. To ensure that AI models have a real-world impact, ML teams should also monitor trends and fluctuations in product and business metrics that AI directly affects.

Consider the prediction of a stock's daily price. When market volatility is low, algorithms like long short-term memory (LSTM) can make rudimentary predictions, while more comprehensive deep learning algorithms can improve accuracy. But when markets are very volatile, most models have trouble making accurate predictions, and monitoring models can let you know when this is happening.
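
As a minimal illustration of how that monitoring might look, the sketch below tracks a model's rolling prediction error against its long-run baseline and flags windows where the error spikes, which often coincides with volatile regimes. The function name, window size, and error multiplier are illustrative assumptions, not part of any particular tool.

```python
import numpy as np
import pandas as pd

def flag_unstable_windows(actual: pd.Series, predicted: pd.Series,
                          window: int = 20, error_multiplier: float = 2.0) -> pd.Series:
    """Flag days where the rolling prediction error spikes well above its baseline."""
    abs_error = (actual - predicted).abs()
    rolling_error = abs_error.rolling(window).mean()   # recent error level
    baseline = abs_error.expanding().mean()            # long-run average error
    # A day is "unstable" when recent error is far above the long-run average,
    # which often coincides with volatile market regimes.
    return rolling_error > error_multiplier * baseline

# Example usage with synthetic prices and a naive "yesterday's close" model
dates = pd.date_range("2023-01-02", periods=250, freq="B")
actual = pd.Series(100 + np.random.randn(250).cumsum(), index=dates)
predicted = actual.shift(1).fillna(100)
alerts = flag_unstable_windows(actual, predicted)
print(f"{int(alerts.sum())} trading days flagged for review")
```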

Another type of ML model performs classification, and precision and recall metrics help track its accuracy. Precision measures how many of the model's positive predictions are actually positive, whereas recall measures a model's sensitivity, that is, how many of the actual positives it catches. ML monitoring can also find ML model drift, such as concept drift, which happens when the statistical properties of what is being predicted change, or data drift, which happens when the data used to build the model changes.
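
For a concrete sense of those two metrics, here is a minimal sketch using scikit-learn; the label arrays are toy examples rather than real monitoring data.

```python
from sklearn.metrics import precision_score, recall_score

# Hypothetical ground-truth labels and model predictions for a binary classifier
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]

# Precision: of everything the model labeled positive, how much really was positive?
precision = precision_score(y_true, y_pred)
# Recall: of everything that really was positive, how much did the model catch?
recall = recall_score(y_true, y_pred)

print(f"precision={precision:.2f}, recall={recall:.2f}")
```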

A third issue is explainable ML, in which models are pushed to determine which input features contribute the most to the results. This problem is related to model bias, which happens when the data used to train the model has statistical flaws that cause it to make wrong predictions.
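
One common, tool-agnostic way to estimate which features contribute most is permutation importance; the sketch below assumes a scikit-learn workflow on synthetic data and is not tied to any specific explainability product.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Toy data standing in for a production training set
X, y = make_classification(n_samples=1000, n_features=8, n_informative=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Permutation importance: shuffle one feature at a time and measure how much the
# score drops; the features whose shuffling hurts the most matter the most.
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
for i in result.importances_mean.argsort()[::-1]:
    print(f"feature_{i}: {result.importances_mean[i]:.3f}")
```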

These issues can erode trust and cause major business problems. Model performance management tries to deal with these problems during all of the phases: development, training, deployment, and monitoring.

Fiddler's chief scientist, Krishnaram Kenthapadi, believes that explainable ML with low bias risk necessitates model performance management. "To ensure that ML models are not overly discriminatory, enterprises require solutions that provide context and visibility into model behaviors across the entire life cycle—from model training and validation to analysis and improvement," Kenthapadi says. "Model performance management makes sure that models can be trusted and helps engineers and data scientists find bias, track the root cause, and explain why things happened when they did in a timely manner."

Machine learning monitoring best practices

Modelops, ML monitoring, and model performance management are all terms for practices and tools used to ensure that machine learning models perform as expected and provide reliable predictions. What underlying practices should data science and development teams take into account when implementing them?

"Model monitoring is a critical, ongoing process," says Josh Poduska, chief field data scientist at Domino Data Lab. To improve the future accuracy of a drifting model, retrain it with newer data and associated ground truth labels that are more representative of current reality. "

Ira Cohen, chief data scientist and cofounder of Anodot, discusses key aspects of ML model monitoring. "First, models should monitor the behavior of output and input features, as changes in input features can cause issues," he says. He says to use proxy measures when the performance of a model can't be measured directly or quickly enough.
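
As an example of a proxy measure, one option when ground truth labels arrive late is to compare the distribution of the model's prediction scores in production against a training-time baseline, for instance with a population stability index. The sketch below is a generic illustration of that idea; the thresholds in the comment are a common rule of thumb rather than a standard.

```python
import numpy as np

def population_stability_index(baseline: np.ndarray, current: np.ndarray,
                               bins: int = 10, eps: float = 1e-6) -> float:
    """Compare two score distributions; larger values mean a bigger shift."""
    edges = np.histogram_bin_edges(baseline, bins=bins)
    base_pct = np.histogram(baseline, bins=edges)[0] / len(baseline) + eps
    curr_pct = np.histogram(current, bins=edges)[0] / len(current) + eps
    return float(np.sum((curr_pct - base_pct) * np.log(curr_pct / base_pct)))

# Hypothetical model scores captured at training time vs. scores seen in production
training_scores = np.random.beta(2, 5, size=10_000)
production_scores = np.random.beta(3, 4, size=5_000)

psi = population_stability_index(training_scores, production_scores)
# Common rule of thumb: < 0.1 stable, 0.1-0.25 moderate shift, > 0.25 investigate
print(f"PSI of prediction scores: {psi:.3f}")
```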

According to Cohen, data scientists require tools for model monitoring. "Monitoring models by hand is not scalable," he says. "Dashboards and reports are not set up to handle the complexity and volume of monitoring data that comes from deploying a lot of AI models."

The following are some best practices for ML model monitoring and performance management:

"Make sure you have the tools and automation in place upstream at the start of the model development life cycle to support your monitoring needs," Petrov says.

"Data engineers and scientists should perform preliminary validations to ensure their data is in the expected format," Dang says. As data and code pass through a CI/CD pipeline, they should allow for data unit testing via validations and constraint checks. "

"Use scalable anomaly detection algorithms that learn the behavior of each model's inputs and outputs to alert you when they deviate from the norm," Cohen suggests.

"Track the drift in feature distribution," Kayala says. A significant shift in distribution indicates that we need to retrain our models to achieve peak performance.

"Organizations are looking more and more to monitor model risk and return on investment as part of more comprehensive model governance programs," says Bailey. "This is to make sure that models meet business and technical KPIs."

Software development is primarily concerned with code maintenance, application performance monitoring, improving reliability, and responding to operational and security incidents. In machine learning, constantly changing data, volatility, bias, and other factors mean that data science teams must also manage models and keep an eye on them while they are in production.
