Baseline modeling and its critical role in AI and business performance

Baseline modeling is a necessary part of model validation. In our expert opinion, it should be required before model deployment. There are many types of baseline models, and in this episode we discuss their use cases, strengths, and weaknesses. We're sure you'll appreciate a fresh take on how to improve your modeling practices.

Show notes


Introductions and news: why reporting and visibility are good things for AI 0:03

  • Spoiler alert: Providing visibility to AI bias audits does NOT mean exposing trade secrets. Some reports claim otherwise.
  • Discussion about AI regulation in the context of current events and how regulation is playing out between Boeing and the FAA (tbc)

Understanding baseline modeling for machine learning 7:41

  • Establishing baselines allows us to understand how models perform relative to simple rules-based models, aka heuristics.
  • Reporting results without baselines to compare against is like giving a movie a rating of 5 without telling the listener that you were using a 10-point scale.
  • Baseline modeling comparisons are part of rigorous model validations and should always be conducted during early model development and final production deployment.
  • Baseline comparisons pair with analyses of theoretical upper bounds on modeling performance to show where your technique scores between the acceptable worst-case and best-case performance.
  • We often find complex models deployed in the real world that haven’t proven their value over simpler, explainable baseline models.

Classification baselines and model performance comparison 19:40

  • Uniform Random Selection - simulate how your model does against a baseline model that guesses classes at random, like rolling a die.
  • Most Frequent Class (MFC) - always predicts the majority class; often the most telling test in the case of highly skewed data with inappropriate metrics.
  • Single-feature modeling - Validates how much the complex signal from your data and model improves over a bare minimum explainable model.
  • And more…
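
To make the first two classification baselines concrete, here is a minimal sketch in plain Python. It is an illustration rather than code from the episode: the function names and the 95/5 class split are assumptions chosen to show why a skewed dataset paired with accuracy can make a useless model look good.

```python
import random
from collections import Counter


def uniform_random_baseline(train_labels, n, seed=0):
    """Guess each known class with equal probability, ignoring all features."""
    rng = random.Random(seed)
    classes = sorted(set(train_labels))
    return [rng.choice(classes) for _ in range(n)]


def most_frequent_class_baseline(train_labels, n):
    """Always predict the majority class seen in the training labels."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return [majority] * n


def accuracy(y_true, y_pred):
    """Share of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Share of truly positive examples that were predicted positive."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    return sum(p == positive for p in preds_for_positives) / len(preds_for_positives)


# Hypothetical, highly skewed labels: 95 negatives and 5 positives.
y_train = [0] * 95 + [1] * 5
y_test = [0] * 95 + [1] * 5

mfc_pred = most_frequent_class_baseline(y_train, len(y_test))
rand_pred = uniform_random_baseline(y_train, len(y_test))

mfc_accuracy = accuracy(y_test, mfc_pred)    # 0.95, with no learning at all
mfc_recall = recall(y_test, mfc_pred)        # 0.0, it never finds a positive
rand_accuracy = accuracy(y_test, rand_pred)  # depends on the seed
```

In this toy setup, a model reporting 95% accuracy has matched, not beaten, the Most Frequent Class baseline, which is why the choice of metric matters as much as the baseline.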

Exploring regression and more advanced baselines for modeling 24:11

  • Regression baselines: mean, median, mode, single-variable linear regression, Lag 1, and Least 5% re-interpretation
  • Advanced baselines in language and vision
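
A few of the regression baselines listed above can be sketched in plain Python as follows. This is a hedged illustration, not the hosts' implementation: the helper names, the toy series, and the use of mean absolute error as the metric are all assumptions made for the example.

```python
from statistics import mean, median


def mae(y_true, y_pred):
    """Mean absolute error between actual and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def constant_baseline(train, n, stat=mean):
    """Predict one summary statistic of the training targets (mean or median)."""
    return [stat(train)] * n


def lag1_baseline(history, actuals):
    """Predict each value as the most recently observed value."""
    return [history[-1]] + list(actuals[:-1])


def single_variable_fit(x, y):
    """Ordinary least squares with one feature; returns (slope, intercept)."""
    x_bar, y_bar = mean(x), mean(y)
    slope = (sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
             / sum((a - x_bar) ** 2 for a in x))
    return slope, y_bar - slope * x_bar


# Hypothetical series with a steady upward trend.
train = [10, 12, 14, 16, 18]
test = [20, 22, 24]

mean_pred = constant_baseline(train, len(test), mean)
median_pred = constant_baseline(train, len(test), median)
lag_pred = lag1_baseline(train, test)

# Single-variable regression using the time index as the only feature.
slope, intercept = single_variable_fit(list(range(len(train))), train)
trend_pred = [slope * t + intercept
              for t in range(len(train), len(train) + len(test))]

mean_mae = mae(test, mean_pred)    # 8.0
lag_mae = mae(test, lag_pred)      # 2.0
trend_mae = mae(test, trend_pred)  # 0.0 on this perfectly linear toy series
```

A more complex model on the same data would need to beat the strongest of these errors, not just the weakest, before its extra complexity is justified.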

Conclusions 35:39

  • Baseline modeling is a necessary part of model validation
  • There are differing flavors of baselines that are appropriate for all types of modeling
  • Baselines are needed to establish fair and realistic lower bounds for performance
  • If your model can’t perform significantly better than a baseline, consider scrapping the model and trying a new approach

Talk to us

Do you have a question or a discussion topic for the AI Fundamentalists? Connect with them to comment on your favorite topics:

  • LinkedIn - Episode summaries, shares of cited articles, and more.
  • YouTube - Was it something that we said? Good. Share your favorite quotes.
  • Visit our page - see past episodes and submit your feedback! It continues to inspire future episodes.