Essentials in ML/AI product management

Managing ML/AI projects for Product Leaders and Managers — Part 2: Managing ML projects

The key factors for successfully managing ML projects include:
- Having the right mindset
- Agile and iterative project management
- Collaboration and tools
- Roles and responsibilities
- Continuously measure performance
- Adaptability and feedback loops

Nima Torabi
21 min read · May 20, 2023

Comprehending the role of Machine Learning (ML) and Artificial Intelligence (AI) and their impact on products is increasingly crucial for competent product managers. The high failure rate of ML projects, as widely reported, is attributed mainly to factors unrelated to the models themselves.

Machine Learning (ML) is a set of methods and tools which help realize the goal of the field of Artificial Intelligence (AI). Deep Learning (DL), or the use of Neural Networks (NN) containing many layers, is a subfield of ML. Furthermore, areas such as Computer Vision, Natural Language Processing (NLP), and Recommendation Systems are sub-fields of AI that rely on ML methods and Deep Learning to deliver value

By adopting best practices in identifying ML opportunities, carefully considering key design decisions for ML systems, and implementing a disciplined approach to ML project management, product leaders and managers can greatly enhance the chances of success and significantly reduce the prevailing high failure rates.

To deliver value and achieve success in ML/AI projects, competent product managers should focus on performing five critical tasks:

  1. Identify and frame problems: it is crucial to identify suitable opportunities where ML can address user/customer problems effectively. Product managers should frame these problems in a way that allows for the design of ML-based solutions
  2. Organize ML projects using CRISP-DM: understanding and implementing the CRISP-DM (Cross-Industry Standard Process for Data Mining) data science process helps in organizing ML projects and coordinating team efforts efficiently
  3. Grasp the data-related aspects of ML projects: product managers must clearly understand data-related considerations when building ML systems. This includes identifying data requirements, exploring potential data sources, establishing data governance and access protocols, and recognizing the importance of data cleaning and preparation before modeling.
  4. Design ML systems and select technologies and tools: familiarity with the critical elements of designing ML systems is essential. Product managers should consider various factors when choosing technologies and tools for ML projects, ensuring optimal choices are made
  5. Manage the model lifecycle: even after a model is released, its performance needs to be actively managed. Product managers should monitor and maintain models over time, ensuring that they continue to perform effectively in an evolving environment.

By fulfilling these tasks, competent product managers can maximize the value and success of ML/AI projects.

Photo by Andrea De Santis on Unsplash

Part 2: Organizing ML projects

Projects in Data Science related work often involve exploration and start with numerous unknowns and uncertainties. Teams may not know which data or features are crucial for the model, which algorithms will be most effective, or what level of performance can be realistically achieved. Therefore, extensive exploration is necessary.

During the exploration phase, ML product teams should follow a disciplined process to maximize their chances of success. This process has three parts:

  • Organizing ML projects using the CRISP-DM framework (Cross-Industry Standard Process for Data Mining)
  • Structuring the ML team and defining roles
  • Organizing project teamwork using best practices and tracking progress

Challenges of ML vs. normal software projects

ML projects pose unique challenges compared to normal software projects, making them more difficult to manage. Some of the significant differences include:

  • Broader skill set requirements: ML projects typically require diverse skills across their data scientists and ML engineering talent. Managing the collaboration and coordination between these different roles can be challenging due to the specialized expertise involved
  • A higher degree of technical risk: ML projects involve inherent technical risks due to the uncertainties associated with ML models. The need for extensive experimentation and rapid iteration to develop effective models introduces complexity and increases the potential for technical challenges
  • Data needs may require significant upfront work. ML projects often require extensive upfront work to gather and prepare the necessary data. This includes identifying data sources, accessing and cleaning data, addressing missing values, and determining the most critical features for the model. Up to 80% of the project’s time can be dedicated to data-related tasks
  • ML models are probabilistic rather than deterministic. Because their outputs are probabilistic, perfect (100%) performance is not achievable. Defining what constitutes “good enough” performance therefore becomes crucial, requiring clear performance objectives and minimum threshold metrics that deliver sufficient business value to users/customers
  • Model building is somewhat of an art. Building ML models involves exploration, trial and error, and a non-linear approach. There are no defined steps to follow, and it requires patience and a creative mindset similar to artists approaching their work
  • Model outputs can be highly variable. Once a model is deployed, the output can vary significantly as it encounters new data and generates new predictions. This variability needs to be taken into account when using the model’s outputs for decision-making
  • The ML model will have inherent limitations. ML models have inherent limitations based on the phenomenon they are trying to model. There will always be a certain level of performance that cannot be exceeded due to the intrinsic variability or randomness of the targeted problem
  • Challenging to plan and estimate: Unlike traditional software projects that often follow a linear development path, ML projects involve continuous iteration and experimentation. This makes it challenging to develop accurate timelines and estimate budgets, as development can move forward and backward based on the results of ongoing experimentation
  • Difficulty in showing progress: The iterative nature of ML projects can make it challenging to demonstrate meaningful progress to managers and executives. Traditional milestones may not align well with the ongoing experimentation and refinement process, so it usually takes an organization with a solid understanding of the ML/AI domain to appreciate the progress being made
  • ML projects will require managing the change they bring to the users. ML products can significantly change users’ workflows, requiring proper onboarding and education to ensure smooth adoption. Users may need to adapt their existing toolsets and processes to integrate the ML product effectively. Therefore, building trust will be critical for the successful adoption of ML products. Users need to trust the model’s outputs and understand its limitations. Clear and transparent communication is essential to establish trust, as ML models can often be perceived as black boxes

The CRISP-DM data science process

Having a well-defined process for solving data science and ML problems is crucial for several reasons:

  • Clear problem definition: having a process ensures that we invest time upfront to properly define and understand the problems we’re trying to solve. Without a clear problem definition, teams can end up wasting a significant amount of resources
  • Problem-solving efficiencies: following a process ensures that ML teams do things in a logical order and perform the right tasks at each step, which makes for efficient use of resources and comprehensive problem-solving outcomes. For example, if ML teams start modeling without cleaning and processing their data, the quality of the model will be compromised
  • Quality control: an effective process helps ensure the quality of the work by setting standards and guidelines for each stage of the project. Tasks are then performed correctly and consistently, which helps identify and address issues early, minimizing errors and reducing the risk of delivering subpar results
  • Collaboration and organization: having processes will provide a framework for organizing work and allocating responsibilities among team members by defining roles and responsibilities, clarifying expectations, and facilitating collaboration with each team member knowing their specific tasks and how they contribute to the overall project goals
  • Continuous improvement: having a process will allow continuous improvement and learning from past experiences through which ML teams can identify areas for improvement, refine methods, and incorporate lessons learned into future projects. This iterative approach helps teams evolve and achieve better results over time

CRISP-DM

While multiple processes have been published for data science-related work, CRISP-DM is the most common one. Its advantage is that it was developed as a highly flexible and industry-agnostic approach to data mining and ML projects, meaning that it can be applied to a project regardless of its industry. Today, roughly 25 years after it was introduced, CRISP-DM is still the most widely used data science project methodology, and many major corporations use a process that is a slight variant or derivative of it.

The CRISP-DM process framework

The CRISP-DM process consists of six main steps:

  • The business understanding phase is where the focus is on accurately defining the problem that needs to be addressed
  • The data understanding phase is where the available data is collected and comprehended
  • The data preparation phase is when data is processed, cleaned, and prepared to be suitable for modeling
  • The modeling phase is where models are built to address the defined problem
  • The evaluation phase is where the quality of the models is assessed, and one with the preferred output is selected.
  • The deployment phase is when the chosen model is deployed and the final product is introduced to users

The CRISP-DM process is designed to be iterative, allowing for flexibility and continuous improvement. Each step within the process is inherently iterative and can be revisited and refined based on new insights and learnings. The entire process itself is also iterative, as we may deploy an ML product and gather feedback from users/customers, leading us to reevaluate and adapt our understanding of the problem to be solved. This feedback-driven iteration enables us to refine and enhance our solutions, ensuring ongoing improvement and adaptation throughout the entire process.

Below is the breakdown of each phase and what needs to be done for each to be accomplished.

Phase 1: understanding the business

There are three steps to the business understanding phase: 1) defining the problem, 2) defining success, and 3) identifying relevant factors to the problem

Phase 2: understanding the data

There are three steps to the data understanding phase: 1) gathering the data, 2) validating it, and 3) exploring it

Phase 3: data preparation

There are three steps to the data preparation phase: 1) splitting the data, 2) determining the feature set, and 3) preparing for modeling
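
As a minimal sketch of the first step, splitting the data, the Python snippet below shuffles a dataset and divides it into training, validation, and test subsets. The function name split_dataset, the 70/15/15 proportions, and the stand-in records are illustrative assumptions, not something CRISP-DM prescribes.

```python
import random


def split_dataset(rows, train_frac=0.7, val_frac=0.15, seed=42):
    """Shuffle rows and split them into train / validation / test subsets.

    The fractions are assumptions for illustration; whatever is left after
    the train and validation shares becomes the test set.
    """
    if not 0 < train_frac + val_frac < 1:
        raise ValueError("train_frac + val_frac must be between 0 and 1")

    shuffled = list(rows)
    # A fixed seed keeps the split reproducible between experiments.
    random.Random(seed).shuffle(shuffled)

    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)

    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


# Hypothetical stand-in for 100 labeled records.
records = list(range(100))
train, val, test = split_dataset(records)
print(len(train), len(val), len(test))
```

The test subset is set aside at this point so that it stays untouched until the evaluation phase; only the training and validation subsets are used while models are being built and tuned.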

Phase 4: Modeling

There are two steps to the modeling phase: 1) model selection, and 2) model tuning
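
To make model tuning more concrete, here is a small sketch that tries several values of one hyperparameter and keeps the value with the lowest error on a validation set. The k-nearest-neighbors regressor, the toy data, and the names knn_predict, validation_mse, and tune_k are all assumptions chosen for brevity; a real project would normally rely on an ML library and a richer search.

```python
def knn_predict(train, x, k):
    """Predict y for x as the mean y of the k nearest training points."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return sum(y for _, y in nearest) / k


def validation_mse(train, val, k):
    """Mean squared error on the validation set for a given k."""
    errors = [(knn_predict(train, x, k) - y) ** 2 for x, y in val]
    return sum(errors) / len(errors)


def tune_k(train, val, candidates):
    """Score every candidate k and return the best one with all scores."""
    scores = {k: validation_mse(train, val, k) for k in candidates}
    best_k = min(scores, key=scores.get)
    return best_k, scores


# Toy data following y = 2x (hypothetical, for illustration only).
train = [(x, 2.0 * x) for x in range(0, 20, 2)]  # x = 0, 2, ..., 18
val = [(x, 2.0 * x) for x in range(1, 20, 4)]    # x = 1, 5, 9, 13, 17

best_k, scores = tune_k(train, val, candidates=[1, 2, 5])
print(best_k, scores)
```

Model selection follows the same pattern: each candidate algorithm is scored on the same validation set, and the comparison, rather than intuition, decides which one moves forward.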

Phase 5: Evaluation

There are two steps to the evaluation phase: 1) evaluate the results, and 2) test the solution

Phase 6: Deployment

There are two steps to the deployment phase: 1) deploy, and 2) monitor

The various roles in ML project teams

When assembling ML project teams, three critical factors need to be considered:

  • The size of the team depends on the project scope. The structure of ML teams can vary greatly depending on the organization and the specific project. There is no one-size-fits-all approach, and teams can be structured differently based on the size and type of the organization. Some teams may be larger with multiple members, while others may consist of a single person or a small group. The team should be assembled to effectively utilize the available skills and promote collaboration and synergy among team members
  • Team structures can vary. The choice of team structure depends on various factors such as the organization’s size, culture, project complexity, and resource availability. There are usually two types of ML project team structures: 1) Project-Based — where the entire team is dedicated to a specific project and reports to a single project manager or team lead. This type of structure allows for a high level of focus and coordination on the project’s objectives. 2) Matrixed teams — where individuals may have multiple roles or responsibilities across different projects or departments within the organization. This structure can be beneficial when the organization wants to leverage resources efficiently across multiple projects or initiatives. Matrixed teams often require strong communication and collaboration skills, as team members need to manage their time and priorities effectively across different projects. It also allows for sharing of expertise and knowledge across different areas within the organization. Both project-based and matrixed structures have their advantages and trade-offs, and organizations should select the structure that best aligns with their goals and capabilities
  • Titles don’t matter too much. Understanding the role and responsibilities (R&Rs) of team members is more important than the specific title they hold. The titles given to individuals can vary across organizations and even within the same industry. What matters most is having clarity on the functions and tasks associated with each role. By focusing on the R&Rs, you can ensure that the right expertise and skills are present within the team to effectively execute the ML project. This allows for clear communication, delegation of tasks, and collaboration among team members.

What matters most in ML teams is understanding the various roles and skill sets required for the project. Regardless of the team size, it is crucial to have individuals with the necessary expertise and competencies in areas such as product management, data science, machine learning, software engineering, data engineering, domain knowledge, project management, and communication.

Team organization in ML teams

Typical ML team structure across product management, data science, machine learning, software engineering, data engineering, domain knowledge, project management, and communication — some roles may have more than one person, or some people may have more than one role

On a typical machine learning project team, you will find members from the product, data science, and engineering teams with specialized roles and responsibilities assigned to each team member:

  • Product team: A) Product Owners develop technical requirements and guide product development. B) Product Managers research and translate market needs into technical requirements, interface with other departments, and prepare marketing and sales teams for product promotion. C) Product Designers focus on creating user-friendly UI/UX for the ML product. They collaborate with the product team to understand user needs, conduct user research, and design intuitive interfaces that enhance user engagement and satisfaction.
  • Data Science team: Data Scientists analyze data and build the models that power the product.
  • Engineering team: A) Data Engineers collect, clean, and manage data for the project working closely with data scientists to create data pipelines and prepare data for modeling. B) Software Engineers integrate the developed model into the larger product, creating the interface between the model and the product. C) Machine Learning Engineers develop production-grade models, implement data pipelines, and collaborate with software engineers to integrate the models into the product. D) QA members conduct testing of the model and product to ensure quality. E) DevOps teams handle the infrastructure and deployment of the model and product.
  • Other teams: additionally, the project team interfaces with members from other functions, consultants, and domain experts within and external to the organization including Sales, Marketing, and Customer Support who provide input on the project direction and are involved in commercializing the product as it nears maturity. Domain experts provide deep knowledge and insights about the specific industry or field the ML project is targeting. Their expertise helps align the project with domain-specific requirements and ensures the accuracy and relevance of the models.

It’s worth noting that in smaller teams, individuals may fulfill multiple roles, and in startup or entrepreneurial settings, a single person may take on many or all of these roles.

Data Scientists vs. ML Engineers

Compared to a normal software team, an ML project team adds two distinct roles, the data scientist and the ML engineer:

  • The Data Scientist: typically comes from a statistical or data science background with programming skills. Ideally, they possess domain expertise relevant to the project’s field or industry. Their responsibilities include: 1) gathering and processing data and extracting insights, 2) implementing the ML approach by determining the appropriate ML techniques, evaluating algorithms, and experimenting with different models, and 3) conducting early-stage exploration and prototyping to shape the modeling effort
  • The ML Engineer: generally has a computer science or engineering background with experience in software engineering and training in ML. Their responsibilities include: 1) managing the production data pipeline, which means developing the infrastructure and data pipeline for deploying the ML system in a production environment, 2) working closely with the data scientist to implement and integrate the model into the broader product, and 3) collaborating with the software engineering and DevOps teams to launch the product into the market

While the data scientist focuses on data analysis, modeling, and early-stage work, the ML engineer takes charge of developing the production-grade system, integrating the model, and deploying the product. Their collaboration is crucial to successfully transitioning from data exploration and prototyping to a fully functioning and deployable machine learning solution.

Involvement of team members in ML project cycles

Throughout the project life cycle, different roles have varying levels of involvement and engagement with the project:

  • Product Manager(s) and Owner(s): are involved from project initiation to deployment and beyond. They provide continuous guidance, prioritize requirements, and manage the product throughout its life cycle.
  • Data Engineer(s): play a significant role in the early stages of the project as they identify data sources, and collect and set up the data pipeline. Their involvement may decrease slightly as the project progresses toward deployment.
  • Data Scientist(s): have heavy involvement early on in the project as they are responsible for guiding data collection, analysis, and prototyping of models. Their collaboration with the ML engineer(s) becomes crucial later in the project for developing production-grade models.
  • ML Engineer(s): along with the Software Engineering team, become more involved as the project progresses working closely with Data Scientist(s) to transition from prototyping to developing production-grade models. They collaborate on integrating the model into products and launching them to the market.

It’s important to note that these roles may have overlapping responsibilities and collaboration throughout the project life cycle, and their involvement can vary based on the specific project requirements and team dynamics.

The role of the project champion

Another critical role on ML project teams is that of the project or business sponsor or champion. The project champion, typically a manager or executive within the organization, plays a vital role in ensuring the success of the project. Some key aspects of the project champion’s role include:

  • Resource allocation: the project champion secures the necessary resources for the project, including budget, personnel, and infrastructure, ensuring that the project team has the support needed to carry out their work effectively.
  • Ensuring alignment with the corporate strategy: the project champion ensures that the project’s goals and outcomes align with the broader corporate strategy and objectives by providing strategic guidance and ensuring that the project contributes to the organization’s long-term vision.
  • Risk Management: given the inherent uncertainty and technical risks associated with ML projects, the project champion acts as a protector and advocate for the team. They shield the team from unnecessary business pressures and distractions, allowing them to focus on project delivery.
  • Ongoing support: the project champion remains involved throughout the project, offering ongoing support and guidance to the team by helping them navigate challenges, resolve conflicts, and provide necessary approvals and endorsements.

Having a project champion is crucial for ML projects as they provide the necessary leadership and support to overcome obstacles, secure resources, and keep the project aligned with the organization’s strategic goals.

Running the project: being agile and collaborative

ML projects do not follow a linear path from start to finish. Instead, they are iterative processes in which teams continuously learn and refine their understanding as the project progresses through the CRISP-DM steps. Additionally, it is crucial to validate project progress with customers to ensure it is on the right track. Here’s a breakdown of what this agile, iterative experimentation could look like:

  • Exploring hypotheses: start by forming a concept or hypothesis for a solution, which could be a partial solution or an initial idea. This hypothesis is then presented to the customer to gather their input.
  • Incremental build-up: using the CRISP-DM process, ML product teams gradually build upon the solution, adding more steps and refining their approach which may involve data collection, data preparation, model development, and evaluation.
  • Customer feedback: the solution is presented to the customer and observed in action with the intent to gather feedback which could include their observations, suggestions, or concerns.
  • Analysis and learning: customer feedback is assessed and valuable insights are extracted. The team identifies strengths and weaknesses and determines areas for improvement.
  • Adjusting and repeating: based on the analysis, hypotheses are adjusted and solutions refined and the iterative experimentation process is repeated.

By continuously iterating and adjusting ML development based on customer feedback, the product team enhances the accuracy and effectiveness of the solution ensuring that the solution is aligned and evolves with the needs and expectations of the customers.

How the iterative steps and agile processes for an ML project could look

Collaboration cadence

As in any other agile software project, collaboration plays a crucial role in ML projects, and establishing an effective cadence of engagement is essential. Key opportunities for engagement and collaboration within ML teams include:

  • Roadmapping sessions: these sessions occur monthly or quarterly, bringing the team together to discuss customer input, align priorities, and set the roadmap for the upcoming period. It helps ensure that everyone is on the same page regarding the project’s direction.
  • Sprint planning and reviews: the team engages in these sessions on a biweekly basis (or potentially weekly) breaking down the roadmap into smaller sprints, typically lasting two weeks, and planning the tasks and deliverables for each sprint. Sprint reviews allow the team to evaluate the progress made during the sprint and gather feedback.
  • Daily stand-ups: daily stand-up meetings are short, focused meetings held within the team. They provide an opportunity for team members to share updates on their tasks, discuss any challenges or roadblocks, and ensure everyone is aware of the current status and progress. Daily stand-ups promote communication and coordination among team members.
  • Demo sessions: Regular demo sessions, held weekly or biweekly, allow the team to showcase their work visually. It provides an opportunity to present progress, demonstrate functionality, and gather feedback from stakeholders, including potential customers. These sessions help guide the project’s direction and ensure alignment with stakeholders’ expectations.

By incorporating these collaboration opportunities into the project’s cadence, the team can foster effective communication, synchronization, and feedback exchange. It promotes a collaborative environment where team members can work together efficiently and align their efforts toward achieving project goals.

Collaboration tools

With a collaboration cadence in place, selecting the right collaboration tools is crucial for effective teamwork and project management. Some commonly used tools in agile ML projects include:

  • Roadmap and requirements management: Confluence, Google Docs, or dedicated roadmap tools that allow teams to document and manage project requirements, track progress, and communicate updates. These tools provide a centralized platform for capturing and organizing information, ensuring everyone is aligned with project goals.
  • Project tracking: Jira, Trello, or other similar project management tools help track user stories, plan sprints, and monitor progress. They provide features such as task assignment, progress visualization, and backlog management that enable teams to stay organized and prioritize work effectively.
  • Collaboration and version control: Git, together with hosting platforms such as GitHub, GitLab, and Bitbucket, is widely used for collaboration and version control in software development, including ML projects. These tools allow team members to collaborate on code, manage branches, and track changes.
  • Communication tools: Slack, Microsoft Teams, or Discord foster real-time collaboration and facilitate discussions, file sharing, and notifications. These tools enhance team communication, especially in distributed or remote teams.

When selecting collaboration tools, factors such as team size, project complexity, and project-specific requirements play an important role. It’s crucial to choose tools that align with the team’s workflow, promote seamless collaboration, and improve productivity throughout the project lifecycle.

Measuring performance

Measuring performance in an ML project involves assessing both outcome metrics and output metrics. It’s important to note that outcome metrics reflect the business value generated by the model, while output metrics assess the technical performance of the model itself. Both types of metrics are important for evaluating the success of an ML project.

During the project, the team continuously evaluates both outcome and output metrics to track progress, make adjustments, and validate that the developed solution is effectively delivering the desired business impact. By monitoring these metrics, the team can identify areas for improvement, iterate on the model, and refine the solution to better align with the intended outcomes.

Outcome metrics

These metrics capture the desired business impact and are focused on the overall goals of the project. They are typically expressed in terms of the business outcomes or benefits that the organization or customer expects to achieve. Outcome metrics are often related to financial gains, cost savings, time efficiencies, or customer satisfaction. The measurement of outcome metrics is aligned with the desired business impact rather than the technical performance of the model.

Examples of outcome metrics:

  • Increase in revenues or profits
  • Cost reduction achieved through automation
  • Time saved in performing a specific task
  • Improvement in customer satisfaction ratings

Output metrics

These metrics focus on the technical performance and effectiveness of the ML model or system being developed. They provide insights into how well the model is performing and whether it meets the desired criteria. Output metrics are internal and help guide the development and improvement of the model throughout the project. They are typically defined based on the requirements of the desired outcome metrics. The measurement of output metrics is important for evaluating the performance of the model, making improvements, and ensuring that it aligns with the desired business outcomes.

Examples of output metrics:

  • Mean squared error (MSE) for regression tasks
  • Accuracy, precision, recall, or F1 score for classification tasks
  • The area under the receiver operating characteristic (ROC) curve
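
For readers who want to see how these output metrics are computed, the sketch below implements them in plain Python. The labels, predictions, and scores are made-up illustrative values, and the function names are assumptions; in practice teams usually take these metrics from an established ML library rather than writing them by hand.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between actual and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def roc_auc(y_true, scores):
    """Area under the ROC curve, computed as the share of
    (positive, negative) pairs in which the positive example gets the
    higher score; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical regression example.
mse = mean_squared_error([3.0, 5.0, 2.0], [2.0, 5.0, 4.0])

# Hypothetical classification example: scores thresholded at 0.5.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.4, 0.8, 0.1, 0.6, 0.7, 0.3]
predictions = [1 if s >= 0.5 else 0 for s in scores]

metrics = classification_metrics(labels, predictions)
auc = roc_auc(labels, scores)
print(mse, metrics, auc)
```

Note that accuracy, precision, recall, and F1 depend on the chosen threshold (0.5 here), while the area under the ROC curve summarizes performance across all thresholds.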

Tracking metrics

Throughout an ML project, the tracking of metrics depends on the phase and purpose of the evaluation. Here’s a breakdown of when and how these metrics are tracked:

Tracking output metrics

  • Training and validation phase: output metrics are tracked during the training and validation phase of the project. This involves comparing different models, evaluating various algorithms, and tuning hyperparameters. Output metrics are used to assess and select the best-performing model.
  • Testing phase: output metrics are further evaluated using a dedicated test set to assess the model’s performance before deployment. This helps validate the model’s effectiveness and ensure it meets the desired performance criteria.
  • Post-deployment monitoring: output metrics continue to be monitored even after the model is deployed with customers. This ongoing tracking helps identify any degradation in performance over time or the need for adjustments due to changes in the external environment.
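
Post-deployment monitoring can be sketched as a rolling check of a live output metric against the level measured before launch. The class name PerformanceMonitor, the window size, and the tolerance below are hypothetical choices for illustration; production systems typically use dedicated monitoring tooling and also track shifts in the input data.

```python
from collections import deque


class PerformanceMonitor:
    """Tracks accuracy over the most recent predictions and flags a drop
    below the pre-deployment baseline (minus a tolerance)."""

    def __init__(self, baseline_accuracy, tolerance=0.05, window=100):
        self.baseline = baseline_accuracy
        self.tolerance = tolerance
        # deque with maxlen drops the oldest outcome automatically.
        self.outcomes = deque(maxlen=window)

    def record(self, prediction, actual):
        """Store whether a prediction matched the observed outcome."""
        self.outcomes.append(prediction == actual)

    def rolling_accuracy(self):
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def has_degraded(self):
        accuracy = self.rolling_accuracy()
        return accuracy is not None and accuracy < self.baseline - self.tolerance


monitor = PerformanceMonitor(baseline_accuracy=0.90, tolerance=0.05, window=10)

# Hypothetical stream: one miss followed by nine correct predictions.
for i in range(10):
    monitor.record(prediction=1, actual=0 if i == 0 else 1)
degraded_before = monitor.has_degraded()

# Three further misses push older outcomes out of the 10-item window.
for _ in range(3):
    monitor.record(prediction=1, actual=0)
degraded_after = monitor.has_degraded()

print(degraded_before, degraded_after, monitor.rolling_accuracy())
```

A check like this only works when the actual outcomes eventually become available, so the team needs a way to collect ground-truth labels after deployment.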

Tracking outcome metrics

  • Model and product development: outcome metrics are typically measured during the creation of the model and development of the product. The focus is on assessing whether the desired business impact can be achieved for the customer.
  • Hindsight scenario testing: historical data is used to simulate scenarios where the customer could have used the product. By applying the model retrospectively, the outcome metrics are evaluated to determine the potential impact of the product on past scenarios.
  • A/B testing: in collaboration with early customers, A/B testing is conducted to compare the impact of using the product against their usual operations. This helps quantify the difference in outcomes and evaluate the effectiveness of the product.
  • Beta testing: before full commercial deployment, a beta testing period is conducted with a select group of early adopter customers. This allows for close collaboration, gathering feedback, and monitoring the achieved outcomes. It ensures that the model not only performs well but also delivers the desired business outcomes for the customer.
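
The A/B comparison described above can be sketched as a simple difference-in-means check on an outcome metric such as time spent per task. This is a rough illustration under stated assumptions: the function name and the sample numbers are hypothetical, and the p-value uses a normal approximation that is only reasonable for larger samples; a real analysis would choose a test suited to the data.

```python
from math import sqrt
from statistics import NormalDist, mean, variance
from typing import Dict, Sequence


def compare_outcomes(control: Sequence[float], treatment: Sequence[float]) -> Dict[str, float]:
    """Difference in mean outcome between groups, with an approximate two-sided p-value."""
    difference = mean(treatment) - mean(control)
    # Standard error of the difference, allowing unequal variances between groups.
    standard_error = sqrt(variance(control) / len(control) + variance(treatment) / len(treatment))
    if standard_error == 0:
        raise ValueError("Both groups have zero variance; the comparison is undefined.")
    z = difference / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # normal approximation
    return {"difference": difference, "z": z, "p_value": p_value}


# Hypothetical minutes per task: usual operations vs. using the product.
usual_operations = [30, 32, 29, 31, 33, 30, 28, 32]
with_product = [25, 27, 24, 26, 28, 25, 23, 27]

result = compare_outcomes(usual_operations, with_product)
print(result)
```

A negative difference here would mean the product group finished tasks faster; whether that gap is large enough to matter is a business judgment tied to the outcome metric, not only a statistical one.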

Non-performance considerations

In addition to performance metrics, there are important non-performance considerations in an ML project to keep in mind. By considering these non-performance factors, ML teams can make informed decisions about model selection, data requirements, interpretability needs, and the overall feasibility of applying ML to a particular project. This helps ensure that the chosen approach aligns with the project’s objectives, constraints, and potential risks.

Some of these considerations include:

  • Explainability or interpretability: models that are explainable or interpretable 1) make it easier to debug and to identify biases within the datasets or the model itself, and 2) are crucial in fault-intolerant projects, where the consequences of incorrect outputs are severe, because they help the team understand the model’s decision-making process and ensure fairness, accountability, and compliance with regulations. In fault-tolerant projects, such as movie recommendations, minor errors might not have significant consequences, while fault-intolerant projects, such as evaluating graduate school applicants, require careful consideration.
  • Cost considerations: project managers need to ensure that the data and computational costs of ML projects align with the project’s budget and infrastructure capabilities. 1) Data costs: sourcing and storing data can be expensive, so it’s important to carefully consider the data requirements, including the type, volume, and storage duration, to optimize costs. 2) Computational costs: training and retraining models, as well as running inference, have computational requirements that need to be budgeted for.
  • Suitability of machine learning: project teams need to assess whether ML is the right tool for the job. In some cases, alternative approaches such as rule-based systems or traditional statistical methods may be more appropriate and cost-effective. When deciding, consider the complexity and characteristics of the problem, the availability of data, interpretability requirements, and potential trade-offs between accuracy and explainability.
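
One hedged way to test the suitability question is to compare a candidate model against a simple rule-based baseline on the same held-out data, and to treat ML as justified only if it wins by a meaningful margin. Everything in the sketch below is hypothetical: the function names, the 5-point minimum gain, the fraud-style rule, and the stand-in for a trained model.

```python
from typing import Callable, Sequence


def accuracy(predict: Callable[[float], int], inputs: Sequence[float], labels: Sequence[int]) -> float:
    """Share of inputs for which the predictor returns the correct label."""
    correct = sum(1 for x, y in zip(inputs, labels) if predict(x) == y)
    return correct / len(labels)


def ml_is_justified(
    model_predict: Callable[[float], int],
    baseline_predict: Callable[[float], int],
    inputs: Sequence[float],
    labels: Sequence[int],
    min_gain: float = 0.05,
) -> bool:
    """True when the model beats the baseline by at least min_gain in accuracy."""
    gain = accuracy(model_predict, inputs, labels) - accuracy(baseline_predict, inputs, labels)
    return gain >= min_gain


# Hypothetical held-out data: transaction amounts and whether each was flagged (1) or not (0).
amounts = [200, 1500, 50, 3000, 900, 1200]
flagged = [0, 1, 0, 1, 1, 0]


def rule_based(amount: float) -> int:
    """Baseline rule: flag any transaction above 1000."""
    return 1 if amount > 1000 else 0


# Stand-in for a trained model: a lookup of its predictions on the held-out inputs.
model_outputs = {200: 0, 1500: 1, 50: 0, 3000: 1, 900: 1, 1200: 1}

print(ml_is_justified(model_outputs.get, rule_based, amounts, flagged))
```

The margin is a product decision: it should reflect the extra data, computation, and explainability costs that the ML approach brings compared with the simpler alternative.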

The ML mindset

ML projects are inherently iterative and non-linear. While having a process such as CRISP-DM provides a structured framework, it’s crucial to understand that the steps are not necessarily followed in a strict linear sequence.

During ML projects, it’s common to iterate and refine throughout the process based on the learnings from experiments and feedback. Some key aspects of organizing and managing ML projects include:

  • Employ an iterative approach: ML projects often involve conducting numerous small-scale experiments to validate hypotheses, evaluate models, and make adjustments. Each experiment provides insights and learnings that inform subsequent steps and help refine the project direction.
  • Develop feedback loops: gathering feedback from stakeholders, including customers and end-users, is essential to ensure the project stays on the right track. Regularly seeking feedback, analyzing it, and incorporating it into the process allows for course corrections and adjustments as needed.
  • Learn and adjust: the learnings from experiments and feedback should be carefully analyzed and considered. If necessary, adjustments to the project plan, data preparation, model selection, or feature engineering may be required. It’s perfectly acceptable to go back and revisit previous steps to incorporate new insights and improve the overall outcome.
  • Have flexibility and adaptability: ML projects often involve dealing with uncertainty and technical risks. It’s important to maintain a flexible mindset and be prepared to adapt to new information, unexpected challenges, or changes in requirements. This adaptability enables the team to make informed decisions and pivot if necessary.

By embracing an iterative and adaptive approach, ML teams can effectively navigate the uncertainties and complexities inherent in working with data and models. This approach allows for continuous learning and improvement, and increases the chances of achieving the desired outcomes.
