that most AI pilot projects fail — not because of technical shortcomings, but due to challenges in aligning new technology with existing organizational structures. While implementing AI models may seem straightforward, the real obstacles often lie in integrating these solutions with the organization’s people, processes, and products. This concept, commonly referred to as the “3P” pillars of project management, provides a practical lens for assessing AI readiness.
In this article, I introduce a framework to help teams evaluate and prioritize AI initiatives by asking targeted, context-specific questions across these three pillars, ensuring risks are identified and managed before implementation begins.
Whether you approach AI decision-making from the technical or the business side, the concepts outlined in this article are designed to cover both perspectives.
The challenge of implementing AI use cases
Imagine being presented with a list of over 100 potential AI use cases from across a global enterprise. The list breaks down into a variety of specific departmental requests that the development team is expected to deliver.
The marketing department wants a customer-facing chatbot. Finance wants to automate invoice processing. HR is asking for a tool to summarize thousands of resumes. Each request comes with a different sponsor, a different level of technical detail, and a different sense of urgency, often driven by pressure to deliver visible AI wins as soon as possible.

In this scenario, imagine that the delivery team decides to begin with what appears to be the quickest win and greenlights the marketing chatbot. But, after initial momentum, the problems start.
First are the people problems. For example, the marketing chatbot stalls as two teams in the department can’t agree on who is responsible for it, freezing development.
After this issue is solved, process issues arise. For example, the chatbot needs live customer data, but getting approval from the legal and compliance teams takes months, and no one is available for additional “admin” work.
Even when this gets resolved, the product itself hits a wall. For example, the team discovers the “quick win” chatbot can’t easily integrate with the company’s essential backend systems, leaving it unable to deliver real value to customers until this issue is sorted.
Finally, after more than six months, budgets are exhausted, stakeholders are disappointed, and the initial excitement around AI has worn off. Fortunately, this outcome is precisely what the AI-3P framework is designed to prevent.
Before diving into the framework concept, let’s first look at what recent research reveals about why AI endeavors go off track.
Why do AI initiatives derail?
Enthusiasm around AI — or more precisely, generative AI — continues to grow by the day, and so we read numerous stories about these project initiatives. Not all of them end with a positive outcome. Reflecting this reality, a recent MIT study from July 2025 prompted a headline in Fortune magazine stating that “95% of generative AI pilots at companies are failing.”
The part of the report most relevant to our purpose concerns the reasons why these initiatives fail. To quote the Fortune article:
The biggest problem, the report found, was not that the AI models weren’t capable enough (although execs tended to think that was the problem.) Instead, the researchers discovered a “learning gap” — people and organizations simply did not understand how to use the AI tools properly or how to design workflows that could capture the benefits of AI while minimizing downside risks.
…
The report also found that companies which bought-in AI models and solutions were more successful than enterprises that tried to build their own systems. Purchasing AI tools succeeded 67% of the time, while internal builds panned out only one-third as often.
…
The overall thrust of the MIT report was that the problem was not the tech. It was how companies were using the tech.
…
With these reasons in mind, I want to emphasize the importance of better understanding risks before implementing AI use cases.
In other words, if most AI endeavors don’t fail because of the models themselves, but because of issues around ownership, workflows, or change management, then we have pre-work to do when evaluating new initiatives. To do that, we can adapt the classic business pillars for technology adoption, people and processes, and extend them with a focus on the end product.
This thinking has led me to develop a practical scorecard around these three pillars for AI pre-development decisions: AI-3P with BYOQ (Bring Your Own Questions).
The overall idea of the framework is to prioritize AI use cases by providing your own context-specific questions that aim to qualify your AI opportunities and make risks visible before the hands-on implementation starts.
Let’s start by explaining the core of the framework.
Scoring BYOQ per 3P
As indicated earlier, the framework concept is based on reviewing each potential AI use case against the three pillars that determine success: people, process, and product.
For each pillar, we provide example BYOQs, grouped by category, that can be used to assess a specific AI request before implementation.
Questions are formulated so that the possible answer-score combinations are “No/Unknown” (= 0), “Partial” (= 1), and “Yes/Not applicable” (= 2).
After assigning scores to each question, we sum the total score for each pillar, and this number is used later in the weighted AI-3P readiness equation.
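To make the mechanics concrete, here is a minimal Python sketch of how the raw score for one pillar could be tallied; the question texts and answers are hypothetical placeholders rather than part of the framework itself.

```python
# Hypothetical scoring sketch: map BYOQ answers to points and sum them per pillar.
ANSWER_POINTS = {"no/unknown": 0, "partial": 1, "yes/na": 2}

people_byoq = {
    "Is there an accountable business owner (sponsor)?": "yes/na",
    "Does the delivery team have MLOps experience?": "partial",
    "Is there a change-management plan for end-user adoption?": "no/unknown",
}

def pillar_raw_score(answers: dict[str, str]) -> tuple[int, int]:
    """Return (actual score, maximum possible score) for one pillar."""
    actual = sum(ANSWER_POINTS[answer] for answer in answers.values())
    maximum = 2 * len(answers)  # each question can contribute at most 2 points
    return actual, maximum

print(pillar_raw_score(people_byoq))  # -> (3, 6)
```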
With this premise in mind, let’s break down how to think about each pillar.

People

Before we start to consider models and code, we should ensure that the “human element” is ready for an AI initiative.
This means confirming there is business buy-in (sponsorship) and an accountable owner who can champion the project through its inevitable hurdles. Success also depends on an honest assessment of the delivery team’s skills in areas such as machine learning operations (MLOps). But beyond these technical skills, AI initiatives can easily fail without a thoughtful plan for end-user adoption, making change management a non-negotiable part of the equation.
That’s why the objective of this pillar’s BYOQ is to confirm that ownership, capability, and an adoption plan exist before the build phase starts.
We can then group and score questions in the People pillar as follows:

Once we are confident that we have asked the right questions and assigned each a score from 0 to 2 (No/Unknown = 0, Partial = 1, Yes/Not applicable = 2), the next step is to check how the idea aligns with the organization’s daily operations, which brings us to the second pillar.

Processes

The Processes pillar is about ensuring the AI use case solution fits into the operational fabric of our organization.
Common project stoppers, such as regulations and the internal qualification process for new technologies, are included here, and questions related to Day 2 operations that support product resiliency are evaluated as well.
The BYOQs in this pillar are therefore designed to surface risks in governance, compliance, and provisioning paths.

By finalizing the scores for this pillar and gaining a clear understanding of the status of operational guardrails, we can then discuss the product itself.

Product

Here is where we challenge our technical assumptions, ensuring they are grounded in the realities of our People and Processes pillars.
This begins with the fundamental “problem-to-tech” fit: determining the type of AI use case and whether to build a custom solution or buy an existing one. We also evaluate the stability, maturity, and scalability of the underlying platform, along with questions concerning the end-user experience and the overall economic fit.
As a result, the questions for this pillar are designed to test the technical choices, the end-user experience, and the solution’s financial viability.

Now that we’ve examined the who, the how, and the what, it’s time to bring it all together and turn these concepts into an actionable decision.
Bringing 3P together
After consolidating the scores across the three pillars, the “ready/partially ready/not ready” decision is made. For a specific AI request, the final table looks like this:

As we can see from Table 4, the core logic of the framework lies in transforming qualitative answers into a quantitative AI readiness score.
To recap, here’s how the step-by-step approach works:
Step 1: We calculate a raw score, i.e., the actual score per pillar, by answering a list of custom questions (BYOQs). Each answer gets a value:
- No/Unknown = 0 points. This is a red flag or a significant unknown.
- Partial = 1 point. There’s some progress, but it’s not fully resolved.
- Yes/Not applicable = 2 points. The requirement is met, or it isn’t relevant to this use case.
Step 2: We assign a specific weight to each pillar’s total score. In the example above, based on the findings from the MIT study, the weighting is deliberately biased toward the People pillar, and the assigned weights are: 40 percent people, 35 percent processes, and 25 percent product.
After assigning weights, we calculate the weighted score per pillar in the following way:
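The exact formula ships with the Excel template in the repo linked at the end of this article; as a sketch of one consistent formulation, assuming each pillar’s actual score is normalized by its maximum possible score and the weights are expressed in percentage points, it can be written as:

$$
\text{Weighted score}_p \;=\; \frac{\text{Actual score}_p}{\text{Maximum score}_p} \times w_p,
\qquad w_{\text{people}} = 40,\; w_{\text{processes}} = 35,\; w_{\text{product}} = 25
$$

Here, the maximum score of a pillar is twice the number of questions it contains, since each question is worth at most 2 points.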

Step 3: We sum the weighted scores to get the AI-3P Readiness score, a number from 0 to 100. This score places each AI initiative into one of three actionable tiers:
- 80–100: Build now. That’s a green light. This implies the key elements are in place, the risks are understood, and implementation can proceed following standard project guardrails.
- 60–79: Pilot with guardrails. Proceed with caution. In other words, the idea has merit, but some gaps could derail the project. The recommendation here would be to fix the top three to five risks and then launch a time-boxed pilot to learn more about use case feasibility before committing fully.
- 0–59: De-risk first. Stop and fix the identified gaps, which indicate high failure risk for the evaluated AI initiative.
In summary, the decision follows from the AI-3P Readiness formula:
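The readiness score itself, again as a sketch consistent with the weighting above (the exact formula is part of the linked template), is simply the sum of the weighted pillar scores:

$$
\text{AI-3P Readiness} \;=\; \sum_{p \,\in\, \{\text{people},\,\text{processes},\,\text{product}\}} \text{Weighted score}_p
$$

For illustration only, with invented pillar scores: a use case scoring 30 of 40 possible points on People, 20 of 30 on Processes, and 16 of 20 on Product would come in at 0.75 × 40 + 0.67 × 35 + 0.80 × 25 ≈ 73, landing in the “pilot with guardrails” tier.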

That’s the process for scoring an individual AI request, with a focus on custom-built questions around people, processes, and product.
But what if we have a portfolio of AI requests? A straightforward way to apply the framework at the organizational level is as follows:
- Create an inventory of AI use cases. Start by gathering all the proposed AI initiatives from across the business. Cluster them by department (marketing, finance, and so on), user journey, or business impact to spot overlaps and dependencies.
- Score individual AI requests with the team on a set of pre-provided questions. Bring the product owners, tech leads, data owners, champions, and risk/compliance owners (and other responsible individuals) into the same room. Score each AI request together as a team using the BYOQ.
- Sort all evaluated use cases by AI-3P score. Once the cumulative score per pillar and the weighted AI-3P Readiness measure are calculated for every AI use case, rank all the AI initiatives. This results in an objective, risk-adjusted priority list. Lastly, take the top n use cases that have cleared the threshold for a full build and conduct an additional risk-benefit check before investing resources in them (see the ranking sketch after this list).
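As a rough sketch of that ranking step (the use case names and readiness scores below are invented purely for illustration), the sorting and tiering could look like this:

```python
# Hypothetical portfolio ranking: sort use cases by AI-3P Readiness and bucket them into tiers.
use_cases = {
    "Marketing chatbot": 58.0,
    "Invoice automation": 82.5,
    "Resume summarization": 71.0,
}

def tier(score: float) -> str:
    """Map a 0-100 readiness score to one of the three decision tiers."""
    if score >= 80:
        return "Build now"
    if score >= 60:
        return "Pilot with guardrails"
    return "De-risk first"

for name, score in sorted(use_cases.items(), key=lambda item: item[1], reverse=True):
    print(f"{score:5.1f}  {tier(score):22}  {name}")
```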

Now let’s look at some important details about how to use this framework effectively.
Customizing the framework
In this section, I share some notes on what to consider when personalizing the AI-3P framework.
First, although the “Bring Your Own Questions” logic is built for flexibility, it still requires standardization. It’s important to create a fixed list of questions before starting to use the framework so that every AI use case has a “fair shot” in evaluation over different time periods.
Second, within the framework, a “Not applicable” (NA) answer scores 2 points per question (the same as a “Yes” answer), treating it as a non-issue for that use case. While this simplifies the calculation, it’s important to track the total number of NA answers for a given project. Although in theory a high number of NAs can indicate a lower-complexity project, in practice it may simply mean the assessment is sidestepping real implementation hurdles. It would be prudent to report an NA ratio per pillar and to cap the NA contribution at perhaps 25 percent of a pillar’s maximum to prevent “green” scores built on non-applicables.
A similar caution applies to “Unknown” answers (score 0), which represent a complete blind spot; a use case should arguably be flagged for the “de-risk first” tier if knowledge is missing in categories such as “Ownership,” “Compliance,” or “Budget.”
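To make the NA safeguard concrete, here is one possible implementation of it, based on my own interpretation rather than the original scorecard, which assumes the combined “Yes/Not applicable” option is split so that NA answers can be tracked separately:

```python
# Possible NA safeguard: track the NA ratio per pillar and cap the points earned from NA answers.
def capped_pillar_score(answers: dict[str, str], na_cap: float = 0.25) -> tuple[float, float]:
    """Return (capped actual score, NA ratio) for one pillar's answers."""
    points = {"no/unknown": 0, "partial": 1, "yes": 2}
    max_score = 2 * len(answers)
    na_count = sum(1 for answer in answers.values() if answer == "not applicable")
    na_ratio = na_count / len(answers)
    non_na_points = sum(points[answer] for answer in answers.values() if answer != "not applicable")
    # NA answers would normally score 2 points each; cap their total contribution.
    capped_na_points = min(2 * na_count, na_cap * max_score)
    return non_na_points + capped_na_points, na_ratio
```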
Third, the pillar weights (in the example above: 40 percent people, 35 percent processes, 25 percent product) should be viewed as adjustable parameters that can be industry or organization specific. For instance, in heavily regulated industries like finance, the Processes pillar might carry more weight due to stringent compliance requirements. In this case, one might consider adjusting the weighting to 35 percent people / 45 percent processes / 20 percent product.
The same flexibility applies to the decision tiers (80–100, 60–79, 0–59). An organization with a high risk tolerance might lower the “build now” threshold to 75, whereas a more conservative one might raise it to 85. For this reason, it’s important to agree on the scoring logic before evaluating the AI use cases.
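One simple way to keep these choices explicit, shown here only as a suggestion that restates the default values discussed above, is to capture the agreed weights and thresholds in a small configuration:

```python
# Agreed-upon scoring configuration; adjust the weights and thresholds per organization.
SCORING_CONFIG = {
    "weights": {"people": 40, "processes": 35, "product": 25},  # percentage points; must sum to 100
    "tiers": {"build_now": 80, "pilot_with_guardrails": 60},    # lower bound of each tier
}
```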
Once these elements are in place, you have everything needed to begin assessing your AI use case(s).
Thank you for reading. I hope this article helps you navigate the pressure for “quick AI wins” by providing a practical tool to identify the initiatives that are ready for success.
I’m keen to learn from your experiences with the framework, so feel free to connect and share your feedback on my Medium or LinkedIn profiles.
The resources (tables with formulas) included in this article are in the GitHub repo CassandraOfTroy/ai-3p-framework-template: an Excel template to implement the AI-3P Framework for assessing and de-risking AI projects before deployment.
Acknowledgments
This article was originally published on the Data Science at Microsoft Medium publication.
The BYOQ concept was inspired by my discussions with Microsoft colleagues Evgeny Minkevich and Sasa Juratovic. The AI‑3P scorecard idea is influenced by the MEDDIC methodology introduced to me by Microsoft colleague Dmitriy Nekrasov.
Special thanks to Casey Doyle and Ben Huberman for providing editorial reviews and helping to refine the clarity and structure of this article.