Unfolding the universe of possibilities..

Navigating the waves of the web ocean

Crossing the AI Chasm: How OpenAI turned LLMs into a mainstream success

Crossing the AI Chasm: How OpenAI Turned LLMs into a Mainstream Success

And why LLMOps will suffer the same fate as MLOps

I’ve been a vocal skeptic about the viability of ML developer tooling (broadly categorized as MLOps) as standalone businesses and, with very few exceptions, I’ve been proven right. The lack of a dominant design has led to fragmented “micro-markets” with very little value capture, mostly because of open source alternatives and cloud vendors giving their ML tools away for free (to collect revenue on the infrastructure layer). So what led LLMs to blow right past these problems, receive breakout media attention, and achieve real widespread adoption? And what is going to happen to all of the startups throwing the MLOps playbook at LLMs, rebranding as LLMOps?

In this post I’ll use the “diffusion of innovation” theory as well as the concept of “crossing the chasm” in an effort to explain my bullish expectations of LLM providers like OpenAI or Anthropic, and my bearish view on the attempt to resurrect MLOps as LLMOps.

The adoption of innovations and the chasm

According to Everett Rogers’ “Diffusion of Innovations”, innovative products are adopted progressively by different groups of adopters with distinct traits. Innovators, who are willing to take risks and have a high tolerance for failure, are the first to try out a new product. Laggards, who have an aversion to change, are the last. The famous bell-curve shaped graph shows the percentage of adopters in each category, and the corresponding graph of cumulative adoption resembles the familiar “S curve” pattern of an innovation’s market share over time.

Image by author (modified from source)

The basic idea is that each group is influenced by signals and behaviors of the preceding groups, relying on social proof to inform their decision to adopt a new product. This is a well understood and empirically documented phenomenon, observed in anything from window AC units to iPhones.

The “chasm” is a concept popularized by Geoffrey A. Moore’s “Crossing the Chasm” that builds on Rogers’ theory. Moore argues that the differences between the early and the mainstream markets are too large and that most products die trying to bridge that “chasm”, which is a fairly common failure mode in tech startups.

Image by author (modified from source)

Although Rogers criticized the concept of the chasm by saying that the diffusion of innovation is a “social process” with “no sharp breaks or discontinuities between adjacent adopter categories”, it should be obvious that many products fail to reach the mainstream because they never make it past the Innovator group.

Moore provides several suggestions on how to bridge the chasm that I only partially agree with. One observation is that, in his own words, his book primarily treats the chasm “as a market development problem” and focuses “on marketing strategies and tactics for crossing it”. He does cover the idea of “whole product management”, but based on his reading of Theodore Levitt’s “The Marketing Imagination” that concept is limited to bridging the gap between marketing message and product truth with “services and ancillary products”. He does not address the actual evolution of the product. In fact, the innovation (aka the core product) is treated as a constant.

Taking the specific attributes of software (particularly developer tools) into account, I propose two strategies (“evolve” and “skip”) for avoiding the chasm and hypothesize how their application helped fuel the rapid rise of LLMs.

Two product-centric ways of avoiding the chasm

Evolve (simplify) your developer tools over time

The limitation that the product is a constant, while all other aspects of the “whole product” (like messaging, distribution, pricing) change to appeal to different adoption groups, is mostly motivated by physical products. If you’re in the business of producing and selling widgets, changing your supply chain or retooling your factories is not a trivial thing to do. However, this is an entirely different story with products that are exclusively software. Not evolving your software product is almost always a recipe for failure.

The need to evolve should be obvious based on how most software startups start out these days. More often than not, developer tools (especially in AI) are born and nurtured amidst a strong and devoted user base of experts in a specific field. It may not come as a surprise that these early users are usually Innovators and, as such, are not representative of the broader market. It is far too easy for founders to spend all of their time and energy on this segment and tweak their products based on their feedback. Unfortunately, commercial success is rarely found in those first groups. Innovators are very sophisticated and often prefer to build vs. buy. Even if they decided to buy they wouldn’t represent a big enough market.

One solution to this problem is to evolve the product over time for different target audiences. With well-designed developer tools this means introducing new layers of abstractions and/or supporting more widely used languages. To use an example from my previous employer, the ongoing success of Spark is (at least in my opinion) partially due to the fact that the product surface has continuously been simplified to attract a wider range of users (dare I say the Early Majority?). Spark started out with RDDs (Resilient Distributed Datasets) and Scala as its main programming language. Then it expanded language support to Python with PySpark (opening up to a broader set of software engineers) and introduced simpler APIs like the DataFrame, as well as SparkSQL (opening up to SQL analysts). More recently, Spark added a Pandas-compatible API (opening up to Data Scientists) and even introduced an “English SDK” using LLMs (opening up to, well, anyone who knows English). If Spark had not evolved in this manner it would have been stuck in the Innovator segment of experts that know how to write intricate MapReduce programs in Scala.

Image by author

This strategy seems somewhat obvious but not many technology products (especially in developer tooling) get this right. They sometimes “simplify” the product by removing some knobs but fail to introduce new layers of abstraction that are not leaky.

Skip the chasm entirely

Another approach, which is less common in developer tools, is to skip the chasm entirely. The idea is deceivingly simple: If success in the early market doesn’t automatically translate to success in the mainstream market, why not directly target the early majority?

As mentioned before, this is more important in hardware where iterations on a product are slower, more costly, and as a result the core product can’t evolve as easily. The iPhone is a great example of a product that frequently gets criticized by Innovators (even as recent as the iPhone 15 and its “disappointing” USB-C port) but achieved rapid success with the Early Majority who didn’t care about these technical details. In fact, Apple repeatedly teaches the industry a masterclass on this strategy with their messaging. Probably the most famous example is the “1,000 songs in your pocket” campaign, which was targeted towards the Early Majority, not Innovators who care about technical specifications.

Image by author

This seems unnatural to many tech startups (especially those focusing on developer tooling) because it’s just too easy to achieve early success with innovators and early adopters. AI developer tools start out in the early market almost by definition, since they are usually built by and for advanced AI researchers or ML engineers. The practice of “proving product<>market fit” as measured by GitHub stars by open sourcing a project just reinforces this.

Common failed strategies in commercializing open source projects

I’ve seen enough “open source project turned startup” to have at least some level of “pattern recognition” for common failure modes. These startups find early success (and funding) when they experience growing adoption as measured by GitHub stars or PyPI downloads. Then they tragically follow similar paths, sometimes even if there is an experienced founder who “has done it before” (because they don’t actually understand why their previous companies succeeded).

Image by author

Upsell Innovators: Intuitively (or naively?), most startups first attempt to upsell Innovators with a “managed” version of the open source product. This strategy usually falls flat because early Innovators, by definition, are very sophisticated and prefer to build vs. buy. The generic 3S strategy (managed OSS + stability, scalability, security) is not sufficient for this audience to justify writing a check, since they already know how to build and run services themselves. Innovators also fear “vendor lock-in” and losing their ability to innovate independently.

Product Market Mismatch: The next attempt is to sell the same “Managed OSS” product to the Early Majority. That usually fails because the core offering is still the same hard-to-use product that has been optimized for Innovators. Just adding 3S is not sufficient to incentivize the Early Majority to upskill (like the plans to train up millions of ML engineers to force the MLOps market into existence). If that wasn’t enough, the final nail in the coffin is that no one beats AWS at this game (which is also the reason why more and more infrastructure open source projects switch to non-commercial licenses).

“Whole Product”: I call this strategy “whole product” sarcastically, because this term has been misused to fill fundamental product gaps by suboptimal means. This attempt usually follows the realization that the core product is too hard to use for a larger market, and the solution commonly involves “throwing humans at the problem”. This leads to a high service component in a startup’s revenue structure (which no investor likes to see) and bloated delivery organizations. To be fair, some amount of this is necessary, particularly in the enterprise segment or federal. But, more often than not, the startup starts looking like a tech consulting company.

A hybrid approach for developer tools

The strategy I am proposing is a hybrid approach that still allows for rapid iteration with a devoted user base of Innovators but acknowledges the fundamental differences in the early and mainstream markets by explicitly focusing on the Early Majority in product definition.

Proving out early success with Innovators through open source doesn’t have to be at odds with finding commercial viability with the Early Majority if you recognize that they require different products. Specifically, I suggest to:

Use your open source project to gain popularity with InnovatorsUse that popularity to raise moneyUse the Innovator group to find out how they are creating downstream value and for whomTarget your mainstream product to that audience

This is where the Diffusion of Innovation for software is different from consumer hardware like iPhones: The key insight is that, in the software value chain, Innovators are often the middlemen (middlepersons?) to the Early Majority. Put differently, Innovators themselves are not the end of the value chain. They consume technology to help product/business teams create value. Sometimes that takes the shape of a “Center of Excellence” or a centralized “Innovation Team”. The goal of a tech startup should be to learn who sits in the value chain after those Innovators, which is where they will find the key to the Early Majority. Critically, I am not saying that you should try to disintermediate those Innovators in organizations where they exist, because that usually leads to a political backlash. In those cases you need to make them your “champions”.

The goal of a tech startup should be to learn who sits in the value chain after those Innovators, which is where they will find the key to the Early Majority.

The main implication of the “skip” strategy is to make an explicit decision during product definition to address the Early Majority. Note that this is different from the “evolve” strategy in that the “mainstream product” may not simply be an easier version of your original product, but may take an entirely different shape. The two extremes of this different shape are:

A higher level of abstraction than the original OSS project, in a different form factor. Although imperfect, Databricks provides another example for this. The breakout product that led to initial interest outside of the Innovators group was not just “managed Spark” but a managed Notebook product for Data Scientists and Engineers (which, at that point in time, was quite novel). Databricks continues to follow the same strategy today with products like Databricks SQL.A more focused verticalized product higher in the value chain. Stripe is a great example as they originally started out with an open source payment processing library and then found success with products like Checkout (a full payment form for websites) or Terminal (point-of-sales checkout terminals).

How MLOps failed to evolve and LLMs skipped the chasm

MLOps got stuck in the early market

A similar story to the one I shared about the evolution of Spark cannot be told about ML. The MLOps stack looks pretty much the same as it did a few years ago, and the hope in the market is that more and more engineers will learn how to use it.

Without reminiscing about how we got here, let me just briefly summarize my opinion on the state of the MLOps market:

The MLOps market has not converged on a “dominant design” and, as a result, every “MLOps platform” you will find is different in both obvious and subtle ways.On a systems level, the MLOps market hasn’t produced a simpler “form factor” or levels of abstraction, so it is still prohibitively complex and requires several specialized roles (Data Engineers, Data Scientists, ML Engineers, etc.) that are only prevalent in the most advanced tech companies.The audience who can consume this technology, namely Innovators and Early Adopters, prefer to live on the cutting edge and use open source tools instead of paying a vendor.The audience who would be willing to pay a vendor usually just defaults to what the main cloud service providers are offering. Cloud providers are giving the “ML Platform” layer away for free and are content with collecting revenue on storage and compute.Since cloud vendors haven’t monetized MLOps explicitly, the value capture in this market has been minimal.

In summary, MLOps has fallen into the chasm and there’s no sign of it reemerging on the other side.

The Early Majority Appeal of LLMs

Enter LLMs. OpenAI reportedly passed $1.3B in ARR and is expected to keep growing at a rapid pace, which can’t be said about MLOps startups. In fact, you could probably add up the top 10 MLOps startups’ revenue and not even get close. Remember that most cloud providers don’t actually monetize this layer outside of charging for compute and storage, so their “ML” revenue doesn’t really count (unless you want to get into the business of providing cloud infrastructure at commodity prices).

This begs the question, why did LLMs achieve such mainstream success so quickly? I’d argue that they successfully skipped the chasm in two very different segments.

Skipping the chasm in the developer segment

Traditional “discriminative” ML models are trained for very specific tasks, like predicting the quality of a sales lead or ranking a list of products. For each one of those tasks, several experts have to work together to write data pipelines, collect labels, refine so-called “features”, train and fine-tune models, evaluate them, deploy them, monitor their performance, and then retrain them periodically. Both the need to repeat this for every task, as well as the amount of expertise required to pull it off, meant that this miracle was attainable only by a select few.

“Generative” language models, on the other hand, “just work” for a wide variety of use cases, enabling anyone who can make an API call to apply AI to their product or problem. Almost overnight, LLMs solved the talent shortage in the “applied AI” space by giving every software engineer AI superpowers. Critically, the same LLM could generate poems, write code, translate natural language questions into SQL queries, or pass a wide variety of standardized tests. This works either “just out of the box” (zero-shot), or simply by giving the model a few examples of the problem you want to solve (few-shot) and extends to a wide variety of modalities, not just text.

Image by author

This is the very definition of skipping the chasm and going straight for the Early Majority.

It’s also why LLMOps is bound to repeat history. I guess if you have a hammer, everything looks like a nail. Inevitably a cottage industry emerged around the idea that everyone needs to train and fine-tune their own LLMs, which is missing the whole point of why LLMs have been so successful in the first place. Adding back the complexity of writing your data pipelines, training and fine-tuning your own models, deploying them, etc., puts you back into the micro-market of Innovators who prefer to build instead of buy.

Note that I am not saying that no one should be fine-tuning and deploying their own LLMs. Under some very specific circumstances (which are few) it does make sense to do so. But in almost all of those circumstances you will find yourself in the Innovators and Early Adopters groups, and those groups will just use open source tools and not pay a vendor for the benefit.

Skipping the chasm in the consumer segment

OpenAI plays in two very different segments: The developer segment discussed above is served with APIs and dedicated compute capacity. ChatGPT and its mobile Apps, on the other hand, are very much “consumer” products. ChatGPT is famously one of the fastest products to reach 1M users (in 5 days) and, while there is no official breakdown of OpenAI’s revenue numbers, one estimate puts revenue from mobile apps at $3M per month. Doesn’t sound like a product that slowly grew through the early market, does it?

Despite its jargony name (GPT stands for Generative Pretrained Transformer), ChatGPT skipped right to the Early Majority, mostly due to its friendly and easy-to-use form factor. Anyone, from journalists to teachers or students, could access it for free and immediately experience the value. If OpenAI had just released a model that engineers could call through REST APIs, it wouldn’t have led to the massive amount of adoption by the mainstream.

Most executives would tell you that splitting your focus between two wildly different segments is generally a bad idea. However, I’d argue that the broad success of ChatGPT with consumers was instrumental in driving demand in the developer segment. It turns out that developers and enterprise buyers are also human. They read the news, follow trends, and try out consumer products. OpenAI, whether intentional or not, benefited from this in several ways.

Most obviously, awareness and brand recognition is critical to any business. Although OpenAI and LLMs were already well known within the AI crowd, it took ChatGPT to make it a brand name for the Early Majority in the broader developer segment.One of the reasons the “chasm” exists in the first place is that the Early Majority is usually risk averse and doesn’t trust signals from Innovators. One way to overcome this is by providing them an easy way to experience the product. ChatGPT provided the perfect “free trial” experience for non-technical decision makers in the Early Majority.The more “typical” way to achieve revenue growth like OpenAI is to hire an enterprise sales team. It turns out that the traditional Sales-Led-Growth (SLG) motion benefits significantly from tried-and-true PLG methods like seamless access to experience a product. Enterprise buyers increasingly expect to “see and experience the product’s value before committing to a large contract”.

Conclusion

I started this post as a natural sequel to my previous posts on developer tools for ML because I am seeing the MLOps story repeat itself with LLMOps. But, as I wrote about how LLMs skipped the chasm, I realized that the lessons may be applicable more broadly.

For LLM providers like OpenAI, Anthropic, et al.: I am not sure if these companies stumbled upon this strategy accidentally, but, if applied intentionally, there are definitely lessons on how to improve both product development and GTM. However, if you are in hypergrowth mode there is little time or need for optimization.

For anyone in the LLMOps ecosystem: I invite you to read my previous posts on ML infrastructure and you will see why I believe there won’t be much value extraction at this layer. Additionally, I believe there are very few cases where fine-tuning LLMs actually makes sense, but others have already written plenty about this.

For tech startups in general: I’ve seen massive funding rounds in open source based startups where the hypothesis was either “managed OSS + scale, stability, and security” or “open core and we’ll figure out monetization later”. I believe that the idea of “skipping the chasm” is valuable here and look forward to feedback from both founders and investors!

Opinions expressed in this post are my own and not the views of my employer.

Clemens is an entrepreneurial product leader who spent the last 8+ years bringing AI to developers and enterprises.

Crossing the AI Chasm: How OpenAI turned LLMs into a mainstream success was originally published in Towards Data Science on Medium, where people are continuing the conversation by highlighting and responding to this story.

Leave a Comment