What do Alpacas have to do with AI's future?
Could AI models become commodities? Microsoft does browsers with Web3 Wallets? Huh! Questions from the road ahead...
Hello! Welcome back to Cloud Vertigo. Our lifelong learning journey goes on with another AI chapter. New here? Great to have you on board! ⛵
Advancing is always a path. Sometimes we go forward, sometimes we drift backwards, sometimes we sit and stop.
If human knowledge seems to take the long and scenic routes, Artificial Intelligence advancements feel geodesic. They leave little to chance. A geodesic curve is the shortest path between two points on some surface.
In Euclidean geometry, the shortest path between two points is a straight line. On a curved surface, however, the shortest path need not be straight. Most strikingly, in some geometries the geodesic path is not even unique.
Today’s job is laying out some groundwork in the quest to figure out the AI supply chains of tomorrow. I feel there are multiple geodesic paths available. Let’s get started.
Competition around foundation models
The simplest way to start is always cost. We could assess the advancement of AI by its training cost. Prices always tell meaningful stories. In this case, it’s about the power struggle between the interface and the infrastructure players (cloud computing). Remember last week’s Not Boring sketch?
Until very recently, AI models used to be prohibitively expensive. Yet every day, as new research, data, and computing power become accessible, training costs decline. ARK researchers estimate a cost-reduction trajectory of around 70% yearly.
AI training cost declines continued at an annual rate of 70%, with the cost to train a large language model to GPT-3 level performance collapsing from $4.6 million in 2020 to $450,000 in 2022. We expect cost declines to continue at a 70% rate through 2030 (Summerling and Downing, ARK Research 2023)
We may be going at a much faster rate than that.
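A 70% yearly decline compounds quickly. A minimal back-of-the-envelope check (the 2020 figure is ARK's; the 2030 number is a straight extrapolation of their trend, nothing more):

```python
# A 70% yearly decline means each year costs 30% of the previous one.
def projected_cost(start_cost: float, annual_decline: float, years: int) -> float:
    """Cost after `years` of compounding decline."""
    return start_cost * (1 - annual_decline) ** years

cost_2020 = 4_600_000  # GPT-3-level training cost in 2020, per ARK
cost_2022 = projected_cost(cost_2020, 0.70, 2)
print(f"2022: ${cost_2022:,.0f}")  # ~$414,000, close to ARK's observed $450,000
cost_2030 = projected_cost(cost_2020, 0.70, 10)
print(f"2030: ${cost_2030:,.0f}")  # ~$27, if the trend really held that long
```

The fact that the observed 2022 figure ($450,000) sits slightly above the pure 70% extrapolation is consistent with the decline running roughly at, or faster than, ARK's estimated rate.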
Last month, Meta open-sourced its own large language model, LLaMA. It is trained on over a trillion tokens and performs comparably to OpenAI’s models. This is a significant move, and it is no coincidence that Meta is the only Big Tech player without a cloud computing business. The devil is in the details: access to the LLaMA foundation model is permissioned and carries a strict non-commercial clause.
The key message though is clear. Foundation models can become commodities.
Stanford researchers showed that you can take an open-source model such as LLaMA 7B, Meta’s smallest model (estimated training cost: around $85,000), and fine-tune it with the help of GPT-3.5 for roughly $600. They called it Alpaca. The research demonstrates that you can ask an AI to fine-tune your AI model. This makes the LLMs we have attained surprisingly replicable.
Although OpenAI’s Terms of Service forbid using its models to build a competing service, this still led some analysts to wonder whether Microsoft has significantly overpaid for OpenAI ($10bn at a $29bn valuation).
What does it mean? It’s the first time it seems plausible that an AI can learn from another AI while retaining comparable results (of course, unsupervised learning models are nothing new and have been studied extensively, but this is something slightly different). It speaks to how easy it is to transfer knowledge between models. The resulting copycat model is necessarily sub-par at first, but with subsequent independent reinforcement learning cycles it could outperform the original, especially on domain-specific tasks.
This could pose a challenge to ARK researchers’ thesis that AI will create unprecedented demand for training data and that “high-quality domain-specific AI training data could result in winner-takes-most outcomes across vertical applications”. It’s not the data that’s valuable, it’s the use!
This will be a crucial question for the defensibility of Microsoft’s enterprise AI play.
The enterprise question will come down to the customers’ make-or-buy choice. Why would customers buy Microsoft’s OpenAI LLM capabilities off the shelf? The alternative is to start from a free Meta model, purchase the first fine-tuning batches, and then improve it rent-free through subsequent internal model interactions. This way customers would retain the intellectual property over the subsequently generated training epochs and develop their own engine. However, they would miss out on the general improvements that come from broader exposure to the public.
It’s not a new tradeoff: privacy vs efficiency. What is new is the scale of the implications.
Last week highlights
Midjourney v. 5 has reached incredible photorealistic capabilities. This project is getting so many things right. Just check it out!
GPT-4 is out and supports multi-modal input (you can feed it your drunk napkin drawings and ask for a fully functioning website back). I have this great idea, please execute. Didn’t we all dream of this?
Microsoft Edge is working on an integrated Web3 wallet. Designs have leaked. With new browser wars on the horizon, this is a strong signal of concrete steps by the incumbents towards web3.
Newest in the gauntlet of questionable upcoming Microsoft Edge features, a crypto wallet 💸 Not really sure how to feel about this kind of thing being baked into the default browser, what are your thoughts?

Disagree? Have a further topic that keeps you awake at night? Come say hello to david@cloudvertigo.xyz.