Shared AI Recipes as Public Innovation Infrastructure

Audience:

The full version of this paper is here and frankly it will be easier to read: https://docs.google.com/document/d/12z4hDY_WnobCJL-cjy0rRyjc9qTee7VPydUK...

Abstract

How does every org become an AI organization (so they don’t get displaced by another org that does)? How can we leverage the constant advances in both open and private AI models, making them cheaply available and ready for impact evaluation in an organization’s specific use case? How do organizations discover and apply the hard-won AI lessons of their field’s peers to their own problems?

We propose that an open source AI orchestration and hosting service would greatly accelerate the deployment, iteration and testing of AI-based solutions for enterprises and development organizations. The system should expose most of the world’s prominent public and paid AI services; provide a hosting infrastructure to run open source AI models; foster an ever-growing collection of simple, reusable AI workflows; provide meta-tools such as feedback, LLM analysis and billing systems; and be available both as a reliable hosted cloud service and as software that runs on private, enterprise or government compute systems. With such an infrastructure, one-off AI investments become new assets for the public to reuse, thereby enhancing the speed of innovation.

Why now?
“It’s not that AI will replace lawyers; it’s that the lawyers who use AI will replace those that don’t.” - Superlegal

“Every business will become an AI business.” - Satya Nadella

It is clear to us and many others that the productivity enhancements made possible first by software - whose defining feature is that humans can reuse and modify prior investments at near-zero marginal cost - and now by modern AI tools such as OpenAI’s ChatGPT will likely transform how most processes in organizations function.

But how can we help people and organizations make this transition to a hyper-competitive and productive world? What tools do we need when “thinking” jobs belong to AI prompt writers, who wrangle the AI to our desires, and/or API stitchers, who connect non-obvious or custom sets of data and functionality to build novel and useful things? How do we specifically help organizations - such as development organizations - learn from each other’s investments?

Stories of AI
To understand why an infrastructure like Gooey is needed, we first need to understand the needs of the organizations that could benefit from it. Here we present two organizations, both attempting to build AI chatbots.

Digital Green
Digital Green is an NGO that helps over 4 million smallholder farmers increase their productivity. For the last 20 years, they have sought out and filmed best practices from these farmers and then distributed that expertise through village-level screenings and, more recently, 80M+ YouTube views. They have recorded over 10,000 videos and operate in 10 states in India, Ethiopia and Kenya. Like many established organizations, they’ve created an incredible repository of wisdom. Ideally, this wisdom would be available instantly to every farmer and to the extension agents who mentor farmers and convince them to take a risk on a new best practice and/or grow a more climate-change-resistant crop. Such a solution would cost almost nothing per user; be available whenever the agent or farmer needed help; incorporate the latest agricultural research, science and real-time weather and soil sensor data; give fluent advice in any language or dialect and at any literacy level; and have a robust feedback system so we could measure the quality of its advice, whether it was being followed and how much it impacted farmers.

Noora Health
Noora Health works in 400 hospitals across India, Indonesia and Bangladesh and provides training for families who are just leaving the hospital after a birth or a surgery. They built a successful BPO of nurses and doctors who provide WhatsApp-based advice to family members and patients on a wide variety of topics. They’ve answered over 30,000 questions (about 200 per day) and now want to scale their services at near-zero marginal cost to 70M people.

Their Shared AI Problems
Each of these organizations is investing in LLMs and AI chatbots to provide better and cheaper services for their users. Today this process is difficult and expensive on several fronts.

Technology understanding: There is an ever-increasing set of AI technologies that organizations could leverage to solve their problems. Digital Green and Noora have both tasked their senior engineers with exploring the potential solution space, including LLMs like OpenAI’s GPT, vector DBs for storing larger knowledge bases, and translation, speech recognition and text-to-speech AI models to support low-resource language users. But the AI space is vast, constantly changing and growing in capability every day as both the largest tech companies and the best-funded startups release new tools. It’s incredibly difficult to keep up with these innovations, let alone understand their relative trade-offs, without actually building a prototype. Hence, Digital Green and Noora both funded internal prototyping projects, often stretching into months. These prototypes often require technical AI knowledge, and developers with that knowledge are among the most expensive to hire right now.

Time to test: Given the complexity and newness of these systems, prototypes that are testable with real users often take months to build.

Cost of deployment: AI chat prototypes that use just an LLM such as GPT-3.5 are fairly cheap to deploy today but also have limited functionality. Going beyond that is costly: for example, fine-tuned, open source AI models that promise better speech recognition in a low-resource language often require the most expensive computers available today.

Cost to prove user value: Simply building an AI demo or prototype doesn’t prove that it helps users achieve their goals. Hence, feedback, cohort and usage analysis systems must also be created, and the prototype must then be deployed at reasonable scale inside apps or via integrations such as WhatsApp or Slack.

As the chat use cases alone show, building viable systems with proven user value is hard and expensive, and we believe a platform like Gooey can help.

Platform Principles
Together with our influences (see Appendix), we hold these principles in mind as we design the platform.

Learn from others
Most organizations don’t have or can’t afford AI researchers on their staff, but they would certainly benefit from knowing which of their peers’ AI initiatives are working best and, ideally, from being able to apply those initiatives quickly and cheaply to their own domains.

Keep Abstracting
We can offer the greatest leverage to our customers by building on top of the constantly expanding foundational AI ecosystem. All of the biggest tech players - MSFT/OpenAI, Google, AWS - are competing for developers to integrate with their respective technologies, while the open source community is also constantly releasing new innovations. With Gooey, it should be easy for organizations to build on the best of private + open source models. Furthermore, as new innovations become available in the market, organizations should be able to “swap” in the latest technology components and compare their relative price vs performance, without large up-front investments to experiment with or deploy a potentially game-changing model or API.
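As a rough illustration of the “swap and compare” idea, the Python sketch below puts two interchangeable components behind one interface and reports accuracy and cost for each on the same labelled examples. Everything in it is an assumption made for illustration: the `Component` class, the `compare` function, the stand-in models and the prices are hypothetical and do not describe Gooey’s actual API or any real provider’s pricing.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Component:
    """A swappable AI component (hypothetical; not Gooey's actual API)."""
    name: str
    run: Callable[[str], str]   # maps an input to an output
    cost_per_call_usd: float    # illustrative price, not a real quote


def compare(
    components: List[Component],
    examples: List[Tuple[str, str]],
) -> Dict[str, Dict[str, float]]:
    """Run every component on the same (input, expected) pairs.

    Returns exact-match accuracy and total cost per component.
    """
    report: Dict[str, Dict[str, float]] = {}
    for component in components:
        correct = sum(
            1 for prompt, expected in examples
            if component.run(prompt) == expected
        )
        report[component.name] = {
            "accuracy": correct / len(examples),
            "total_cost_usd": component.cost_per_call_usd * len(examples),
        }
    return report


# Stand-in "models": plain functions, so the sketch runs offline.
def model_a(prompt: str) -> str:
    return prompt.upper()


def model_b(prompt: str) -> str:
    # Cheaper stand-in that only handles short inputs correctly.
    return prompt.upper() if len(prompt) < 5 else prompt


# Made-up task: the "right answer" is the upper-cased input.
examples = [("rice", "RICE"), ("maize", "MAIZE"),
            ("tea", "TEA"), ("sorghum", "SORGHUM")]

report = compare(
    [Component("model_a", model_a, 0.002),
     Component("model_b", model_b, 0.0005)],
    examples,
)

for name, stats in report.items():
    print(name, stats)
```

Because both components share one call signature, adding a newly released model to the comparison means adding one more `Component` entry; the evaluation code and the examples stay unchanged.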

Encourage Sharing + Reuse
Libraries, the scientific peer review system, GitHub, open source and the Mosaic browser’s “View Page Source” all enhanced learning and innovation ecosystems by encouraging innovators to share their work and by helping viewers understand it deeply and quickly. Hence, like GitHub, Gooey.AI workflows are public by default so that others can discover and reuse them as we grow the ecosystem of creators building and sharing AI workflows. Being the website where great AI workflows are discovered increases our network value and hence encourages more creators to join and strengthen the ecosystem.

Include Everyone - especially non-coders + non-English speakers
Organizations that work with marginalized or non-English-speaking populations must be able to quickly run and assess the effectiveness of tools in low-resource languages, often spoken by millions (not billions) of people whose documents do not dominate the content of the Internet. We will facilitate this by making the private and public models & APIs for low-resource languages available, along with numerous examples of how others use them and evaluation frameworks that let organizations easily benchmark which models perform best on their particular users’ data sets.
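To make the benchmarking idea concrete, here is a minimal Python sketch that ranks speech recognition models by mean word error rate (WER) on an organization’s own transcribed clips. The choice of WER as the metric is our assumption, as are the function names, the stand-in models and the sample sentences; none of this is taken from Gooey’s evaluation framework.

```python
from typing import Callable, Dict, List, Tuple


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,                           # deletion
                curr[j - 1] + 1,                       # insertion
                prev[j - 1] + (ref_word != hyp_word),  # substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def benchmark(
    models: Dict[str, Callable[[str], str]],
    dataset: List[Tuple[str, str]],
) -> Dict[str, float]:
    """Mean WER per model over (clip_id, reference transcript) pairs."""
    return {
        name: sum(word_error_rate(ref, transcribe(clip_id))
                  for clip_id, ref in dataset) / len(dataset)
        for name, transcribe in models.items()
    }


# Made-up reference transcripts standing in for an organization's data.
dataset = [
    ("clip1", "plant maize after the first rain"),
    ("clip2", "water the seedlings every morning"),
]

# Stand-in "models": canned outputs, so the sketch runs offline.
outputs_x = {"clip1": "plant maize after the first rain",
             "clip2": "water the seedling every morning"}
outputs_y = {"clip1": "plant maze after first rain",
             "clip2": "water the seedlings every morning"}

scores = benchmark(
    {"model_x": outputs_x.__getitem__, "model_y": outputs_y.__getitem__},
    dataset,
)
best_model = min(scores, key=scores.get)  # lower WER is better
print(scores, best_model)
```

In practice each stand-in would be replaced by a call to a real speech recognition model or API, and the dataset by recordings and transcripts from the organization’s own users.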

Diagram here: https://lh7-us.googleusercontent.com/xdJpyQsY3LVHGENDCWG168zV_s1kugtNTid...

Video here: https://www.loom.com/share/dbf28cd1616c411a9d6631be5eb5fcc1?sid=51ef3e21...

Room:
Room 107
Time:
Friday, March 15, 2024 - 10:00 to 11:00