The rise of Generative AI (Gen AI) has captured the attention of businesses worldwide, with organizations eager to build with it and gain a competitive advantage. Leveraging this transformative technology, however, requires embracing an entirely new ecosystem of technologies, tools, and frameworks. Understanding the intricacies involved can help companies prioritize the right use cases, choose the right technologies, and maximize the value of Gen AI.
How Generative AI Works
Generative AI refers to the various AI techniques, tools, and models designed to generate entirely new content based on an input. It can understand and create not just text but also images, video, audio, and other modalities.
While it’s a subset of Artificial Intelligence (AI), what sets Gen AI apart is the Foundation Model (FM). FMs are Machine Learning (ML) models pre-trained on vast amounts of data across modalities. Foundation Models are generalized: unlike earlier forms of AI, they are not trained for a specific task and can be adapted to solve a wide range of problems.
FMs are based on a Deep Learning neural network architecture called the Transformer. Transformer models can absorb large amounts of text and understand the relevance and context of every word in a sentence, paragraph, or article by modeling the relationships between words.
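The mechanism that lets a Transformer weigh how much each word matters to every other word is self-attention. The following is a minimal sketch with toy numbers, not a real model; the dimensions and random weights are purely illustrative:

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence of word vectors X."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = K.shape[-1]
    # Each row of `weights` says how strongly one word attends to every
    # other word in the sequence -- the "relationships between words".
    weights = softmax(Q @ K.T / np.sqrt(d_k))
    return weights @ V, weights

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))              # 4 "words", each an 8-dim embedding
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out, weights = self_attention(X, Wq, Wk, Wv)
```

Each row of `weights` sums to 1, so the output for each word is a context-aware blend of all the other words' representations.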
Foundation Models are built in two steps:
- Step 1: Pre-trained on vast amounts of raw, generalized, and unlabeled data across modalities such as text, images, audio, video, etc. This training is largely unsupervised. FMs have a large number of tunable parameters, which allows them to comprehend complex topics.
- Step 2: Fine-tuned to specific tasks like Question/Answer, Summarization, Sentiment Analysis, etc. FMs are generalized and can be adapted to solve a wide range of tasks.
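Step 1's self-supervised objective, predicting what comes next from raw unlabeled text, can be illustrated at toy scale with simple bigram counts (a real FM learns the same kind of objective with billions of Transformer parameters; the corpus here is invented):

```python
from collections import Counter, defaultdict

# Unlabeled "pre-training" corpus: a toy stand-in for web-scale text.
corpus = "the model reads the text and the model predicts the next word"

# Self-supervised objective: learn to predict each word from the one before it.
bigrams = defaultdict(Counter)
words = corpus.split()
for prev, nxt in zip(words, words[1:]):
    bigrams[prev][nxt] += 1

def predict_next(word):
    """Most likely next word according to the learned counts."""
    return bigrams[word].most_common(1)[0][0]

print(predict_next("the"))   # -> "model", the most frequent successor of "the"
```

No labels were needed: the training signal comes from the text itself, which is what makes pre-training on vast raw corpora feasible.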
The most mature FMs today are text-focused, mainly because of the mountain of textual content already available for training. This has accelerated the development of a specific type of FM for language tasks: Large Language Models (LLMs).
The Generative AI Technology Stack in the Enterprise Context
Generative AI is not just another application you build. It brings along an entirely new tech stack.
While Foundation Models are at the heart of the stack, there are several new layers that enterprises need to build, buy, or simply be aware of, depending on what they are trying to achieve. The stack comprises new tools, technologies, and techniques organized into distinct layers.
1. Compute Hardware

At the bottom of the stack are the compute hardware chips required for model training and inference. In recent years, raw computing power from Graphics Processing Units (GPUs) has increased, with processing efficiency doubling every 18 months.
Nvidia is the leader in GPUs, but AMD and Intel are also coming out with their own GPUs and associated developer tools. Google introduced Tensor Processing Unit (TPU) chips, specifically designed to handle large-scale ML deployments. Startup SambaNova has built proprietary hardware, including processor and memory chips, and software designed to run very large LLMs.
Companies that decide to build their own Foundation Models or fine-tune existing FMs for their domain or use case will need to work with AI hardware accelerator firms that combine hardware and software for a lower total cost of ownership. All other companies will leverage existing FMs hosted by the cloud providers and do not have to deal with this layer of the stack.
2. Cloud Platforms
Behind the scenes, infrastructure vendors play a pivotal role in Gen AI solutions. This includes cloud hyperscalers (Amazon AWS, Microsoft Azure, Google GCP) that not only provide the storage and computational resources required to analyze massive amounts of data, but also provide models, full-stack tools, and services for building Gen AI applications. Currently, the offerings from the cloud providers are very similar, but differentiation will come over time.
3. Foundation Models
Rather than building a Foundation Model from scratch, most enterprises will choose an existing FM that suits their needs. There are open-source options such as Falcon and Llama 2, as well as commercial, closed-source models available through APIs from providers such as OpenAI, AI21 Labs, and Anthropic.
There are several factors involved in selecting an FM, such as the number of tunable parameters, context window size, output quality, inference speed, cost, fine-tunability, security and privacy needs, and license permissions. FMs can be accessed through APIs offered by the model providers, or you can download a model and self-host it on your own infrastructure (in the cloud or on-premises).
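Hosted FMs are typically invoked over a simple HTTP API. Below is a sketch using only the Python standard library; the endpoint URL, model name, and parameter names are hypothetical placeholders, not any real provider's API:

```python
import json
import urllib.request

# Hypothetical endpoint and model name -- substitute your provider's values.
API_URL = "https://api.example-fm-provider.com/v1/generate"
API_KEY = "YOUR_API_KEY"

def build_request(prompt, model="example-llm-40b", max_tokens=256, temperature=0.2):
    """Assemble an HTTP request for a hosted foundation model."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,     # caps output length (and therefore cost)
        "temperature": temperature,   # lower values give more deterministic output
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Bearer {API_KEY}",
                 "Content-Type": "application/json"},
        method="POST",
    )

req = build_request("Summarize our Q3 support tickets in three bullet points.")
# urllib.request.urlopen(req) would send it; omitted to keep the sketch offline.
```

Parameters like `max_tokens` and `temperature` map directly to the selection criteria above: they affect inference cost and output quality per call.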
4. Fine-Tuned Models
If the accuracy of an existing FM is not good enough, consider fine-tuning or customizing it. Fine-tuning is the process of adjusting the parameters of an existing model by training it on your enterprise dataset to build “expertise” for your specific use case or domain.
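Conceptually, fine-tuning continues gradient-descent training of an already-trained model on your own data. A toy sketch with a small linear model standing in for an FM, and synthetic data standing in for the enterprise dataset (real fine-tuning applies the same idea at vastly larger scale):

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-in for a pre-trained model: weights learned on generic data.
w_pretrained = rng.normal(size=3)

# Small synthetic "enterprise dataset" with its own input/output relationship.
X = rng.normal(size=(64, 3))
w_true = np.array([2.0, -1.0, 0.5])
y = X @ w_true + 0.05 * rng.normal(size=64)

def mse(w):
    """Mean squared error of the model on the domain data."""
    return np.mean((X @ w - y) ** 2)

# Fine-tuning loop: gradient-descent steps starting from the pre-trained
# weights, nudging them toward the domain-specific data.
w = w_pretrained.copy()
lr = 0.1
for _ in range(100):
    grad = 2 * X.T @ (X @ w - y) / len(y)
    w -= lr * grad

loss_before, loss_after = mse(w_pretrained), mse(w)
```

The key point is the starting position: instead of random initialization, training begins from weights that already encode general knowledge, so far less domain data is needed.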
5. MLOps (or LLMOps)
Machine Learning Operations (MLOps) has always existed in the traditional Machine Learning world. MLOps with Generative AI, also referred to as LLMOps, is more complex mainly due to the large scale and size of the models involved. LLMOps involves activities such as selecting a foundation model, adapting this FM for your use case, model evaluation, deployment, and monitoring.
Adapting a Foundation Model is done mainly through prompt engineering or fine-tuning. Fine-tuning brings in the additional complexity of data labeling, model training, and model deployment to production.
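Prompt engineering, the lighter-weight of the two adaptation paths, shapes the model's behavior purely through the input text. A minimal few-shot prompt template; the sentiment task and the example reviews are illustrative:

```python
def build_prompt(task, examples, query):
    """Assemble a few-shot prompt: instructions, worked examples, then the query."""
    parts = [task, ""]
    for text, label in examples:
        parts += [f"Review: {text}", f"Sentiment: {label}", ""]
    # End with the unanswered query so the model completes the pattern.
    parts += [f"Review: {query}", "Sentiment:"]
    return "\n".join(parts)

prompt = build_prompt(
    task="Classify the sentiment of each customer review as Positive or Negative.",
    examples=[
        ("The onboarding was smooth and support replied within minutes.", "Positive"),
        ("The app crashes every time I upload a file.", "Negative"),
    ],
    query="Setup took five minutes and everything just worked.",
)
```

Because no model weights change, prompt engineering avoids the labeling, training, and deployment complexity that fine-tuning brings.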
Several tools have emerged in the LLMOps space. There are point solutions for experimentation, deployment, monitoring, observability, prompt engineering, governance, etc., as well as tools that offer end-to-end LLMOps.
6. Data Platforms and Management
Data is the lifeblood of Gen AI. The better the data used to provide context or to train and fine-tune Foundation Models, the better the outcomes. Roughly 80% of the time in Gen AI development is spent getting data into the right state: data ingestion, automating data pipelines, cleaning, data quality, vectorization, and storage.
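The "vectorization" step turns unstructured text into vectors that can be stored and searched by similarity. A toy sketch using word-count vectors in place of a real embedding model, with a plain list standing in for a vector database; the documents are invented:

```python
import numpy as np

DOCS = [
    "invoice payment terms and billing schedule",
    "vpn setup guide for remote employees",
    "quarterly revenue report and billing summary",
]

# Toy "embedding": bag-of-words counts over a shared vocabulary.
vocab = sorted({w for d in DOCS for w in d.split()})

def embed(text):
    words = text.split()
    return np.array([words.count(w) for w in vocab], dtype=float)

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)

# "Vector store": embed every document once, then retrieve by similarity.
index = [embed(d) for d in DOCS]

def retrieve(query):
    q = embed(query)
    return max(range(len(DOCS)), key=lambda i: cosine(q, index[i]))

best = retrieve("how do I set up the vpn")   # matches the VPN guide
```

In production, a learned embedding model and a vector database replace the count vectors and the list, but the pipeline shape (ingest, clean, vectorize, store, search) is the same.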
Many organizations already have a data strategy for structured data, but Generative AI can take that a step further and unlock value from unstructured data. You need an unstructured data strategy that aligns with your Gen AI strategy.
7. Application Experience
At the top of the stack are applications that integrate Gen AI models into a great user experience. These applications can use one or more LLMs or a combination of models working together to solve different problems and deliver a holistic experience.
Some notable examples include Midjourney, an AI image generator, and GitHub Copilot, an AI pair programmer. GitHub Copilot is a cloud-hosted application based on a modified version of the GPT-3 FM, fine-tuned on billions of lines of code from open-source repositories such as those hosted on GitHub.
What This Means for Companies
Early adopters of Generative AI stand to gain a significant competitive advantage. We recommend taking a business use case-driven approach and prioritizing high-value, strategic use cases. You can identify use cases that improve internal departmental productivity, or gain a competitive edge by infusing Gen AI features into existing products.
Many of our customers are starting with non-core use cases (e.g. IT Support chatbot) as an experiment to boost confidence. Orion can help you identify and prioritize use cases and rapidly build a proof of concept to demonstrate Gen AI value in your organization.
Orion has been co-innovating with businesses to inspire and accelerate digital innovation for 30 years. Learn more about our Generative AI expertise here.