LLM-based System Architecture

From GM-RKB

An LLM-based System Architecture is an AI system architecture for an LLM-based system.



References

2023

  • (Bornstein & Radovanovic, 2023) => Matt Bornstein and Rajko Radovanovic. (2023). “Emerging Architectures for LLM Applications.”
    • QUOTE: ... a reference architecture for the emerging LLM app stack. It shows the most common systems, tools, and design patterns we’ve seen used by AI startups and sophisticated tech companies. ...
    • Here’s our current view of the LLM app stack:

    • SUMMARY:
      • Data Pipeline Platform: Platforms that manage the movement of data between stages of the data processing cycle, handling data integration, transformation, and storage. Examples include Databricks and Airflow.
      • LLM Embedding Model Platform: Platforms that generate embeddings (numerical vectors) from raw data such as text, images, or other unstructured data. These embeddings support machine learning tasks and similarity search. Examples include OpenAI and Hugging Face.
      • Vector Database Platform: Databases designed to store and index the high-dimensional vectors produced by embedding models. They provide specialized algorithms for retrieving vectors by similarity metric. Examples include Pinecone and Weaviate.
      • LLM-focused Playground Platform: Platforms that provide an interactive interface for experimenting with prompts, models, and data, which is useful for prototyping and testing. Examples include OpenAI Playground and nat.dev.
      • LLM Orchestration Platform: Frameworks that coordinate the steps of an LLM application, such as prompt chaining, calls to external APIs, retrieval of contextual data, and memory across multiple LLM calls. Examples include LangChain and LlamaIndex.
      • APIs/Plugins Platform: Platforms that offer APIs and plugins to extend functionality, integrate services, or improve interoperability between technologies. Examples include Serp and Wolfram.
      • LLM Cache Platform: Platforms that cache data and responses for Large Language Models (LLMs) to reduce latency and improve performance during inference. Examples include Redis and SQLite.
      • Logging / LLM Operations Platform: Platforms for logging and monitoring the activity of Large Language Models (LLMs). They provide insight into performance, usage, and other operational metrics. Examples include Weights & Biases and MLflow.
      • Validation Platform: Platforms that validate the inputs and outputs of LLMs, for example by checking that model output meets expected structure and quality constraints or by detecting prompt injection. Examples include Guardrails and Rebuff.
      • App Hosting Platform: Platforms that offer cloud-based or on-premise hosting for applications, ranging from web apps to machine learning models. Examples include Vercel and Steamship.
      • LLM APIs (proprietary) Platform: Platforms offering proprietary APIs for Large Language Models (LLMs). They may offer added functionality and are usually tailored to the vendor’s own LLMs. Examples include OpenAI and Anthropic.
      • LLM APIs (open) Platform: Platforms that offer API access to open-source Large Language Models (LLMs), typically community-supported models served through a common interface. Examples include Hugging Face and Replicate.
      • Cloud Providers Platform: Platforms that provide cloud computing resources, including data storage, compute power, and networking. Examples include AWS and GCP.
      • Opinionated Cloud Platform: Platforms that offer cloud services with built-in best practices and predefined configurations, reducing the amount of customization needed by the end user. Examples include Databricks and Anyscale.
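
    • EXAMPLE: The components summarized above can be illustrated with a minimal sketch of how an embedding model, a vector database, an orchestrator, a cache, and a validation step might fit together. This is not taken from the cited article. Every name here (embed, VectorStore, fake_llm, validate, Orchestrator) is hypothetical, the embedding is a toy bag-of-words count, and the LLM call is a local stand-in rather than any vendor's API.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding (assumption): word counts stand in for an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VectorStore:
    """In-memory stand-in for a vector database."""

    def __init__(self) -> None:
        self._items: list[tuple[Counter, str]] = []

    def add(self, text: str) -> None:
        self._items.append((embed(text), text))

    def search(self, query: str, k: int = 1) -> list[str]:
        q = embed(query)
        ranked = sorted(self._items, key=lambda it: cosine(q, it[0]), reverse=True)
        return [text for _, text in ranked[:k]]


def fake_llm(prompt: str) -> str:
    """Stand-in for a hosted LLM API: echoes the context line of the prompt."""
    context = prompt.split("Context: ", 1)[1].split("\n", 1)[0]
    return f"Based on the context: {context}"


def validate(output: str) -> str:
    """Minimal validation step: reject empty model output."""
    if not output.strip():
        raise ValueError("empty model output")
    return output


class Orchestrator:
    """Chains retrieval, prompt construction, the LLM call, validation, and caching."""

    def __init__(self, store: VectorStore, llm) -> None:
        self.store = store
        self.llm = llm
        self.cache: dict[str, str] = {}
        self.llm_calls = 0

    def answer(self, question: str) -> str:
        if question in self.cache:  # cache hit: skip the model call
            return self.cache[question]
        context = " ".join(self.store.search(question, k=1))
        prompt = f"Context: {context}\nQuestion: {question}"
        self.llm_calls += 1
        result = validate(self.llm(prompt))
        self.cache[question] = result
        return result


store = VectorStore()
store.add("Pinecone and Weaviate are vector databases.")
store.add("Vercel and Steamship host applications.")
app = Orchestrator(store, fake_llm)
first = app.answer("Which tools are vector databases?")
second = app.answer("Which tools are vector databases?")  # served from the cache
```

    • In a real deployment each stand-in would presumably be replaced by one of the platform categories listed above (an embedding API, a vector database, a hosted LLM API, a dedicated cache, and a validation library). The control flow in Orchestrator.answer is only one plausible arrangement.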

2023