A large language model (LLM) is a type of machine learning model capable of language generation and other natural language processing tasks. https://en.wikipedia.org/wiki/Large_language_model
Why run LLMs on-premise?
→ Need to tune the latency to make the model faster
⚙ Customization & fine-tuning
→ No lock-in to a particular model
🔒 Security compliance & data residency / privacy
Semantic Kernel is an SDK that integrates Large Language Models (LLMs) from providers like OpenAI, Azure OpenAI, and Hugging Face with conventional programming languages like C#, Python, and Java. https://github.com/microsoft/semantic-kernel
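The core idea, a kernel that registers named functions (plugins) and dispatches calls to them, can be sketched in plain Python. This is a toy stand-in to show the orchestration pattern, not the actual semantic_kernel API:

```python
# Toy illustration of the kernel/plugin pattern Semantic Kernel is built
# around; NOT the real semantic_kernel API, just plain Python.

class Kernel:
    def __init__(self):
        self._functions = {}  # name -> callable (prompt or native function)

    def register(self, name, func):
        self._functions[name] = func

    def invoke(self, name, **kwargs):
        # Dispatch by name, the way the SDK routes requests through the kernel.
        return self._functions[name](**kwargs)

kernel = Kernel()
# A "native function" plugin; in the real SDK this could also wrap an LLM prompt.
kernel.register("summarize", lambda text: text.split(".")[0] + ".")
print(kernel.invoke("summarize", text="LLMs are large. They cost money."))
# prints "LLMs are large."
```

In the real SDK the registered functions are prompt templates or native code, and the kernel wires them to a configured LLM backend (OpenAI, Azure OpenAI, Hugging Face).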
vLLM: easy, fast, and cheap LLM serving for everyone. vLLM is fast with:
✅ State-of-the-art serving throughput
✅ Efficient management of attention key and value memory with PagedAttention
✅ Continuous batching of incoming requests
✅ Fast model execution with CUDA/HIP graphs
✅ Quantization: GPTQ, AWQ, SqueezeLLM, FP8 KV cache
✅ Optimized CUDA kernels
https://github.com/vllm-project/vllm
[Figure: throughput comparison across serving engines — higher is better]
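PagedAttention stores each sequence's KV cache in fixed-size blocks drawn from a shared pool and mapped through a per-sequence block table, much like virtual-memory paging; this is what lets continuous batching pack many requests into GPU memory without per-request over-allocation. A minimal sketch of that bookkeeping (illustrative only, not vLLM's internals; names are invented):

```python
# Sketch of PagedAttention-style KV-cache bookkeeping: fixed-size blocks
# from a shared pool, mapped per sequence via a block table.
# Illustrative only; not vLLM's actual implementation.

BLOCK_SIZE = 16  # tokens stored per KV-cache block

class BlockAllocator:
    def __init__(self, num_blocks):
        self.free = list(range(num_blocks))  # shared pool of physical blocks
        self.tables = {}                     # seq_id -> list of block ids
        self.lengths = {}                    # seq_id -> tokens stored so far

    def append_token(self, seq_id):
        table = self.tables.setdefault(seq_id, [])
        n = self.lengths.get(seq_id, 0)
        if n % BLOCK_SIZE == 0:              # last block full (or no block yet)
            table.append(self.free.pop())    # allocate exactly one more block
        self.lengths[seq_id] = n + 1

    def release(self, seq_id):
        # A finished request returns all its blocks to the pool, so other
        # requests in the continuous batch can reuse the memory immediately.
        self.free.extend(self.tables.pop(seq_id, []))
        self.lengths.pop(seq_id, None)

alloc = BlockAllocator(num_blocks=8)
for _ in range(20):                          # generate 20 tokens for one request
    alloc.append_token("req-1")
print(len(alloc.tables["req-1"]))            # prints 2 (20 tokens fit in 2 blocks)
```

The point of the design: memory waste is bounded to less than one block per sequence, instead of reserving space for the maximum possible output length up front.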