Speech and Language Processing, Daniel Jurafsky and James H. Martin, 2025 - A comprehensive, continuously updated textbook providing foundational knowledge of natural language processing, including detailed explanations of language models.
Attention Is All You Need, Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin, 2017, Advances in Neural Information Processing Systems 30 (Curran Associates, Inc.) - Introduces the Transformer architecture, a core innovation enabling the scale and capabilities of modern Large Language Models through its attention mechanism.
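For readers who want a concrete sense of the attention mechanism this entry refers to, the following is a minimal NumPy sketch of single-head scaled dot-product attention, the core operation defined in the paper; the shapes and random example data are purely illustrative and are not taken from the paper itself.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Scaled dot-product attention (Vaswani et al., 2017):
    Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)            # similarity of each query to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                         # weighted sum of value vectors

# Toy example: 3 query positions attending over 4 key/value positions, dimension 8
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 8))
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))
print(scaled_dot_product_attention(Q, K, V).shape)  # (3, 8)
```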
Language Models are Few-Shot Learners, Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei, 2020, arXiv, DOI: 10.48550/arXiv.2005.14165 - Presents GPT-3, a landmark Large Language Model, demonstrating how extreme scale in parameters and training data leads to impressive few-shot learning and diverse language generation capabilities.
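To make "few-shot learning" concrete, the sketch below assembles an in-context prompt in the style the paper describes: a handful of worked examples followed by a new query, all supplied as plain text at inference time with no parameter updates. The translation task and the specific examples here are illustrative choices, not reproduced from the paper.

```python
# Hypothetical few-shot prompt: the model conditions on worked examples
# placed in its context window; nothing is fine-tuned.
examples = [
    ("cheese", "fromage"),
    ("house", "maison"),
    ("cat", "chat"),
]
query = "bread"

prompt = "Translate English to French.\n\n"
for en, fr in examples:
    prompt += f"English: {en}\nFrench: {fr}\n\n"
prompt += f"English: {query}\nFrench:"

print(prompt)  # this text would be passed verbatim to the language model
```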