Local LLMs and ellmer Extensions



Grayson White

Math 241
Week 11 | Spring 2026

Announcements

  • Fill out this form to help me create the Project 2 groups.
    • Groups and project instructions will be assigned on Monday.

Week 11 Goals

Monday Lecture:

  • Practice with Local LLMs
  • Extensions of ellmer

Wednesday Lecture:

  • Intro to writing R packages

Local LLMs

Local LLMs

  • Last time, we discussed the ethics of LLMs and showed some examples of using LLMs for data science.

  • We considered local vs cloud-based implementations of LLMs

  • Local LLMs have a lot of advantages:
    • More secure (your data stays on your own computer)
    • Smaller and less power intensive (not sending compute requests to data centers)
    • No rate limits
  • …But they also have their disadvantages:
    • Can take longer to run depending on your hardware
    • Not as “good” as larger models
  • Today, we’ll explore the data science use-cases of local LLMs.
    • and use them to power ellmer and ellmer extensions!

Ollama

  • Ollama is a popular tool for downloading and running open-weights models on your own computer

  • You can download these models, and then connect them to ellmer with chat_ollama()

  • You can also access cloud-based models through Ollama, but these get rate-limited.

  • Now, let’s talk about how to install Ollama and local models with it.

System check

  • To know what sorts of models your computer can handle, you’ll need to see how much RAM it has.
  • Check RAM:
# install.packages("memuse")
memuse::Sys.meminfo()
Totalram:  24.000 GiB 
Freeram:    9.688 GiB 
  • 24 GB RAM –> (approximately) up to a 24-billion-parameter model
    • “24 billion parameters” is often abbreviated as 24B.
    • It's not always best to run the biggest model your computer can handle: it will slow down other processes (and often be slow itself!)
  • As long as you have 8 GB+ you will be okay for today! 16 GB+ is ideal.
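
The rule of thumb above (roughly 1 billion parameters per GB of RAM) can be sketched as a small helper. The function name `max_model_size_b()` and its `headroom` argument are hypothetical, added only for illustration; treat the result as a rough guide, not a guarantee that a model will run well.

```r
# Rough upper bound on model size (in billions of parameters) from total RAM.
# Assumes the rule of thumb above: about 1 GB of RAM per 1B parameters.
# `headroom` (hypothetical) is the fraction of RAM kept free for other processes.
max_model_size_b <- function(ram_gb, headroom = 0) {
  stopifnot(is.numeric(ram_gb), ram_gb > 0, headroom >= 0, headroom < 1)
  ram_gb * (1 - headroom)
}

max_model_size_b(24)                  # using all RAM: 24
max_model_size_b(24, headroom = 0.5)  # leaving half the RAM free: 12
```

Leaving some headroom reflects the point above: the biggest model that fits is often not the most pleasant one to run.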

Install Ollama

  • Ollama is the service that allows us to easily download models
    • Installing Ollama is analogous to installing git,
    • The Ollama website is analogous to GitHub.com, and
    • The models are hosted just like repos are hosted on GitHub.com.
  • To install Ollama, paste this in your terminal:
curl -fsSL https://ollama.com/install.sh | sh


After installation, if you run

which ollama

in your terminal, you should see something like:

/usr/local/bin/ollama
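
The same check can be run from the R console. This is a minimal sketch using base R's `Sys.which()` and `system2()`; it assumes `ollama` is on the `PATH` that R sees, which can differ from your terminal's.

```r
# Look for the ollama executable on the PATH visible to R.
# Sys.which() returns "" when the program is not found.
ollama_path <- unname(Sys.which("ollama"))

if (nzchar(ollama_path)) {
  message("Found ollama at: ", ollama_path)
  # Print the installed version (same as running `ollama --version` in a terminal).
  system2(ollama_path, "--version")
} else {
  message("ollama not found; re-run the installer or restart your R session.")
}
```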

Install our first local model

Let’s navigate to ollama.com and install a model.


Install models by running the following in your terminal:

ollama pull {model_name}

e.g.,

ollama pull gemma4:e4b

Connect local models with ellmer

library(ellmer)
chat <- chat_ollama(model = "gemma4:e4b")
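
A slightly more defensive version of the connection step, as a sketch: it wraps `chat_ollama()` in `tryCatch()` so that a missing package or an unreachable Ollama server produces a message instead of stopping your script. The system prompt and the question are placeholders chosen for illustration.

```r
# Try to connect to the local model. If ellmer is not installed or the
# Ollama server cannot be reached, report the problem and return NULL.
chat <- tryCatch(
  ellmer::chat_ollama(
    model = "gemma4:e4b",
    system_prompt = "You are a concise assistant for an R course."
  ),
  error = function(e) {
    message("Could not connect to Ollama: ", conditionMessage(e))
    NULL
  }
)

# Only send a message if the connection succeeded.
if (!is.null(chat)) {
  chat$chat("In one sentence, what is a tibble?")
}
```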

Connect local models with ellmer

Your turn:

Complete the task from last time: parse names and ages (but now with a local LLM of your choice)

type_person <- type_object(
  name = type_string(),
  age = type_number()
)

prompts <- list(
  "I go by Alex. 42 years on this planet and counting.",
  "Pleased to meet you! I'm Jamal, age 27.",
  "They call me Li Wei. Nineteen years young.",
  "Fatima here. Just celebrated my 35th birthday last week.",
  "The name's Robert - 51 years old and proud of it.",
  "Kwame here - just hit the big 5-0 this year."
)

parallel_chat_structured(
  chat,
  prompts,
  type = type_person
)
# A tibble: 6 × 2
  name     age
  <chr>  <dbl>
1 Alex      42
2 <NA>      NA
3 Li Wei    19
4 Fatima    35
5 Robert    51
6 Kwame     50
  • Sometimes, really small models (like gemma4:e4b) really are sub-par…
10:00
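
One way to handle a row like row 2 is to find the prompts that came back as `NA` and send only those again. A minimal sketch: `result` below is re-typed from the printed output above so the example runs without a model, and the retry call is left commented out because it needs a live `chat` object.

```r
# Prompts from the task above.
prompts <- list(
  "I go by Alex. 42 years on this planet and counting.",
  "Pleased to meet you! I'm Jamal, age 27.",
  "They call me Li Wei. Nineteen years young.",
  "Fatima here. Just celebrated my 35th birthday last week.",
  "The name's Robert - 51 years old and proud of it.",
  "Kwame here - just hit the big 5-0 this year."
)

# Result re-typed from the printed tibble above (row 2 failed to parse).
result <- data.frame(
  name = c("Alex", NA, "Li Wei", "Fatima", "Robert", "Kwame"),
  age  = c(42, NA, 19, 35, 51, 50)
)

# Indices of rows where either field is missing.
failed <- which(is.na(result$name) | is.na(result$age))
failed

# The prompts that need another attempt.
unlist(prompts[failed])

# With a live chat object, retry just those prompts:
# retry <- parallel_chat_structured(chat, prompts[failed], type = type_person)
```

Retrying only the failed prompts keeps the extra wait short, which matters when each local request is slow.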

The system.time({}) factor…

On my M4 Pro MacBook Pro with 24 GB RAM,

chat_local <- chat_ollama(model = "gemma4:e4b")
system.time({
  parallel_chat_structured(
    chat_local,
    prompts,
    type = type_person
  )
})
   user  system elapsed 
  0.068   0.020  52.077 


  • When not rate-limited, this request on a cloud-based LLM should only take ~1 second.
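
In the `system.time()` output above, `user` and `system` are CPU time used by the R process, while `elapsed` is wall-clock time. R itself does almost no work here; it mostly waits for Ollama to respond. A small sketch of the same pattern, using `Sys.sleep()` as a stand-in for waiting on a model, plus the per-prompt arithmetic for the timing above:

```r
# Sys.sleep() stands in for waiting on a model:
# very little CPU time, but the elapsed (wall-clock) time still adds up.
timing <- system.time(Sys.sleep(0.5))
timing[["elapsed"]]    # about 0.5 seconds of wall-clock time
timing[["user.self"]]  # close to 0: R was idle while waiting

# Average wall-clock time per prompt for the 6 prompts timed above.
elapsed_total <- 52.077
n_prompts <- 6
elapsed_total / n_prompts  # roughly 8.7 seconds per prompt
```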

Review

  • Now, we can chat with cloud-based and local LLMs in R via ellmer

  • We can even structure the type of data we’d like to get back!

  • And create tools, which let the LLMs perform specific tasks related to our computer or R session.

  • To make this even more powerful, we need to give the LLMs more context about our R environment.

    • Ideas?
    • We could write a bunch of tools…
    • Or…

ellmer Extensions

ellmer Extensions

  • A team at Posit has been working to make interesting R packages that allow you to interact with LLMs in R

  • Today, we’ll explore a couple of these R packages

btw: Provide context to LLMs




  • btw provides context about your R environment to LLMs through a variety of mediums:
    • Quickly copy context to your computer’s clipboard,
    • Through an interactive chat in your IDE, and
    • In ellmer or ellmer-like chats.

btw: copy-paste context

Provide context for a data.frame:

park_trees <- pdxTrees::get_pdxTrees_parks()

# install.packages("btw")
library(btw)


btw(park_trees)
#> ✔ btw copied to the clipboard!


Provide context for a data.frame, and documentation:

btw(park_trees, ?pdxTrees::get_pdxTrees_parks)
#> ✔ btw copied to the clipboard!


Even provide the question you’d like to ask.

btw(park_trees, ?pdxTrees::get_pdxTrees_parks, "What does the Crown_Width_NS variable measure? And in what units?")
#> ✔ btw copied to the clipboard!
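
To see what btw actually puts on the clipboard, you can capture the context instead of copying it. This is a hedged sketch: it assumes `btw()` accepts a `clipboard = FALSE` argument and returns the collected context as an object you can print, and it uses the built-in `mtcars` data so it runs without `pdxTrees`.

```r
# Capture the context btw would normally copy to the clipboard.
# ASSUMPTION: btw() has a `clipboard` argument; check ?btw::btw if this errors.
ctx <- tryCatch(
  btw::btw(mtcars, clipboard = FALSE),
  error = function(e) {
    message("btw() call failed: ", conditionMessage(e))
    NULL
  }
)

# Inspect the text that would be pasted into a chat.
if (!is.null(ctx)) print(ctx)
```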

Pass context to ellmer chats with btw

chat <- chat_ollama(model = "gemma4:e4b")

chat$chat(
  btw(park_trees, 
      ?pdxTrees::get_pdxTrees_parks, 
      "What does the Crown_Width_NS variable measure? And in what units?")
)
Based on the documentation provided for the `pdxTrees_parks` dataset, the 
variable **Crown_Width** measures:

*   **The width of the tree canopy from North to South** (North to South 
width).

While the documentation does not explicitly state the units (e.g., feet, 
meters), this variable represents the physical horizontal dimension of the 
tree.
  • Passing the direct context you’d like to reference can be really helpful for getting quick responses from local LLMs,

  • But if you want btw to automatically find your context…

Provide context in ellmer with btw

chat <- chat_ollama(model = "gemma4:e4b")

chat$register_tools(
  btw_tools("env")
  # alternatively, call:
  # btw_tools()
  # to allow btw to use all of its built-in tools
)

chat$chat("What is in my global environment?")
#> ◯ [tool call] btw_tool_env_describe_environment(`_intent` = "list global environment
#> contents")
#> ● #> ## Context
#>   #>
#>   #>
#>   #> park_trees
#>   #> ```json
#>   #> …
#> The global environment contains one data frame named `park_trees`.
#> 
#> This data frame contains **25,534 rows** and **34 columns**. It appears to be an inventory of
#> ...
  • Providing a tool group (in this case: “env”) stops the local model from getting overwhelmed by too much context.

  • Let’s take a look at the documentation for btw_tools().
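
To get a feel for how much a tool group narrows things down, you can compare the number of tools returned by the two calls shown above. A minimal sketch, assuming `btw_tools()` returns a list of tools (so `length()` counts them); the exact counts depend on your btw version, so none are shown here.

```r
# Tools in the "env" group only, versus every built-in btw tool.
# tryCatch() keeps the example from stopping if btw is not installed.
env_tools <- tryCatch(btw::btw_tools("env"), error = function(e) NULL)
all_tools <- tryCatch(btw::btw_tools(), error = function(e) NULL)

if (!is.null(env_tools) && !is.null(all_tools)) {
  cat("Tools in the 'env' group:", length(env_tools), "\n")
  cat("All built-in tools:      ", length(all_tools), "\n")
}
```

Each registered tool adds to what the model has to read before answering, which is why a smaller group helps a small local model.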

There’s also an interactive IDE chat

chat <- chat_google_gemini()
btw_app(client = chat)

gander

  • …also provides context to LLMs in R



gander is a higher-performance and lower-friction chat experience for data scientists in RStudio and Positron–sort of like completions with Copilot, but it knows how to talk to the objects in your R environment.

The package brings ellmer chats into your project sessions, automatically incorporating relevant context and streaming their responses directly into your documents.

  • Let’s take a look!

Installing gander

# first, install `gander`:
install.packages("gander")

Next, pick a model to power gander:

(a): for your current R session,

options(gander.chat = ellmer::chat_ollama(model = "gemma4:e4b"))


or (b): as your default for all R sessions:

usethis::edit_r_profile()
# and then paste: 
# options(gander.chat = ellmer::chat_ollama(model = "gemma4:e4b"))
# into the loaded file. Restart R for this to take effect.
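
After either route, you can confirm that the option is set in your current session. A small base R sketch; the helper name `gander_is_configured()` is hypothetical and not part of gander.

```r
# TRUE if a chat has been configured for gander in this session.
# getOption() returns NULL when the option has not been set.
gander_is_configured <- function() {
  !is.null(getOption("gander.chat"))
}

if (gander_is_configured()) {
  message("gander.chat is set.")
} else {
  message("gander.chat is not set; run options(gander.chat = ...) first.")
}
```

If you chose route (b) and this still reports that the option is not set, restart R so that your `.Rprofile` is re-read.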

Installing gander

Finally, set up a shortcut to gander:

(a): In RStudio, navigate to Tools > Modify Keyboard Shortcuts > Search "gander". The package authors suggest Ctrl+Alt+G (or Ctrl+Cmd+G on macOS), or


(b): In Positron, you’ll need to open the command palette, run “Open Keyboard Shortcuts (JSON)”, and paste the following into your keybindings.json:

    {
        "key": "Ctrl+Cmd+G",
        "command": "workbench.action.executeCode.console",
        "when": "editorTextFocus",
        "args": {
            "langId": "r",
            "code": "gander::gander_addin()",
            "focus": true
        }
    }

gander demo

Some thoughts

  • Local models provide a more secure (and more ethical?) way to interact with LLMs in R

  • In their current state, local models are still slow, especially when given excessive context.

  • Providing context to LLMs (local or cloud-based) through tools like btw and gander can make them much more useful in a data science workflow.

  • In Problem Set 6, you’ll get to reflect on LLM-usefulness for debugging, compared to the other debugging approaches we’ve learned so far.

Your turn!

In groups of 2 - 3, go through the activity for today.

30:00

Next time

  • Learn to write our own R packages!