After restarting the MCP server (or restarting Claude Desktop), paste each prompt below into Claude Desktop one at a time. Expected behavior is described after each prompt.
Test 1: Auth check
Prompt:
Are you connected to Brightspace?
Expected: Claude calls auth_status.
Response shows "authenticated": true. If false, run
bs_auth() in an interactive R session first.
Test 2: Dataset discovery
Prompt:
What datasets are available? List them for me.
Expected: Claude calls list_datasets.
You see a list of all BDS dataset names with descriptions. Should be
dozens of datasets.
Test 3: Keyword search
Prompt:
Find any datasets related to grades.
Expected: Claude calls search_datasets
with keyword “grade” (or similar). Returns a filtered list – should
include “Grade Results” and possibly others.
Test 4: Dataset description with column stats
Prompt:
Describe the Users dataset. What columns does it have?
Expected: Claude calls describe_dataset
with name “Users”. Response should show:
- Row count and column count
- Per-column summaries (not sample rows):
  - Numeric columns show min/max/mean
  - Character columns show n_unique and top 3 values
  - Date columns show min/max date range
- Footer suggesting execute_r
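The per-column summaries listed above could be computed along these lines. This is a minimal base-R sketch, not the server's actual code; `summarise_column()` and the `demo` data frame are hypothetical names introduced for illustration.

```r
# Hypothetical helper: summarise one column by type, without returning any rows.
summarise_column <- function(x) {
  if (inherits(x, "Date")) {
    list(type = "date", min = min(x, na.rm = TRUE), max = max(x, na.rm = TRUE))
  } else if (is.numeric(x)) {
    list(type = "numeric",
         min = min(x, na.rm = TRUE),
         max = max(x, na.rm = TRUE),
         mean = mean(x, na.rm = TRUE))
  } else {
    # Frequency table sorted so the most common values come first.
    counts <- sort(table(x), decreasing = TRUE)
    list(type = "character",
         n_unique = length(counts),
         top = names(head(counts, 3)))
  }
}

# Made-up data, used only to exercise the helper.
demo <- data.frame(
  score = c(10, 20, 30),
  role = c("Student", "Student", "Instructor"),
  joined = as.Date(c("2024-01-01", "2024-02-01", "2024-03-01")),
  stringsAsFactors = FALSE
)
summary_list <- lapply(demo, summarise_column)
```

Returning summaries rather than rows keeps the response small, which is the property this test is checking.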
Test 5: Simple execute_r – scalar result
Prompt:
How many users are there in total?
Expected: Claude calls execute_r with
something like:
nrow(bs_get_dataset("Users"))
Returns a single number. No raw data transfer.
Test 6: execute_r – data frame result
Prompt:
Show me the top 10 most common role names in the User Enrollments dataset.
Expected: Claude calls execute_r with a
dplyr pipeline like:
bs_get_dataset("User Enrollments") %>%
  count(role_name, sort = TRUE) %>%
  head(10)
Returns a compact text table (not JSON, not thousands of rows).
Test 7: execute_r – persistent workspace
Prompt:
Now filter those enrollments to just Students and tell me how many there are.
Expected: Claude uses variables from the previous call (or re-loads and filters). The key test is that Claude can reference or build on prior work. Should return a count.
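One way a persistent workspace can behave is sketched below: every call is evaluated in the same long-lived environment, so objects created by one call are visible to the next. This assumes nothing about the server's real implementation; `workspace` and `run_in_workspace()` are hypothetical names.

```r
# Hypothetical long-lived workspace shared by successive execute_r-style calls.
workspace <- new.env()

# Hypothetical stand-in for the tool: evaluate code text inside the workspace.
run_in_workspace <- function(code) {
  eval(parse(text = code), envir = workspace)
}

# First call creates an object; the second call builds on it.
run_in_workspace("roles <- c('Student', 'Student', 'Instructor')")
n_students <- run_in_workspace("sum(roles == 'Student')")
```

If the second call failed with an "object not found" error, the workspace would not be persisting between calls.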
Test 8: Interactive Chart.js chart
Prompt:
Create an interactive bar chart showing enrollment counts by role.
Expected: Claude should:
- Call execute_r to compute the counts (e.g., count(role_name, sort = TRUE))
- Build a self-contained HTML string with Chart.js from CDN
- Write it with write_chart(html, 'chart_name.html') to the output directory
- Call browseURL() to open it
The response should include the HTML file path. Opening it shows an interactive bar chart with tooltips and hover effects.
Test 8b: Static ggplot fallback
Prompt:
Use ggplot to create a bar chart of enrollment counts by role. Return the plot object.
Expected: Claude calls execute_r with
ggplot code (no ggsave()). The server detects the ggplot
object and saves it as PNG + HTML viewer. Response should include:
- A text line like “Plot saved: Enrollments by Role (1 layer)”
- File paths for the PNG and HTML viewer in the output directory
This tests the static chart fallback for when Chart.js is not suitable.
Test 9: get_data_summary – basic
Prompt:
Give me a quick summary of the Grade Results dataset.
Expected: Claude calls get_data_summary
with dataset “Grade Results”. Returns row/column counts and per-column
statistics. Footer suggests execute_r for custom
analysis.
Test 10: get_data_summary – with filter
Prompt:
Summarize the User Enrollments dataset, but only for the “Student” role.
Expected: Claude calls get_data_summary
with:
- dataset: “User Enrollments”
- filter_by: {"role_name": "Student"}
Returns stats for the filtered subset only. Row count should be less than the full dataset.
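To make the filter_by semantics concrete, here is a small base-R sketch that applies a named list of column/value pairs to a data frame. Exact-match filtering is an assumption about how the tool interprets the argument, and `apply_filter()` and the sample data are hypothetical.

```r
# Hypothetical helper: keep rows where each named column matches the given value(s).
apply_filter <- function(df, filter_by) {
  for (col in names(filter_by)) {
    df <- df[df[[col]] %in% filter_by[[col]], , drop = FALSE]
  }
  df
}

# Made-up enrollments, used only to exercise the helper.
enrollments <- data.frame(
  role_name = c("Student", "Instructor", "Student"),
  org_unit_id = c(1, 1, 2),
  stringsAsFactors = FALSE
)
students <- apply_filter(enrollments, list(role_name = "Student"))
```

The filtered frame has fewer rows than the input, which mirrors the row-count check described for this test.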
Test 11: get_data_summary – with group_by
Prompt:
Break down User Enrollments by role_name. How many of each role are there?
Expected: Claude calls get_data_summary
with:
- dataset: “User Enrollments”
- group_by: ["role_name"]
Returns group counts sorted by frequency (Student, Instructor, etc.). May also show numeric column means per group.
Test 12: execute_r – join and analyze
Prompt:
Join the Users and User Enrollments datasets. How many courses is each user enrolled in on average?
Expected: Claude calls execute_r with
something like:
users <- bs_get_dataset("Users")
enrollments <- bs_get_dataset("User Enrollments")
joined <- bs_join(users, enrollments)
joined %>%
  group_by(user_id) %>%
  summarise(n_courses = n_distinct(org_unit_id)) %>%
  summarise(
    mean_courses = mean(n_courses),
    median_courses = median(n_courses)
  )
Returns a small summary table.
Test 13: execute_r – error handling
Prompt:
Run this R code:
nonexistent_function(123)
Expected: Claude calls execute_r. The
result should have isError: true with a message like “could
not find function ‘nonexistent_function’”. Claude should explain the
error gracefully.
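The behaviour described here, an error reported as a structured result rather than a crash, can be sketched with tryCatch(). This is an illustration only; `safe_eval()` is a hypothetical name, and the result fields simply mirror the isError/message shape mentioned above.

```r
# Hypothetical wrapper: evaluate code and report failures as data, not crashes.
safe_eval <- function(code) {
  tryCatch(
    list(isError = FALSE, value = eval(parse(text = code), envir = new.env())),
    # Parse errors and runtime errors both land here.
    error = function(e) list(isError = TRUE, message = conditionMessage(e))
  )
}

result <- safe_eval("nonexistent_function(123)")
```

Here `result$isError` is TRUE and `result$message` carries R's own "could not find function" text, which is what Claude is expected to relay.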
Test 14: Multi-step analysis (integration test)
Prompt:
I want a complete analysis of grade performance. First describe the Grade Results dataset so I understand the columns, then show me the distribution of final grades as a histogram, and finally give me the mean grade broken down by org_unit_id (show the top 10).
Expected: Claude makes 3+ tool calls in sequence:
- describe_dataset or execute_r to explore columns
- execute_r with ggplot histogram – inline image appears
- execute_r with grouped summary – text table of top 10
This tests the full workflow: discover, visualize, summarize.
Test 15: Removed tools are gone
Prompt:
Use the get_dataset tool to download the Users table.
Expected: Claude should NOT have access to
get_dataset (it was removed). Instead it should either:
- Use execute_r to load the data: bs_get_dataset("Users") %>% head(20)
- Or use get_data_summary for stats
- If it tries get_dataset, the server returns “Unknown tool” error
Test 16: list_schemas
Prompt:
What schemas are registered and what are their key columns?
Expected: Claude calls list_schemas.
Returns a list of schema names with their key columns (the foreign keys
used by bs_join()).
Test 17: AST code inspection – blocked code
Prompt:
Run this R code:
system("whoami")
Expected: Claude calls execute_r. The
result should have isError: true with a message listing
“system” as a blocked construct. The code is never
executed – it is rejected before eval. Claude should explain
that shell commands are blocked by the safety policy.
Test 18: AST code inspection – blocked package access
Prompt:
Run this R code:
httr::GET("http://example.com")
Expected: Claude calls execute_r. The
result should reject the code, citing httr::GET as blocked.
Direct access to packages like httr, curl,
brightspaceR, jsonlite, and
config is not permitted via execute_r.
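The two rejection cases above (a blocked function and a blocked package prefix) can be illustrated with a parse-then-scan check. This is a rough sketch of the general technique, not the server's inspector: `find_blocked()` is hypothetical, `system2` is an assumed addition to the deny-list, and the package list is taken from the text above.

```r
# Deny-lists for illustration only; the real policy may differ.
blocked_functions <- c("system", "system2")
blocked_packages <- c("httr", "curl", "brightspaceR", "jsonlite", "config")

# Hypothetical checker: parse without evaluating, then scan the symbols used.
find_blocked <- function(code) {
  exprs <- parse(text = code)     # builds the syntax tree; nothing is executed
  symbols <- all.names(exprs)     # every function and variable name in the code
  hits <- intersect(symbols, blocked_functions)
  # Only treat package names as blocked when a namespace operator is present.
  if (any(c("::", ":::") %in% symbols)) {
    hits <- c(hits, intersect(symbols, blocked_packages))
  }
  hits
}

shell_hits <- find_blocked('system("whoami")')
pkg_hits <- find_blocked('httr::GET("http://example.com")')
safe_hits <- find_blocked("mean(c(1, 2, 3))")
```

A plain symbol scan like this is deliberately coarse: it would also flag a variable that happens to be named `system`, so a production inspector would likely walk the call tree more carefully.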
Test 19: AST code inspection – safe code passes
Prompt:
Run this R code:
mtcars %>% filter(cyl == 6) %>% summarise(mean_mpg = mean(mpg))
Expected: The code executes successfully and returns a one-row tibble with the mean MPG for 6-cylinder cars. Standard dplyr, ggplot2, and arithmetic operations are not blocked.
Test 20: PII field policy – Users dataset
Prompt:
Describe the Users dataset. What columns does it have?
Expected: Claude calls
describe_dataset. The response should list columns like
UserId, Organization, IsActive, SignupDate, but should
not include FirstName, LastName, ExternalEmail,
UserName, MiddleName, or OrgDefinedId. These PII columns are stripped by
the field policy before the data reaches the workspace.
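A field policy of this kind amounts to dropping a fixed set of columns before the data is exposed. The sketch below uses the column names listed in this test; `strip_pii()` is a hypothetical helper, and the real policy may cover more fields or work differently.

```r
# Column names taken from the Users test above; the real policy may list more.
pii_columns <- c("FirstName", "LastName", "ExternalEmail",
                 "UserName", "MiddleName", "OrgDefinedId")

# Hypothetical helper: drop any policy-listed column that is present.
strip_pii <- function(df, blocked = pii_columns) {
  df[, !(names(df) %in% blocked), drop = FALSE]
}

# Made-up rows, used only to exercise the helper.
users <- data.frame(
  UserId = 1:2,
  FirstName = c("A", "B"),
  IsActive = c(TRUE, FALSE),
  stringsAsFactors = FALSE
)
clean <- strip_pii(users)
```

After stripping, only `UserId` and `IsActive` remain, which matches the shape of column list this test expects to see.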
Test 21: PII field policy – Grade Results
Prompt:
Describe the Grade Results dataset.
Expected: The column list should include GradeObjectId, OrgUnitId, UserId, PointsNumerator, etc., but should not include Comments or PrivateComments.
Test 22: Audit log exists
Prompt:
Run this R code:
file.exists(file.path(output_dir, "mcp_audit.jsonl"))
Expected: Returns TRUE. The audit log
file is created at server startup and records every tool call. After
running several tests, you can inspect it directly.
Each line is a JSON object with timestamp, tool name, arguments, and status fields.
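For a quick look at recent entries from an ordinary R session, something like the following base-R sketch may be enough. `tail_audit_log()` is a hypothetical helper, and the path in the commented usage line assumes `output_dir` is defined as in the test above.

```r
# Hypothetical helper: read the JSON Lines audit log and keep the last n entries.
tail_audit_log <- function(path, n = 5) {
  entries <- readLines(path, warn = FALSE)
  entries <- entries[nzchar(entries)]   # ignore blank lines
  tail(entries, n)
}

# Usage (assumes output_dir points at the server's output directory):
# tail_audit_log(file.path(output_dir, "mcp_audit.jsonl"))
```

Each returned element is one raw JSON line, so it can be passed to a JSON parser of your choice if you need the individual fields.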
Test 23: ID pseudonymisation – UserId is hashed
Prompt:
Run this R code:
bs_get_dataset("Users") %>% select(UserId) %>% head(5)
Expected: The UserId column should contain
pseudonymised values like usr_a3f2b1c8, not raw integers.
Every value should start with usr_ followed by 8 hex
characters. This confirms that person-referencing IDs are hashed before
reaching the workspace.
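The format check can be automated with a regular expression instead of eyeballing five rows. This is a small sketch; `is_pseudonymised()` is a hypothetical helper, and restricting the match to lowercase hex is an assumption based on the usr_a3f2b1c8 example.

```r
# Pattern from the expected output: "usr_" followed by exactly 8 hex characters.
# Lowercase hex is an assumption based on the example value above.
is_pseudonymised <- function(ids) {
  grepl("^usr_[0-9a-f]{8}$", ids)
}

# Made-up values: one well-formed ID, one raw integer, one malformed ID.
checks <- is_pseudonymised(c("usr_a3f2b1c8", "12345", "usr_xyz"))
```

Wrapping the call in `all()` and applying it to the full UserId column would turn this test into a single TRUE/FALSE result.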
Troubleshooting
Server won’t start:
- Check claude_desktop_config.json has the right path to server.R
- Ensure cwd points to the directory with config.yml
- Run Rscript /path/to/server.R manually to see stderr errors
Auth fails:
- Open R interactively, run library(brightspaceR); bs_auth() to get a token
- The token is cached; the MCP server reuses it
Plots don’t render:
- Ensure the png() graphics device is available (it should be on all standard R installs)
- Check stderr for errors from grDevices::png()
Results are truncated: