Project memo python -home-son-prj-thesis

Marimo Cell-Generation Gotchas

Things to watch for when generating Marimo cells; many of the problems come from the LLM integration.

marimo, python, llm, frontend, debugging

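Pitfalls that come up often when generating Marimo .py notebooks (papergen template).

No return inside a cell body: Marimo strips the def _(...): wrapper before executing, so return becomes a top-level statement and raises SyntaxError. Assign _view = ... in if/else and leave a bare _view as the last line.
@mo.state is not a decorator: it is a function, mo.state(initial) -> (getter, setter). LLMs often misuse it as a decorator. For caching, use from functools import cache; @cache.
Triple-quoted strings in LLM-generated viz code: they break our indenter and corrupt the markdown output. Forbidden in the prompt.
Headless chromium (snap) cannot open a WebSocket: AppArmor blocks the Marimo WS connection, so the page shows only the pink loading screen. Verify with a WS probe.
Marimo widgets cannot be defined conditionally: inside an if branch the widget ID changes and reactivity breaks. Define widgets in a cell that always runs.
mo.state setter from on_click (0.23.4): the handler fires but downstream cells do not re-render. Switched all navigation to <a href="?page=N"> with full page reloads.
Embedded base64 cache > ~3MB: breaks the Marimo WS with an AssertionError because the kernel-ready payload is too large. Moved the viz cache from base64 in source to PNGs on disk + nginx static + a small URL dict.
LLM schema default quirk: slider params often come back with default: "", which triggers Value out of bounds. Clamp to [min, max]. Downstream validation is required even when the LLM fills a Pydantic schema.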

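When LLM-generated code errors, check this list first. Most issues are fixed by editing the prompt.

Takeaways

Do not use a `return` statement inside a Marimo cell body.
`@mo.state` is not a decorator; LLMs often get this wrong.
LLM-generated code always needs validation, especially schema `default` values.
Defining Marimo widgets conditionally breaks reactivity.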

Original file: .claude/projects/-home-son-prj-thesis/memory/marimo_gotchas.md

---
name: Marimo cell-generation gotchas (papergen specific)
description: Subtle marimo behaviors that have bitten papergen template generation - record so future iterations don't re-hit them
type: project
originSessionId: a80e3f6f-bf8c-4ef5-93e8-e5c44239beaa
---
When generating marimo `.py` notebooks programmatically (papergen template), these traps have come up repeatedly:

1. **No early `return` inside cell body**. Marimo strips the `def _(...):` wrapper before exec'ing the cell, so `return` becomes a top-level statement and raises `SyntaxError: 'return' outside function`. Use `if/else` to assign `_view = ...` and put bare `_view` as the last expression instead.

2. **`@mo.state` is NOT a decorator**. It is a function `mo.state(initial) -> (getter, setter)`. LLMs *love* to write `@mo.state\ndef cached_load(): ...` for caching. We had to add an explicit prompt rule against this. For caching inside cell bodies, use `from functools import cache; @cache`.

3. **Triple-quoted strings inside LLM-generated viz code break our indenter**. We add 4-space prefix per line to wrap code in `def _run(...):`; that prefix lands inside `"""..."""` string content too, corrupting markdown output. Forbidden in prompt.

4. **Headless chromium (snap) can't establish WebSocket** to marimo on this LAN due to AppArmor. The page just shows the pink loading gradient. WS works fine for a Python `websockets` client and for any real browser. Don't trust headless screenshots — verify cells via WS probe (count cell-ops by mimetype, look for `marimo-error` channel).

5. **Marimo widgets must be defined in a cell that always runs** (not inside conditional `if` branches), otherwise the widget identity changes on every nav/state change and reactive plumbing breaks. Pattern: dedicated `@app.cell` per widget set that just defines and returns; display happens in another cell that conditionally renders.

6. **`mo.query_params()` is reactive**. Reading `qp.get("page")` creates a dependency; calling `qp.set("page", "2")` triggers re-execution of cells that read it. This is what makes the paginated `?page=N` UX work without a full page reload.

7. **`/thesis/<id>/page/N` style URLs**: marimo only knows about query params. We do an nginx 302 redirect from `/page/N` to `?page=N` so users can type the cleaner path. After interaction, the URL bar settles to `?page=N` (marimo's `qp.set` doesn't rewrite the pathname).

8. **mo.state setter from `on_click` doesn't reliably re-run downstream cells in marimo 0.23.4** — we tried, the handler fired but cells didn't re-render with new state (logs showed "from page 0" repeatedly even after `set_page(1)`). Switched the whole nav to plain HTML `<a href="?page=N">` with full page reloads. Bullet-proof but slow without prerender. Also bypassed `mo.state` for hamburger menu toggle (used native HTML `<details>` instead).

9. **Embedded base64 cache > ~3MB blows up marimo WS** — `AssertionError: waiter is None or waiter.cancelled()` when the kernel-ready payload is too large. We moved viz cache from base64-in-source to PNG files on disk + nginx static + tiny URL dict in the cell.

10. **OpenAI gpt-5 quirks for our schema**: it sometimes emits `default: ""` (empty string) for slider params, which then becomes `value=0` in `mo.ui.slider(start=1, ...)` → `Value out of bounds`. Both `_num_default` (template) and `_coerce_default` (prerender) now clamp to `[min, max]`. Dropdown defaults that aren't in `options` fall back to `options[0]`. Lesson: every LLM that fills a Pydantic schema needs validation downstream; don't trust that `default` is well-formed.

11. **OpenAI structured output API**: `client.beta.chat.completions.parse(model=..., messages=..., response_format=PydanticModel)` returns `resp.choices[0].message.parsed` as a typed instance. Way nicer than Gemini's `response_schema=Model` followed by a manual `Model.model_validate_json(text)`.

12. **Gemini 2.5-pro 503 UNAVAILABLE under load** — caused a hard fail mid-pipeline. We added retries + a verify→fallback-to-draft path, but eventually the user just told us to drop Gemini and use OpenAI. Lesson: any single-LLM pipeline needs retry + graceful fallback for transient overload.
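
For item 1 (no early `return`), the failure can be reproduced without marimo. The sketch below is an approximation, not marimo's actual loader: `cell_body_compiles` is a hypothetical helper that compiles a cell body as top-level code, which is roughly the situation once the `def _(...):` wrapper is gone.

```python
import textwrap

def cell_body_compiles(body: str) -> bool:
    """Return True if `body` compiles as top-level code (no enclosing def)."""
    try:
        compile(textwrap.dedent(body), "<cell>", "exec")
        return True
    except SyntaxError:
        return False

# Early return: fine inside a function, illegal once the wrapper is gone.
bad_body = """
if data is None:
    return "no data"
"""

# Assign in both branches and end with a bare expression instead.
good_body = """
if data is None:
    _view = "no data"
else:
    _view = f"{len(data)} rows"
_view
"""

print(cell_body_compiles(bad_body))   # False
print(cell_body_compiles(good_body))  # True
```

A check like this could run on generated cell bodies before they are written into the notebook, so the error surfaces at generation time rather than in the browser.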
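
For item 2 (`mo.state` is not a decorator), a minimal sketch of the caching pattern the note recommends. `cached_load` and the `calls` counter are illustrative names; the counter exists only to make the caching visible.

```python
from functools import cache

calls = {"n": 0}  # counts real loads so the caching is observable

@cache
def cached_load(path: str) -> str:
    """Stand-in for an expensive load; the body runs once per distinct path."""
    calls["n"] += 1
    return f"contents of {path}"

first = cached_load("data.csv")
second = cached_load("data.csv")  # served from the cache, no second load

# For contrast (comment only): per item 2, mo.state is called as a function,
# not applied as a decorator:
#     get_page, set_page = mo.state(0)
```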
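
For item 3 (triple-quoted strings vs. the indenter), the corruption is easy to demonstrate. The sketch assumes the indenter behaves like `textwrap.indent` with a 4-space prefix; `wrap_in_run` and the sample `viz_code` are hypothetical stand-ins for the papergen code.

```python
import textwrap

# Hypothetical LLM output: a multi-line triple-quoted markdown string.
viz_code = 'text = """# Title\nbody line"""\nresult = text'

def wrap_in_run(code: str) -> str:
    """Naive indenter: prefix every line with 4 spaces under def _run()."""
    return "def _run():\n" + textwrap.indent(code, "    ") + "\n    return result\n"

namespace = {}
exec(wrap_in_run(viz_code), namespace)
wrapped_text = namespace["_run"]()

# The prefix landed inside the string literal, so the second line of the
# markdown now starts with 4 spaces.
print(repr(wrapped_text))
```

In markdown, a line indented by 4 spaces can render as a code block, which is one way the output ends up visibly broken.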
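
For item 9 (large base64 cache), a minimal sketch of moving the cache out of the notebook source. `externalize_cache`, the `/static/viz` prefix, and the cache keys are assumptions for illustration; the real directory layout and nginx location are not given in the note. Keys are assumed to be filename-safe.

```python
import base64
import tempfile
from pathlib import Path

def externalize_cache(b64_cache: dict, out_dir: Path, url_prefix: str) -> dict:
    """Write each base64-encoded PNG to out_dir and return {key: url}."""
    out_dir.mkdir(parents=True, exist_ok=True)
    urls = {}
    for key, b64 in b64_cache.items():
        (out_dir / f"{key}.png").write_bytes(base64.b64decode(b64))
        urls[key] = f"{url_prefix}/{key}.png"
    return urls

# Fake payload standing in for a rendered figure (PNG signature + padding).
fake_png = b"\x89PNG\r\n\x1a\n" + b"\x00" * 64
cache = {"fig1": base64.b64encode(fake_png).decode("ascii")}

with tempfile.TemporaryDirectory() as tmp:
    url_map = externalize_cache(cache, Path(tmp), "/static/viz")
    written = (Path(tmp) / "fig1.png").read_bytes()
```

Only `url_map` would need to live in the cell, so the payload sent over the WebSocket stays small regardless of how many figures are cached.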
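
For item 10 (malformed `default` values), a minimal sketch of the downstream validation. This is not the actual `_num_default` or `_coerce_default` code, which the note does not show; `clamp_default` and `dropdown_default` are hypothetical, and falling back to the lower bound for unparseable input is an assumption.

```python
def clamp_default(raw, lo: float, hi: float) -> float:
    """Coerce an LLM-supplied slider default to a number inside [lo, hi]."""
    try:
        value = float(raw)
    except (TypeError, ValueError):
        value = lo  # "", None, or other junk: fall back to the lower bound
    return max(lo, min(hi, value))

def dropdown_default(raw, options: list):
    """Keep the default only if it is a real option, else use the first one."""
    return raw if raw in options else options[0]

print(clamp_default("", 1, 10))           # 1
print(clamp_default(50, 1, 10))           # 10
print(dropdown_default("z", ["a", "b"]))  # a
```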
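
For item 12 (transient overload), a minimal sketch of retry with exponential backoff plus a fallback. All names here (`TransientError`, `call_with_retry`, `flaky_verify`, `always_down`) are hypothetical; a real pipeline would catch the provider SDK's own exception type and use a non-zero `base_delay`.

```python
import time

class TransientError(Exception):
    """Stand-in for a retryable failure such as HTTP 503 UNAVAILABLE."""

def call_with_retry(primary, fallback, retries: int = 3, base_delay: float = 0.0):
    """Try primary() up to `retries` times with backoff, then use fallback()."""
    for attempt in range(retries):
        try:
            return primary()
        except TransientError:
            time.sleep(base_delay * (2 ** attempt))  # 1x, 2x, 4x, ...
    return fallback()

attempts = {"n": 0}

def flaky_verify():
    """Fails twice, then succeeds, to simulate a provider recovering."""
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise TransientError("503 UNAVAILABLE")
    return "verified"

def always_down():
    raise TransientError("503 UNAVAILABLE")

recovered = call_with_retry(flaky_verify, lambda: "draft")
degraded = call_with_retry(always_down, lambda: "draft")
```

Here `recovered` comes from the primary call on its third attempt, while `degraded` takes the fallback path, mirroring the verify→fallback-to-draft behavior described in item 12.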

**How to apply:** Whenever the LLM-generated reproduction or viz code starts erroring, re-check this list before changing the template. Most issues are prompt-fixable.