Dated audit snapshots (2026-06-05) cross-checking every concrete name /
RPC / routing rule / table row in ARCHITECTURE.md and OVERVIEW.md against
the implementation. Kept as research artifacts under plans/research/.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Record what shipped per phase (service-path + EntityQuery request/response),
what stays deferred (subscription/push primitive, todo, browse_media
media-source caveat), the deviations (search via async_internal_search_media,
JSON-safe sandbox response, callerless raise_not_proxied retained), and the
green verification summary lines.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Reflect the shipped request/response query RPCs across the docs: the
server-side query + WS-only mutation entity APIs now answer with real data
(service-path return_response + the generic entity_query RPC), so the
catalogue's status column, ARCHITECTURE §8/§14, OVERVIEW's "still open"
bullet, and the CLAUDE.md follow-up all move from "raises" to "wired". Kept
accurate as still-open: the subscription/push primitive (the */subscribe
one-shot-only rows + todo item-list push) and the media_player.browse_media
media-source caveat.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Round-trip rebuild tests for SearchMedia and Segment (the as_dict /
dataclass-asdict-vs-constructor asymmetry), per-op EntityQuery proxy tests
(media search, release notes, vacuum segments, calendar update/delete) that
assert the rebuilt typed object and the forwarded method + args, and the two
error paths: a sandbox-side ServiceValidationError translating to a
HomeAssistantError on main, and a closed channel degrading to a clean
HomeAssistantError. Client-side coverage for the EntityQuery handler:
method invocation + kwarg passing, unknown entity_id, unknown method, and a
raising method propagating its exception type on the error frame.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Replace the remaining raise_not_proxied stubs with EntityQuery forwards +
typed rebuilds, so every query-shaped entity API now answers with real data:
- media_player.async_search_media -> async_internal_search_media (which
rebuilds the SearchMediaQuery from flat kwargs on the sandbox side, so the
query crosses as plain JSON); rebuilds SearchMedia, reusing the BrowseMedia
helper for its result list.
- update.async_release_notes -> async_release_notes (plain str/None).
- vacuum.async_get_segments -> async_get_segments; rebuilds list[Segment].
- calendar.async_update_event / async_delete_event -> the matching WS-only
entity methods (None result).
The sandbox-side serialisation is the as_dict-aware JSON encoder already
added with the handler, so SearchMedia/BrowseMedia/Segment cross verbatim.
raise_not_proxied is now callerless but kept exported for the still-deferred
subscription/todo-push primitive.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The fire-and-forget call_service channel can command an entity but can't
ask it a server-side question that has no SupportsResponse service to ride.
Add one generic EntityQuery RPC for those, mirroring the call_service path
end to end (proto -> codec registry -> bridge sender + error translation ->
sandbox handler -> proxy helper):
- proto: EntityQuery {sandbox_entity_id, method, args, context_id} and
EntityQueryResult {result} (the return wrapped as {"value": ...} so
scalar/list/None are all representable). Gencode regenerated into both
_pb2 mirrors; drift guard passes.
- MSG_ENTITY_QUERY constant + REGISTRY entry added to both protocol/messages
mirrors.
- SandboxBridge.async_entity_query builds the request, remembers the context
before the id is reduced to a wire value, translates remote/closed errors
through the existing paths, and unwraps {"value": ...}.
- EntryRunner._handle_entity_query resolves the entity on the private hass,
invokes the named method with the decoded kwargs, and serialises the return
through the as_dict-aware JSON encoder; raised HA/voluptuous errors
propagate as channel error frames so main rebuilds the same shape.
- SandboxProxyEntity._entity_query is the proxy-side companion to
_call_service.
No proxy op is wired onto it yet — that is the next phase.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Wire the three query-shaped entity APIs that have a SupportsResponse
service onto the existing call_service + return_response channel, so a
sandboxed entity answers them with real data instead of raising:
- calendar.async_get_events -> calendar.get_events service, rebuilding
list[CalendarEvent] from the response (explicit field mapping, ISO
date/datetime parse — not a **dict splat).
- weather.async_forecast_{daily,hourly,twice_daily} -> weather.get_forecasts
service; Forecast is a plain TypedDict, returned verbatim.
- media_player.async_browse_media -> media_player.browse_media service,
rebuilding the recursive BrowseMedia from its frontend-shaped as_dict.
SandboxProxyEntity._call_service grows a return_response flag that decodes
the CallServiceResult response into a dict. The sandbox-side call_service
handler now runs rich service responses (e.g. a BrowseMedia object keyed by
entity_id) through the as_dict-aware JSON encoder before packing the Struct,
yielding the exact wire shape main rebuilds from.
Caveat documented at the browse_media call site: a sandboxed player's browse
surfaces only its own sources; the media_source tree is empty inside the
sandbox (media_source runs on main). Round-trip rebuild unit tests cover the
as_dict-vs-constructor asymmetry first (plan Risk #2).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Document the unproxied query/subscribe/WS-only entity APIs, their interim
raise behaviour, and the two missing primitives (request/response +
subscription RPC) in docs/query-shaped-rpcs.md. Add the implementation
plan (plan-query-rpc.md): a generic EntityQuery RPC for the service-less
ops + reuse of the existing call_service return_response path for ops
that have a SupportsResponse service. Note the media_player.browse_media
caveat (no media_source tree inside the sandbox). Cross-reference from
ARCHITECTURE/OVERVIEW/CLAUDE.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The server-side query / subscribe / WS-only-mutation entity APIs the
fire-and-forget call_service bridge can't express (calendar listings +
event update/delete, weather forecasts, media browse/search, update
release notes, vacuum segments) previously returned empty/None silently.
Add entity.raise_not_proxied and have those proxy methods raise
HomeAssistantError instead, so the gap fails loudly until a real query
RPC lands.
todo is a special case: its To-do panel reads the sync todo_items
property that also feeds TodoListEntity.state, so it can't be a query at
all. Route it to main via SANDBOX_INCOMPATIBLE_PLATFORMS and drop the
proxy (matching the camera/image precedent).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Both docs now describe the translation-forwarding subsystem in the body,
not just the goal: live pull (sandbox/get_translations RPC + provider
overlay) and the picker catalog hook.
- OVERVIEW: add a Translation forwarding section + "where to look" row +
v1-diff row. Fix pre-existing drift: ALWAYS_MAIN is 24 entries across
three groups (was listed as 6), failed-sandbox setup is SETUP_ERROR
(not SETUP_RETRY), and the manager runs no periodic ping loop.
- ARCHITECTURE: add §11 Translation forwarding (renumber following
sections), list translation.py/catalog.py in §2, and correct the core
touch surface from three to five hooks.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Translation forwarding (live pull-RPC + catalog provider) now puts the
sandboxed integration's translations on main alongside its entities,
services, and events — note it in the OVERVIEW + ARCHITECTURE goals.
Remove the generated architecture.html; the architecture is published to
a gist instead of carrying a rendered artifact in the tree.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add sandbox/docs/catalog-provider-contract.md describing the display-only
picker catalog hook: the discoverability gap it closes, the
async_register_sandbox_catalog_provider API, and the contract — separate
from the sha-pinned source resolver, name load-bearing, title_translations
optional, no validation, display-only scope, and how it complements the
Phase B live RPC for the cold picker case.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Wire the A1 catalog hook into the two display paths a sandbox-only custom
falls through today:
- async_get_integration_descriptions (loader.py): append catalog
descriptors to the custom integration/helper buckets so the add-
integration picker lists them. On-disk customs carry richer metadata,
so the disk scan wins on a domain collision.
- _async_get_component_strings (helpers/translation.py): when a domain
has no on-disk Integration (IntegrationNotFound on main), take its
"title" from the catalog — a localized title_translations[lang] if
present, otherwise degrading to the descriptor name.
Tests: catalog entry appears in descriptions with picker name + defaults
+ helper-bucket routing; on-disk custom wins a collision; title fallback
uses title_translations and degrades to name when absent.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add a separate, display-only catalog hook so a custom integration whose
code lives only in a sandbox (never on main's disk) can be listed and
named in the add-integration picker without spawning a sandbox.
Core (homeassistant/loader.py) owns the registry because core consumes
it: SandboxIntegrationDescriptor, SandboxCatalogProvider,
DATA_SANDBOX_CATALOG_PROVIDERS, async_register_sandbox_catalog_provider,
async_get_sandbox_catalog. This mirrors the Phase B translation-provider
precedent (hook + consumer co-located in core).
homeassistant/components/sandbox/catalog.py re-exports the hook so HACS
registers through a sandbox namespace parallel to the source resolver —
but the catalog stays deliberately separate from the sha-pinned, security-
critical source resolver: it is eager, enumerable and cosmetic only.
Wired into descriptions + title fallback in A2.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Implement SandboxTranslationProvider and register it into core's translation
hook from async_setup (unregistered on stop). For each requested component it:
- resolves the owning sandbox group — a loaded entry's .sandbox field wins,
else the live SandboxFlowProxy of a brand-new custom's in-progress flow
(new sandbox_group accessor on the proxy);
- carves out built-ins (Integration.is_built_in ⇒ main reads its byte-identical
disk copy, never the wire);
- batches each group's custom domains into one get_translations RPC per
language (5s timeout), and degrades to empty strings on a down/closed/slow
channel so the cache-lock overlay never blocks the frontend.
router.async_unload_entry now invalidates a sandboxed entry's cached
translations, so a reload at a new integration-source ref re-pulls fresh
strings on the next fetch.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add a sandbox-agnostic seam to the translation cache, mirroring the
sandbox.sources source-resolver convention:
- async_register_sandbox_translation_provider(hass, provider): a HassKey-backed
registry with an unregister callback. The provider is awaited inside the
cache load and returns {language: {domain: raw_strings}} for only the domains
it owns.
- _TranslationCache._async_load overlays the provider result onto
translation_by_language_strings after async_get_integrations and before
_build_category_cache, so sandboxed strings flow through the same flatten /
English-fallback / loaded machinery as on-disk strings. A custom sandboxed
domain (IntegrationNotFound on main) thus stops resolving to {}.
- _TranslationCache.async_invalidate + async_invalidate_translations wrapper:
the first eviction API (translations were never unloaded), called by the
sandbox when a custom integration is re-fetched at a new ref.
Core never raises on a provider; degrade-to-empty is the provider's contract.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Register a sandbox/get_translations handler in SandboxRuntime. It loads raw
translation strings for the requested domains from the sandbox's own
filesystem (built-in from the bundled package, custom from the fetched
<config>/custom_components/<domain>) by reusing core's
_async_get_component_strings against the sandbox-private hass — which also
pre-fills 'title' from integration.name. Main cannot run that fallback for a
custom domain because it holds no Integration, so the title must be injected
here. Replies with {language, strings: {domain: raw dict}}.
Tests cover built-in title pass-through, custom title injection, the empty
case, the Struct packing, and the no-flow-runner guard.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add the sandbox/get_translations message pair to the control-channel proto
and regenerate the checked-in gencode for both no-cross-import mirrors.
Mirror MSG_GET_TRANSLATIONS in both protocol.py files and register the
message pair in both messages.py REGISTRY copies.
Request {language, domains[]}; result {language, strings: {domain: raw
strings.json dict}} — main batches a group's custom domains into one call;
built-in domains never cross the wire.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Brainstorm → plan for forwarding a sandboxed integration's translations
into main: live pull-RPC (Phase B) for running integrations + a catalog
provider (Phase A) for picker discoverability. Includes interview,
research notes, scratchpad, and the phased plan.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Rewrote the briefing so it never frames the brief as a file or mentions
the former tempfile handoff: compose it, pipe it straight into the session
(heredoc), claude-screen pastes it as one message. Dropped the file-pipe
example and the "no tempfile dance" aside.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Now that claude-screen pastes multi-line directly, the brief no longer
needs a tempfile + "Read /tmp/X" pointer. Step 1 reads "Compose the brief"
(source it from a heredoc or any scratch file) instead of "Write the
brief"; step 2 shows both heredoc and file pipes. The brief is just stdin,
not a handoff artifact.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Follow-up to the status/ reorg: the brief's STATUS path and the monitor
until-loop both point at sandbox/status/STATUS-<plan>.md.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Tidy the directory: the 29 per-phase + per-plan landing records
(STATUS-phase-*.md, STATUS-plan-*.md) move out of the sandbox/ root into
sandbox/status/ (git mv, blame preserved). Live current-state docs
(CLAUDE.md, README.md, OVERVIEW.md, FOLLOWUPS.md, architecture.html, the
docker-compose harness comment) now point at status/. Historical records
(the STATUS bodies themselves, plans/*.md, plan.md) keep their original
text by convention.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
claude-screen pastes multi-line prompts directly: bracketed-paste markers
keep embedded newlines literal, and the submit \r is sent as a separate
keystroke a beat later (the concatenated \r was what raced the paste and
submitted mid-prompt). Verified live — a 3-line prompt lands as one
message. Doc no longer mentions any file-handoff.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The single-line file-handoff is now built into claude-screen itself (it
detects a newline in the piped prompt, stashes the brief to a tempfile,
and pastes a pointer). So the doc just pipes the brief straight in; the
manual "write to /tmp + echo a single line" dance is gone.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Make explicit that each sub-session steps through its plan with the
phx:work skill (task-by-task with per-step compile/test verification),
not ad-hoc edits. Added as a brief hard rule + a why-this-shape bullet.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Documents the loop used to build this batch: write a brief to a tempfile,
spawn a fresh Claude in a screen window via single-line file-handoff, watch
for a STATUS marker, verify independently, push, kill the window. Captures
the gotchas that bit (single-line stdin, prompt-submit confirmation,
prefix-match window names, orchestrator-only push).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Drop the build-phase scaffolding from the test name; it just verifies ai_task
and image pin to ALWAYS_MAIN. -> test_ai_task_and_image_pin_to_main.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The one-shot full cross-sweep that produced the original backlog
(run_compat_full.py + categorize_failures.py + generate_backlog.py) and its
machine-generated outputs (COMPAT_FULL.md/.csv, COMPAT_LATEST.md, COMPAT.csv,
BACKLOG_FAILURES.json) were Phase-16 measurement scaffolding; the gate is long
cleared. Keep the single ongoing runner (run_compat.py) and the two curated
summaries (COMPAT.md, BACKLOG.md). Git-ignore the per-run machine output so it
stops being checked in. Living docs updated; recover the full-sweep tooling
from git history if a fresh tree-wide sweep is ever needed.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The registered-service forwarder (_build_service_forwarder._forward) rebuilt its
own pb.CallService request and duplicated the ChannelRemoteError/
ChannelClosedError translation that _raw_call_service already does. With the
batcher gone, _raw_call_service is the single low-level send helper — have
_forward call it and keep only its response-extraction logic. No behaviour
change (the channel-closed error message is now the shared generic one).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The spike (hass_client/spike/: bridge_a, bridge_b, rig, synthetic_light,
transport + tests/components/sandbox/test_spike.py) was a one-off bake-off to
choose between entity-bridge designs. Option B was chosen and shipped long ago;
nothing in production imports the spike, only its own test did. Delete it.
docs/entity-bridge-decision.md keeps the rationale and the measured numbers as
the decision record, with a note that the harness is recoverable from git
history.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Each proxy entity service call now forwards as its own single
`sandbox/call_service` RPC. The per-loop-tick coalescing batcher
(_CallServiceBatcher / _BatchBucket) added complexity the first iteration
doesn't need, so it is removed; async_call_service calls _raw_call_service
directly. Behaviour is unchanged except a multi-entity area call now pays one
RPC per entity instead of one coalesced RPC.
Coalescing same-tick calls is recorded as a future optimisation in
docs/FOLLOWUPS.md (with the 200-light perf benchmark that validated it). Living
docs updated; the phase-history records are left as-is.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A service call is never fire-and-forget: each batched caller awaits the
coalesced RPC's completion via its future, which resolves with the result or
the raised error, so every caller learns when its call finished. Batching only
shares the *wire* call, not the await; only a response *value* can't be
coalesced (hence the response bypass). Wording-only; no behaviour change.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The final deliverable should not carry the scaffolding of the phases it was
built in. Reword comments, docstrings, and generated-output strings that named
build phases (Phase N / T1-T3 / Phase A1-A2) to describe what the code does,
and rename the phase-numbered test files:
test_phase4_subprocess -> test_subprocess
test_phase9_shutdown -> test_shutdown
test_phase13_proxies -> test_domain_proxies
test_phase14 -> test_schema_and_unload
test_phase19_devices -> test_device_registry
Comments/docstrings/filenames only; no logic changes.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Replace a stray function-local `import json` (json.dumps behind a bogus
'keeps json off the integration boot path' noqa) with HA's orjson
json_bytes_sorted helper for the call-service batch key.
- A response-returning entity call now bypasses the per-tick batcher.
Coalescing forces every caller in a bucket to share one combined response,
which is wrong when a caller needs its own value; response calls go out as
their own single-entity RPC. The batcher is now fire-and-forget only, so its
dead return_response plumbing is dropped.
- Also removes development-phase references from this file's docstrings.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Rewrite docs/auth-scoping-decision.md to lead with the shipped design: the
sandbox holds no credential and cannot fabricate a Context; main restores
attribution from a TTL cache of contexts it issued and falls back to
user_id=None. The reverted, never-shipped scoped-token mechanism is kept as a
clearly-marked appendix for whenever the sandbox->main websocket lands. Update
the CLAUDE.md pointer to match.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Rename identifiers (SandboxV2Data->SandboxData, DATA_SANDBOX_V2->
DATA_SANDBOX, SandboxV2Error->SandboxError), env vars (SANDBOX_V2_*->
SANDBOX_*), and stale sandbox_v2/ paths in the current-state docs
(OVERVIEW, README, CLAUDE, plan, architecture.html, COMPAT*, docs/*),
and reword prose that named the current sandbox "v2". Historical
STATUS-phase-*/plans/* records are left intact as point-in-time history.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The rename sweep missed several identifiers, env vars, and the
pre-commit drift-guard hook (whose entry/files paths still pointed at
the non-existent sandbox_v2/ tree, leaving the hook broken). Rename:
- SandboxV2Data -> SandboxData, DATA_SANDBOX_V2 -> DATA_SANDBOX,
SandboxV2Error -> SandboxError (+ all references and tests)
- SANDBOX_V2_ERRORS_DIR/TRANSPORT/SOCKET_PATH -> SANDBOX_* env vars
- pre-commit hook id/entry/files: sandbox_v2/proto -> sandbox/proto
- stale sandbox_v2 paths and 'v2' wording in .dockerignore + scripts
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The reverted Phase-7 auth-scoping mechanism never shipped, so no
real auth store carries a legacy "scopes" key. Remove the defensive
pop in AuthStore (RefreshToken is built by explicit field mapping, so
unknown keys are ignored anyway) and its test. Reword the
ConfigEntry.sandbox load comment to state the real reason the key is
optional (non-sandboxed entries omit it) instead of referencing an
unreleased phase.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Review feedback on ARCHITECTURE.md:
- Goal (§1) now names storage alongside setup/flow/entities/services/
events, and adds a short statelessness line (storage routes to main,
code is fetched at startup → wipe-and-restart safe).
- Auth (§10) trimmed to describe the current design — no credential, no
user — instead of narrating the token's removal. The removal history
lives in the changelog where it belongs.
- Dropped the "(the Iron Law: never monkey-patch private internals)"
parenthetical in §11; the plain "declared public hook rather than a
reach into private internals" already carries the point.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Landing notes: how the context cache was seeded (forwarder + entity-call
path), the 15-min TTL bound, confirmation the token + system user are fully
gone (greps), test results, and doc updates.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Reconcile the architecture docs with the plan-auth-context landing:
- ARCHITECTURE.md §2/§5/§8/§10/§13 + changelog: auth.py removed from the
component table; the spawn command no longer carries --token; §8 documents
the implemented context restoration (TTL cache, own-id minting, ULID-trust
reasoning); §10 rewritten — no token, no system user, the future
Context-group-attribute note retained.
- OVERVIEW.md: auth comparison row, the spawn-command blocks, the
EventMirror context paragraph, the auth section (now "no credential" +
a Context-restoration subsection), and the file-pointer table.
- FOLLOWUPS.md: a plan-auth-context narrative entry; the open follow-ups now
describe the fresh-credential-when-WS-lands work and the Context group
attribute idea.
- auth-scoping-decision.md / CLAUDE.md: note the token + system user are now
also gone (already SUPERSEDED).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
plan-auth-context Parts A/B/C — a design-review follow-up. The sandbox is
not an authenticated principal inside main and must never be able to author
a Context.
Part A — drop the unused token. The manager minted a per-group system-user
access token and passed it on --token; the runtime stored it
(SandboxRuntime.token) and never used it (no connection back to main to
authenticate). Removed end-to-end: --token argv (manager._default_command),
the token_factory wiring, SandboxRuntime.token field/param + --token CLI
arg, and SANDBOX_TOKEN in the Docker entrypoint / compose / docs.
Part C — drop the per-group system user. auth.py is deleted entirely
(async_issue_sandbox_access_token + async_get_or_create_sandbox_user gone),
along with bridge._async_system_user_id / _system_user_id. A genuinely
sandbox-originated context is now user_id=None — the honest shape, since no
user authored it.
Part B — context-id restoration. The bridge now seeds a context_id→Context
cache at every main→sandbox call-down site: the service forwarder (_forward)
and the proxy entity's service call (async_call_service, which threads the
entity's live Context). _resolve_context returns a cached Context verbatim
for a known id (restoring the original parent_id / user_id), so a
user-initiated action's attribution survives the round-trip. An unknown or
expired id mints a brand-new Context(user_id=None) with main's own trusted
id — never the sandbox-supplied ULID, whose embedded timestamp main cannot
trust (recorder/logbook order by it); the sandbox string is a cache key
only. The cache is bounded by a 15-minute TTL (lazy front-pruning, plus a
sanity count backstop); a miss is always safe.
Tests: known-id restore end-to-end via the forwarder; unknown→fresh with no
adopted id; no-forgery (the wire proto has no parent_id/user_id field); TTL
expiry degrades to a fresh context; spawn argv no longer carries --token.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
User refinement 2026-06-03:
- Bound the context cache by TIME (15-min TTL), not size. Volume is tiny
(only main→sandbox service-call contexts, echoed back within seconds).
- For an unknown context_id, main must mint a BRAND-NEW Context with its
OWN id — never adopt the sandbox's id. context_ids are ULIDs with an
embedded timestamp and main cannot trust the sandbox's clock (a crafted
ULID could back/forward-date events; recorder/logbook order by it). The
sandbox-supplied id is only a cache key, never the resulting Context's
identity. This corrects T2's current Context(id=context_id, ...) for
unknown ids.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
User decision 2026-06-03: drop the sandbox system user entirely;
sandbox-originated contexts use user_id=None (no reason for the sandbox
to be a user right now). Future-work note recorded: a Context with a
group attribute is the better long-term answer for audit attribution,
but needs a core Context field change and waits until it's needed.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
From review feedback on the architecture doc:
1. register_service schema serialisation: broaden serialize_schema's
fallback from `except (ValueError, TypeError)` to any exception (with a
warning log). An exotic custom validator could raise other types and
propagate, dropping the whole service registration on main. Now it
always degrades to schema=None — main registers the service, the
sandbox validates. Added test_schema_bridge.py covering the broad path.
2. Clarify in ARCHITECTURE.md that MAIN alone decides the sandbox group:
the group comes from main's classify(), the proxy overwrites
create_result["sandbox"] with the main-determined value, and the wire
FlowResult has no group field. The sandbox can shape its own forms but
cannot influence storage/routing. (Code was already correct; the doc
wording was loose.)
3. Note the --token is unused (sandbox is not an authenticated principal
inside HA) and slated to drop; the per-group system user's only live
use is context attribution, under reconsideration.
4. Document the intended context model: wire carries context_id only;
main restores parent_id/user_id from a seen-id cache so the sandbox can
never fabricate attribution; unknown ids get a fresh no-parent context.
Points 3+4 captured as a buildable follow-up: plans/plan-auth-context.md.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
A self-contained, final-state architecture reference for the sandbox:
goal, components, routing, protobuf channel + pluggable transports,
lifecycle, config-flow forwarding, statelessness/integration-source,
entity/service/event bridging, store routing, auth, core-HA touch
surface, testing/Docker, and out-of-scope/future work. Changelog at the
bottom summarises the closing batch (contextvar, strip-auth-scopes,
fidelity, lockdown, transport, ephemeral-sources, docker, rename).
Distinct from OVERVIEW.md (source-linked depth) and architecture.html
(phase-by-phase historical artifact) — this is the current-state-only
narrative.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- whats-changed.md (batch tracker): added the `sandbox_v2` → `sandbox` rename
as a TL;DR bullet + a breaking-change entry (pre-release, wipe-and-restart).
- sandbox/CLAUDE.md: reworded the intro to current-state ("the sandbox rewrite,
formerly sandbox_v2"); clarified that the removed v1 previously occupied the
same paths the rewrite now lives at.
Historical records (STATUS-*.md, plans/*.md, docs/auth-scoping-decision.md)
keep their sandbox_v2 mentions intact — they document work done against those
paths.
The IGNORE_INTEGRATIONS_WITH_ERRORS = {"sandbox"} set tolerated v1's hassfest
violations while v2 stabilised. v1 is gone and the renamed integration (former
v2) is hassfest-clean, so the set + the two conditionals consulting it are
removed — keeping it would mask real errors in the renamed `sandbox`.
hassfest validate + generate: 0 invalid integrations, no generated-file changes
(sandbox has no config_flow → absent from config_flows.py; the NO_QUALITY_SCALE
entry was renamed by the Phase B sweep).
Phase B of plan-rename-sandbox. Mechanical identifier sweep + the structural
fixups the rename forced (tree compiles + both suites pass at the end):
- Bare-token sweep `sandbox_v2` → `sandbox` across all code + current-state
docs (excluding historical STATUS-*.md, plans/*.md, auth-scoping-decision.md
and the generated _pb2 gencode). Channel message strings, storage-key
namespace, client_id prefix, manifest domain, logger names all move.
- Prose sweep `Sandbox v2` → `Sandbox` (covers the `Sandbox v2: ` system-user
name prefix → `Sandbox: `).
- Protobuf: renamed sandbox_v2.proto → sandbox.proto (package `sandbox`) and
REGENERATED gencode (sandbox_v2_pb2 → sandbox_pb2) in both mirrors via the
isolated-venv recipe; removed the old _pb2 files. Drift guard clean.
- Name-collision fix forced by the rename: the client had both the impl module
`hass_client/sandbox.py` (exports SandboxRuntime) AND the `-m` launcher
subpackage `hass_client/sandbox_v2/`. Renaming the launcher to `sandbox`
collides with the impl module, so merged them — sandbox.py is now
`hass_client/sandbox/__init__.py` (parent-relative imports rewritten to
absolute `hass_client.*` per ruff TID252) with the launcher's __main__.py
kept. `python -m hass_client.sandbox` and
`from hass_client.sandbox import SandboxRuntime` both work.
- Docker entrypoint/compose/docs → `python -m hass_client.sandbox`.
- Client distribution renamed `hass-client-v2` → `hass-client` (import package
`hass_client` unchanged; matches the egg-info already installed).
Tests green: HA-side 201 passed, client 70 passed. prek clean on changed set.
Phase A of plan-rename-sandbox. Pure renames via git mv to preserve blame:
homeassistant/components/sandbox_v2 → homeassistant/components/sandbox
tests/components/sandbox_v2 → tests/components/sandbox
sandbox_v2 → sandbox
sandbox/hass_client/hass_client/sandbox_v2 → sandbox/hass_client/hass_client/sandbox
sandbox/proto/sandbox_v2.proto → sandbox/proto/sandbox.proto
The untracked tests/testing_config/.storage/sandbox_v2 dir is a runtime test
artifact (not tracked); left as-is. The tree does NOT import or pass tests
after this commit — Phase B sweeps every sandbox_v2 identifier + regenerates
the protobuf gencode.
Multi-stage python:3.14-slim image that runs the hass_client sandbox
runtime (python -m hass_client.sandbox_v2). Installs homeassistant +
hass_client into a venv; no pre-baked integration requirements (runtime
pip-installs them on demand), no git (codeload tarball fetch), non-root,
no volumes, no healthcheck, env-driven entrypoint via tini. Closes the
pip/egress runtime gap flagged by plan-ephemeral-sources: the container
is where pip + network egress live.
Transport caveat: unix socket (T3) today, websocket (T4) later — not a
remote-ready artifact. The docker-compose.test.yml captures the intended
same-host unix-socket harness but does not run against today's manager
(private mkdtemp socket path + spawn-not-attach model); both gaps are
documented, not hacked.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Docs sweep for the stateless-sandbox feature (d4b7aef732): protocol.py
integration_source field, OVERVIEW entry-lifecycle + statelessness
section, CLAUDE.md resolver-hook contract + sha-pin rule + pip/egress
follow-up, architecture.html fetch-before-setup, whats-changed box ticked.
The sub-session wrote these files but didn't land the second commit;
committing them here. STATUS flags two honest follow-ups: tree-vs-ref
verification trusts GitHub's content-addressed codeload URL rather than a
full git-tree-hash; async_process_requirements (pip for custom deps) is
unconfirmed in the bare-HA sandbox — pairs with plan-docker.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Make sandboxes hold no integration code: main attaches a typed
IntegrationSource to EntrySetup (builtin no-op, or a git source pinned to
an exact commit sha), and the sandbox fetches custom (HACS) code into
<config>/custom_components/<domain> before async_setup.
- proto: IntegrationSource sub-message + EntrySetup.integration_source (10);
both _pb2 mirrors regenerated.
- core sources.py: registered-resolver hook (async_register_sandbox_source_
resolver) keeping core HACS-agnostic; builtin short-circuit; a custom
domain with no resolver raises. Resolver pins ref to a sha (no core I/O).
- router: _entry_setup_payload resolves + sets integration_source.
- client sources.py: codeload-tarball fetch (injectable primitive),
process-lifetime (url, ref) cache, manifest.json verification; wired into
entry_runner before setup.
- tests: resolver registry + payload (HA side), tarball fetch/cache/verify
with local fixtures (client side). No network in any test.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
T3 (1eaa79d261) + T5 (42560c6cd0) landed. Unix socket transport
(opt-in via SandboxManager(transport="unix"); stdio remains default),
ws:// rejected with NotImplementedError, wire-protocol docs current.
191 + 62 tests green; drift guard clean. T1→T2→T3→T5 complete; T4 (WS)
out of scope. UnixSocketTransport is StreamTransport-over-unix-streams
(no new class). Socket lives in a short tempdir to dodge the ~108-char
sun_path limit; teardown force-closes accepted clients to avoid a
wait_closed() hang.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Bring the wire-protocol docs to the current protobuf + pluggable-transport
reality and tick the transport trackers.
* OVERVIEW.md / architecture.html: rewrite the transport row, the spawn
prose, and the channel section to describe the three-layer
Channel/Codec/Transport split, ProtobufCodec as the production wire,
the Ready-frame handshake (no stdout text marker), length-prefixed
framing, and stdio + unix transports (websocket reserved/future). Drop
the stale --url ws:// example and JSON-line wording.
* channel.py docstrings (both mirrors): ProtobufCodec is the production
codec; JsonCodec is the registry-free channel-core test/debug wire.
* protocol.py docstring: messages are typed protobuf (REGISTRY +
sandbox_v2.proto); the payload shapes listed are the logical contract.
* sandbox.py: SandboxRuntime docstrings note the --url-selected transport
(stdio default, unix opt-in, ws reserved).
* whats-changed.md: tick the protobuf-wire + typed-handlers boxes (T2
360e454330) and pluggable-transports box (T3 1eaa79d261).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add an opt-in unix-domain-socket control-channel transport alongside the
default stdio transport. The manager opens a listening unix socket, passes
its path to the subprocess as --url unix://<path>, and the runtime dials
back; the manager is the server. Both transports reuse StreamTransport's
length-prefixed framing, so no dedicated unix transport class is needed.
* Manager: SandboxManager(transport="stdio"|"unix") (default stdio,
unchanged behavior). _run_one splits into stdio/unix paths sharing a
_supervise_until_exit helper; the unix path creates the socket in a
short per-attempt tempdir (sidesteps the ~108-char sun_path limit),
races accept against early exit, and force-closes lingering accepted
connections (server.close_clients) so wait_closed cannot hang.
* CommandFactory is now (group, url) -> argv; the manager owns the
transport and hands the factory the right --url.
* Runtime: --url scheme selects the transport — stdio:// (default /
absent), unix://<path>, or ws://|wss:// (reserved, rejected with a
clear not-implemented error). New _transport_scheme + _open_unix_channel.
* Tests: unix round-trip + socket cleanup (core), scheme selection + ws
rejection + unix round-trip (client); existing factories updated to the
(group, url) signature.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
T2 landed as 360e454330 (64 files, +3762/-1046). Sub-session report:
- default production codec = ProtobufCodec; ~20 handlers + ~69 test
sites converted atomically; 189 + 53 tests green; prek + drift guard clean.
- 4 reasoned deviations (bare Channel ctor keeps JsonCodec with proto
built explicitly at production sites; JsonCodec stays registry-free for
channel-core tests; grpcio-tools out of project deps via throwaway venv;
sandbox-side context cache deferred until a consumer needs it).
- One gotcha fixed: a test stub returning a plain dict hung the router's
untimed channel.call under ProtobufCodec — relevant for T3/T5.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Atomic switch of the control channel from JSON dicts to typed protobuf
messages, completing transport T2 on top of T1's Transport/Codec seam.
- Codec owns the registry: each side builds a type -> (request_cls,
result_cls) map from its own _proto mirror and constructs
ProtobufCodec(registry). The concurrency-critical Channel core stays
fully codec-agnostic; response frames now carry `type` so the stateless
codec resolves the result class on both encode and decode.
- Proto refinements (locked 2026-06-03): EntityDescription wraps EntityInfo
(identity: Description + DeviceInfo) and InitialState (state +
capabilities + attributes); ServiceResponse is a typed envelope inside
CallServiceResult (proto3 optional, no has_response bool); StateChanged
is flattened and carries optional context_id; FireEvent carries optional
context_id. Dynamic fields cross as Struct/ListValue.
- Context security model: the sandbox only ever sends a context_id string;
parent_id / user_id never cross the wire. Main resolves the id to its own
authoritative Context via SandboxBridge._resolve_context — reusing a
cached Context or minting a fresh one attributed to the sandbox system
user with no parent_id — for state_changed, fire_event and call_service.
- Generated _pb2 mirrors checked into both no-cross-import trees; regen via
sandbox_v2/proto/generate.sh (isolated venv so the protobuf==6.32.0 pin is
never bumped). Drift guard wired as a manual-stage prek hook that degrades
gracefully when uv is absent.
- Default codec is protobuf (manager + runtime channel construction);
JsonCodec is retained registry-free as the test wire for the channel-core
tests. protobuf added to the client pyproject + the HA manifest
requirements; grpcio-tools stays out of the project venv by design.
- ~20 handlers converted to typed messages across bridge.py, entry_runner,
flow_runner, entity_bridge, service/event mirrors, sandbox_bridge and the
schema bridge; ~69 test call/push sites translated with no assertion
loosening (semantics shifts forced by proto presence are commented).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
User direction 2026-06-03 — capture before T2 launches.
Proto schema changes:
- Group fields the way HA organizes them. EntityDescription (wire) gets
an EntityInfo sub-message (HA's EntityDescription dataclass fields +
DeviceInfo) and an InitialState sub-message (initial state +
capabilities + initial attributes). Nested EntityInfo.Description
avoids the recursive-name clash.
- ServiceResponse is now a typed message (was Struct in the draft); the
dynamic payload sits inside it as a Struct field. CallServiceResult
drops the has_response boolean in favor of proto3 `optional`.
- StateChanged gains an optional context_id (was missing entirely).
Context discipline (security):
- parent_id and user_id are NEVER serialized on outbound messages from
sandbox. The wire carries context_id only.
- Sandbox keeps a local context_id -> Context cache main populates when
relevant (e.g. main pushing a state-changed for a context the sandbox
needs).
- Main resolves context_id to its authoritative Context at dispatch.
If no such Context exists, main mints one attributed to the sandbox's
system user (no parent_id) and registers it.
WebSocket transport is now flagged COMPLETELY OUT OF SCOPE for this
effort (was "deferred"). T1's Transport Protocol is shape-compatible
with a future WebSocketTransport, but no WS code/deps/auth surface
lands in this batch.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
T1's STATUS surfaced one design refinement to ratify before T2 coding:
the type -> (request_cls, result_cls) protobuf registry lives on the
codec, not on Channel.register. Ratified 2026-06-03.
The argument (from T1's sub-session): keeping the pairing in the codec
preserves the plan's stated safety property — the concurrency-critical
Channel core stays codec-agnostic. ProtobufCodec(registry) / JsonCodec(registry)
on each side; Channel.register signature unchanged.
For responses to be decodable without per-call state, the proto Frame
envelope carries `type` on response frames too (already a field; just
populate it).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
T1 (Transport/Codec seam + Ready frame) shipped green at 8389f7ad96.
T2 (protobuf wire + typed handlers) is an atomic big-bang — flipping the
default codec to protobuf and switching ~20 handlers to typed messages
forces ~69 wire-call test sites to convert in lockstep, so it cannot land
in safe green increments the way T1 was designed to. Rather than ram a
big-bang through and risk a broken tree (or silently weakened test
assertions during a 69-site rewrite), this STATUS hands off:
* the cleared codegen toolchain gate + verified recipe (isolated venv,
grpcio-tools 1.80.0, gencode min-runtime 6.31.1 ⊆ pinned 6.32.0)
* the resolved T2 design — including the response-typing solution the
plan left implicit (carry `type` on response frames) and a refinement
to keep the request/result class registry in the codec (Channel core
stays codec-agnostic) for the parent to approve
* the full T2 file/test work breakdown, and the small T3/T5 shapes
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Split the control channel into three layers so the wire format and the
byte transport can each be swapped without touching the concurrency-
critical dispatch core:
* Channel — dispatch core (pending map, inflight semaphore, register/
call/push/close); speaks Frame objects, never raw bytes.
* Codec (Protocol) + JsonCodec — Frame <-> bytes. JsonCodec is
line-compatible with the old wire shape.
* Transport (Protocol) + StreamTransport — whole frame blobs over a
reader/writer pair using a 4-byte big-endian length prefix (caps frame
size at 16 MiB and aborts the channel on overflow). Channel.from_transport
is the drop-in seam for a future WebSocketTransport.
Replaces the stdout text marker (sandbox_v2:ready) with a MSG_READY
*frame* sent as the channel's first message; the manager registers a
handler for it and flips to "running" on arrival, so stdout now carries
nothing but channel frames. Net behavior identical — still JSON, still
stdio — only the framing and handshake changed.
Both channel.py mirrors and protocol.py mirrors updated in lockstep.
Handshake/marker test assertions updated; added coverage for the
from_transport seam via an in-memory queue transport.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Tick the 6 batch boxes in whats-changed.md with their commit SHAs. Refresh
current-state docs the 6 changes affect: OVERVIEW (upsert + registry-event
resend, unique_id prefix, vol.Invalid rebuild, real selectors/sections),
README + architecture.html (--group -> --name run snippets). Add the batch
STATUS file. Historical records (STATUS-phase-*, interview, plan-v1-removal,
FOLLOWUPS) left intact.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Broad readers (template, group, homekit) and source-entity helpers
(min_max, statistics, trend, threshold, derivative, integration,
utility_meter, filter, mold_indicator, bayesian, generic_thermostat,
generic_hygrostat, switch_as_x, history_stats, proximity) read foreign
entities / registries a sandboxed integration can't see under lockdown,
so pin them to main. prometheus/alert are config_flow:false (YAML-only)
and already stay on main, so they're not added.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
reconstruct_schema now rebuilds real Selector and data_entry_flow.section
objects instead of collapsing them to a pass-through validator, so when the
flow manager re-serialises main's schema for the frontend it reproduces the
sandbox's original list verbatim (selectors keep their widget). The
serialize-side _has_data_schema fallback now logs the dropped schema's repr
at warning so the lossy path is visible.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Client EntityBridge now listens on EVENT_ENTITY_REGISTRY_UPDATED and
EVENT_DEVICE_REGISTRY_UPDATED and re-describes + re-sends MSG_REGISTER_ENTITY
for tracked entities, guarded by a description hash to avoid event storms.
Main's _handle_register_entity updates the existing proxy in place (refreshing
the mirrored _attr_* fields and the DeviceEntry) instead of adding a duplicate.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Proxies all register under the shared sandbox_v2 platform_name, so the
entity-registry uniqueness key (domain, "sandbox_v2", unique_id) collided
when two integrations in one group reused a unique_id. Namespace the proxy
unique_id as f"{source_domain}:{unique_id}". None unique_ids stay None.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Carry a structured error_data field on the error frame for vol.Invalid /
vol.MultipleInvalid so main rebuilds the real exception with its path
intact instead of flattening to TypeError. Falls back to the class-name
mapping when error_data is absent (older/edge frames).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Tick the RefreshToken.scopes-removed breaking-change checkbox with the
code-commit SHA (5141f96ebe) and add the landing notes.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Phase 7's RefreshToken.scopes + websocket-dispatcher enforcement was
built for a sandbox->main websocket that never shipped, so no code path
ever exercised the scope check end-to-end. Revert the whole mechanism
from core HA and keep the sandbox on a plain system-user token.
Phase A (core revert, lockstep):
- auth/models.py: delete the RefreshToken.scopes field.
- auth/__init__.py + auth/auth_store.py: delete the scopes= parameter
and the on-disk serialize/deserialize of the scopes key. The load
path now pops a legacy scopes key silently (option A: no migration,
no storage-version bump) so pre-existing scoped tokens load fine.
- websocket_api/connection.py: delete self.scopes, the _scope_allows
helper, and the async_handle enforcement branch.
Phase B (sandbox helper):
- sandbox_v2/auth.py: delete SANDBOX_TOKEN_SCOPES; identify the refresh
token by the one-token-per-system-user invariant instead of matching
a scope set. System-user token type is unchanged.
Tests:
- Delete tests/components/websocket_api/test_scopes.py.
- Delete the scoped-token round-trip cases from tests/auth/test_init.py.
- Add a regression test that an on-disk token with a legacy scopes key
loads without error and drops the field.
- Update sandbox_v2 test_auth assertions to the plain-token contract.
Phase C (docs): mark auth-scoping-decision.md SUPERSEDED; drop the
auth row from the core-HA-modified lists in CLAUDE.md / architecture
.html; rewrite the scoped-auth sections in OVERVIEW.md and
architecture.html; add a re-introduce follow-up in FOLLOWUPS.md.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A2 landed (commit 4c85363668). Mark the monkey-patch-removed item done
in the batch landing tracker and check in the sub-session's STATUS report.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Phase A2 of plan-sandbox-context: remove the module-level `Store`
rebinding now that the `current_sandbox` contextvar (A1) is the single
source of truth for sandbox Store IO.
Load-bearing correctness fix (surfaced by A1's STATUS): the contextvar
save branch moves DOWN from `Store.async_save` to
`Store._async_write_data`. `async_delay_save` and the FINAL_WRITE flush
bypass `async_save` entirely — they funnel through
`_async_handle_write_data` -> `_async_write_data`. While `RemoteStore`
existed it overrode `_async_write_data` and masked this; deleting it
would have silently routed delayed/final-write saves to the sandbox
tempdir. Branching at `_async_write_data` covers async_save,
async_delay_save, and FINAL_WRITE uniformly. The redundant `async_save`
branch is removed.
Deletions:
- `hass_client/remote_store.py` (the subclass + installer)
- `hass_client/tests/test_remote_store.py` (covered by the contextvar
tests + the new delayed-save regression test)
- the `install_remote_store` call/teardown in `SandboxRuntime.run`
- the explicit `data.store` swap in `_load_restore_state` (the
contextvar reaches the import-captured `Store` reference)
New regression test `test_delayed_save_flushes_through_bridge` asserts
`async_delay_save` + EVENT_HOMEASSISTANT_FINAL_WRITE route through the
bridge. Docs (CLAUDE.md, OVERVIEW.md, FOLLOWUPS.md, architecture.html)
rewritten around the contextvar.
Tests: 190 core (sandbox_v2 + storage + restore_state) + 50 client all
green; prek clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Three new plans queued ahead of fidelity/transport/ephemeral/docker:
- plan-sandbox-context: replace install_remote_store monkey-patch with a
current_sandbox ContextVar in homeassistant/helpers/, set by the runtime
before warm-load. Same primitive will later carry cross-sandbox IR/RF
calls. Refined via phx:plan; Q1/Q2/Q3 locked (defer IR/RF, A1+A2 split,
docstring+assertion guard).
- plan-strip-auth-scopes: revert Phase 7's RefreshToken.scopes mechanism
from core HA. No consumer shipped; on-disk scopes key dropped silently
on load. Re-introduces when the sandbox->main WS transport lands.
- plan-rename-sandbox (last): rename sandbox_v2 -> sandbox once v1 is fully
gone, including hassfest IGNORE cleanup.
Decisions locked 2026-06-03:
- builtin lockdown: (a) blanket ALWAYS_MAIN for Category A+B helpers.
- ephemeral-sources resolver: (c) generic resolver hook.
STATUS-plan-sandbox-context-A1.md added (sub-session report). The report
surfaced a correctness prerequisite for A2: async_delay_save and the
FINAL_WRITE flush bypass async_save and go through _async_write_data
directly. A2 must therefore move the contextvar save branch down to
_async_write_data before deleting RemoteStore, or delayed saves would
silently land in the sandbox tempdir. The plan's A2 section now spells
this out.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Add a `current_sandbox` ContextVar in core HA (homeassistant/helpers/
sandbox_context.py) that `Store.async_load/save/remove` read at call
time to route storage IO to main, replacing the module-level
`Store` rebinding done by `install_remote_store`. Reading the
contextvar inside each IO method is a single source of truth
regardless of how `Store` was imported, so it reaches the helpers
that captured the original `Store` at module load (restore_state,
the registries) — which the rebinding never could.
This is the additive half (Phase A1): the contextvar branch is added
alongside the existing `install_remote_store`, both paths active. The
contextvar branch is the first line of each IO method, so it serves
the IO; `RemoteStore` + the `_load_restore_state` workaround stay until
A2 deletes them once A1 bakes on dev.
- helpers/sandbox_context.py: `current_sandbox` ContextVar + the
`SandboxBridge` Protocol (store methods only; IR/RF deferred).
- helpers/storage.py: `_async_load_data` fetches the wrapped envelope
via the bridge when the contextvar is set (migration block unchanged
— design choice B); `async_save`/`async_remove` early-return through
the bridge.
- hass_client/sandbox_bridge.py: `ChannelSandboxBridge` implementing the
three store methods over MSG_STORE_LOAD/SAVE/REMOVE (bodies lifted from
RemoteStore, incl. the orjson preserialise on save).
- hass_client/sandbox.py: build the bridge and `current_sandbox.set`
before warm-load + handler registration; assert it was unset first
(Risk #3); reset the token on teardown.
- hass_client/tests/test_sandbox_bridge.py: the five Phase A1 tests plus
a direct ChannelSandboxBridge wire-mapping test.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The numeric compat gate (Phase 17: 99.67% full sweep, 99.97% v1 baseline)
is met. Removing v1 ahead of the "v2 shipped a stable release" condition,
relying on git history for rollback.
Deletes homeassistant/components/sandbox, tests/components/sandbox, and the
top-level sandbox/ dev dir; regenerates config_flows.py (drops the v1
"sandbox" entry); updates current-state v2 docs (historical STATUS-phase-*
records left intact).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 7 introduced `SharingConfig` (`share_states`,
`share_entity_registry`, `share_areas`) on the runtime + the matching
`SandboxGroupConfig` on the manager + `--share-*` CLI flags +
`DEFAULT_GROUP_CONFIGS` defaults, intended for a future subscription
consumer that observes main's state stream. The consumer never landed.
~40 LOC of dead surface across five files plus an entire test module
(`test_sharing_config.py`, 7 tests). Carrying unwired flags risks
readers assuming functionality that isn't there — Phase 16's failure
categoriser had to specifically call this out.
Removed:
- `SharingConfig` + `sharing=` constructor param + `__all__` entry
(`sandbox_v2/hass_client/hass_client/sandbox.py`).
- `--share-states`/`--share-entity-registry`/`--share-areas` argparser
entries (`__main__.py`).
- `SandboxGroupConfig`, `DEFAULT_GROUP_CONFIGS`, `group_config()`
accessor, and `--share-*` argv expansion in `_default_command`
(`homeassistant/components/sandbox_v2/manager.py`).
- `sharing=` parameter on the in-process plugin.
- `test_sharing_config.py` (whole file).
- `test_manager.py` group_config tests.
- Sharing assertions in `test_sandbox_runtime.py`.
Replaced with `sandbox_v2/docs/design-share-states.md` — the contract
for the future consumer: goal, entity_id alignment constraint
(sandbox-side automations referencing `light.kitchen` must see main's
actual entity_id, not whatever the sandbox's local EntityRegistry
would have generated), `share/subscribe_*` mechanism sketch, per-
sandbox allow-list filtering on main, and the open questions
(direction, read-only semantics, device/area mirroring as P19
follow-on, fan-out perf).
`OVERVIEW.md`, `CLAUDE.md`, `docs/FOLLOWUPS.md`, and
`generate_backlog.py`'s `dependencies-not-shared` description all
repoint at the new design doc.
No core HA files touched. 140 + 47 tests passing (hass_client drops
the 7 sharing-config tests; HA-side drops 2 group_config tests).
plan.md updated with Phase 18/19/20 phase blocks + ✅ ticks.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sandboxed entities that carry `device_info` now produce matching
`DeviceEntry` rows in main's `device_registry`, linked to the
sandboxed `config_entry_id`. Area assignment now propagates through
HA's standard device → entity inheritance path (Phase 5's entity
bridge alone left the entity registered without a device_id, so the
device_registry was empty for sandboxed integrations).
Sandbox side (`hass_client/entity_bridge.py`):
- `_serialise_device_info` flattens `DeviceInfo`'s TypedDict shapes
into JSON-safe lists/strings (identifiers/connections as lists of
two-element lists, via_device as list, entry_type as `StrEnum.value`,
configuration_url as string).
- `_describe_entity` appends a `device_info` key to the wire payload
when the entity exposes one.
Main side (`homeassistant/components/sandbox_v2/`):
- `SandboxEntityDescription` gains `device_info` / `device_id` fields.
- `from_payload` runs `_deserialise_device_info` to rebuild typed shapes.
- `_handle_register_entity` pre-creates the `DeviceEntry` via
`dr.async_get_or_create(config_entry_id=description.entry_id,
**device_info)`, pins the returned `device.id` on the description.
- Proxy base sets `_attr_device_info` so `EntityPlatform.async_add_entities`
reuses the same `DeviceEntry` (idempotent on identifiers/connections)
and wires `entity.device_entry`. No per-domain proxy edit needed —
all 32 inherit from the base.
No new core HA changes (`device_registry.async_get_or_create` is
already public).
Tests:
- `tests/components/sandbox_v2/test_phase19_devices.py` — six end-to-
end cases (DeviceEntry creation + entry-id linkage, proxy device_id
propagation, backwards-compat with payloads omitting device_info,
area assignment surfacing, invalid device_info rejection, payload
round-trip).
- `sandbox_v2/hass_client/tests/test_entity_bridge.py` — three new
cases.
140 + 54 tests passing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ESPHome serial / BLE proxy (and Broadlink-style IR/RF) are coupled
in-process today: setup-time enumeration + send-calls happen via
Python calls/events the bridge doesn't cross. Pure-built-in pairs are
fine (same `built-in` sandbox group); a built-in producer paired with
a custom-integration consumer would split across `built-in`/`custom`
and break.
Captured the constraint + two fix shapes (classifier "co-locate with
X" hint vs extending Phase 6's event mirror beyond `<owned_domain>_*`)
in the three places that track open follow-ups:
- `sandbox_v2/CLAUDE.md` — Open follow-ups
- `sandbox_v2/docs/FOLLOWUPS.md` — Still open
- `sandbox_v2/OVERVIEW.md` — Where the design is still open
IR/RF is the simpler case (one-way command flow, no bidirectional
stream or enumeration) but still needs dedicated cross-sandbox routing
to land the consumer's send-call on the producer.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The client-library side of sandbox v2, plus the full architecture +
phase-by-phase narrative + per-failure compat tooling.
`sandbox_v2/hass_client/` is a separate uv-managed Python package that
the HA-core sandbox_v2 integration spawns as a subprocess per sandbox
group. It hosts a private `HomeAssistant`, drives each sandboxed
integration's `ConfigFlow` and `async_setup_entry`, mirrors entity /
service / event registrations back to main over a stdio JSON-line
`Channel`, and routes Store reads/writes through main via `RemoteStore`.
`sandbox_v2/docs/`:
- `entity-bridge-decision.md` — Phase 1 spike: why Option B
(action-call forwarding via `sandbox_v2/call_service`).
- `auth-scoping-decision.md` — Phase 7: why `RefreshToken.scopes` is
a generic primitive (vs a sandbox-private subclass).
- `FOLLOWUPS.md` — narrative of Phases 12–17 (concurrent dispatcher,
28-domain proxy fill-in, flow-schema bridge, baseline compat sweep,
cross-integration BACKLOG generation, `ConfigEntry.sandbox` field).
Compat sweep tooling:
- `run_compat.py` — Phase 15: v1's 37-integration baseline runner;
output to `COMPAT.md` (curated) + `COMPAT.csv`.
- `run_compat_full.py` — Phase 16: 807-integration cross-sweep at
asyncio concurrency=6 (~12 min wall); output to `COMPAT_FULL.md`
+ `COMPAT_FULL.csv`.
- `categorize_failures.py` — regex-rule failure categoriser feeding
`BACKLOG.md` + `BACKLOG_FAILURES.json`.
- `generate_backlog.py` — auto-draft skeleton for BACKLOG.md.
Headline result (after Phase 17): 99.67% test-level pass rate across
807 integrations; baseline 99.97%. Both clear the 99.5% v1-removal
threshold.
`sandbox_v2/STATUS-phase-{3..18}.md` are the authoritative landing
notes for each phase — every "Things to flag" surfaced is in there.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The HA-core side of the sandbox v2 rewrite: routing, lifecycle, flow
forwarding, entity bridging, service/event mirroring, scoped auth,
opt-in data sharing, Store routing, graceful shutdown.
Lives at `homeassistant/components/sandbox_v2/`. Designed alongside the
client library at `sandbox_v2/`; see `sandbox_v2/OVERVIEW.md` for the
full architecture and `sandbox_v2/docs/FOLLOWUPS.md` for the phase-by-
phase narrative.
Built on the core hooks added in the preceding commits:
`ConfigEntries.router` + `ConfigEntry.sandbox` + `RefreshToken.scopes`
+ `EntityComponent.async_register_remote_platform`.
32 domain proxy classes under `entity/` cover every entity domain v2
supports. Bridge translates each proxy method into a
`sandbox_v2/call_service` RPC via a per-loop-tick batcher (coalesces
multi-entity area calls into single RPCs).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds an `IGNORE_INTEGRATIONS_WITH_ERRORS` set to hassfest's main loop
so v1 sandbox's pre-existing hassfest gates (CONFIG_SCHEMA, manifest
version, missing services.yaml, mypy signature drift in entity proxies)
don't block validation of the rest of the tree. v1 is being superseded
by sandbox_v2 (see `sandbox_v2/OVERVIEW.md`) — accepting v1's existing
state for now is preferable to either fixing every gate in code that
will be removed, or skipping hooks.
Also adds `sandbox_v2` to `NO_QUALITY_SCALE` (internal integration)
and ships an empty `sandbox_v2/services.yaml` placeholder — `bridge.py`
calls `hass.services.async_register` dynamically per sandboxed
integration; those services are owned by the sandboxed integrations.
`homeassistant/generated/config_flows.py` is regenerated to include
`sandbox` (v1 had drifted out of the registry).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three small additive surfaces that the sandbox_v2 integration plugs
into. Each is additive and a no-op when nothing registers against it.
config_entries.py:
- `ConfigEntries.router: ConfigEntryRouter | None` attribute + the
`ConfigEntryRouter` Protocol. Consulted from three sites:
`ConfigEntriesFlowManager.async_create_flow`, `ConfigEntries.async_setup`,
and `ConfigEntries.async_unload`. Returning `None` falls through to
the existing path.
- `ConfigEntry.sandbox: str | None` optional field. Carries the routing
tag without polluting `entry.data`. Persisted via `as_dict` /
`as_storage_fragment` only when non-None; read via `dict.get` so
pre-existing stored entries load with `sandbox=None`. Mutable via
`ConfigEntries.async_update_entry(entry, sandbox=)`. `ConfigFlowResult`
gains a `sandbox` TypedDict key the framework reads at entry
construction (same plumbing shape as `minor_version` / `options` /
`subentries`).
entity_component.py:
- `EntityComponent.async_register_remote_platform(config_entry, platform)`
lets sandbox_v2 attach a pre-built remote `EntityPlatform` without
re-discovering the local integration. Mirrors `async_setup_entry`'s
`_platforms[entry_id] = platform` assignment as a public hook.
Tests:
- `MockConfigEntry` picks up a `sandbox=` kwarg threaded through to
`ConfigEntry.__init__`.
- Six new `test_config_entries.py` cases for the `sandbox` field:
default-none + omitted-from-storage, persisted-when-set, round-trip,
absent-from-storage-loads-as-none, async_update_entry-sets-sandbox,
cannot-be-set-directly.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds an optional `scopes: frozenset[str] | None` attribute to
`RefreshToken` and threads it through `AuthManager.async_create_refresh_token`
and `AuthStore` (sorted list on disk, optional on read — no version bump).
`ActiveConnection` reads scopes off the connecting token and a new
`_scope_allows` helper in the websocket dispatcher rejects out-of-scope
commands with `ERR_UNAUTHORIZED`. Existing unscoped tokens (`scopes is
None`) are unaffected — the gate is a no-op for them.
This is the primitive the sandbox_v2 integration uses to issue
namespace-scoped tokens (`{"sandbox_v2/", "auth/current_user"}`) to
sandbox subprocesses, so a sandbox-resident integration cannot escalate
to the rest of the websocket API.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After 'uv pip install -r requirements_ha.txt' (which pulls in
requirements_all.txt), the integrations previously listed as
'Not Tested (missing dependencies)' import and run:
- rest: 10/10 pass (needed xmltodict)
- logbook: 55/55 pass (needed sqlalchemy + numpy + turbojpeg)
- command_line: 7/7 pass
- trend: 9/9 pass
Promote them into the main pass table; the totals now read 35 of 37
fully pass, 955/957 tests (99.8%).
conversation imports too (hassil was already in pyproject.toml deps
but the report listed it as missing) but 8 of 21 tests fail and the
run deadlocks at tests 20-21 — moved into a new 'Newly runnable, still
investigating' section instead of the pass table.
Add a Setup section pointing at requirements_ha.txt and the pyitachip2ir
macOS caveat.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The sandbox client's pyproject.toml only carries the minimal set of
packages needed to run the client library and its own tests. Running
HA Core's per-integration test suites through the sandbox plugin needs
the full integration dependency tree (hassil for conversation,
xmltodict for rest, sqlalchemy+numpy+turbojpeg for logbook, …).
requirements_ha.txt pulls in ../../requirements_all.txt and
../../requirements_test.txt with paths relative to the file, so it
keeps working from any cwd. Comment notes the macOS pyitachip2ir
build caveat and the workaround.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The shell version required a manually-prepared
/tmp/all_integrations.txt and used a perl-based timeout shim.
run_all_sandbox_tests.py auto-discovers integrations from the core
tests directory and uses subprocess timeouts, so the .sh is no longer
needed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
architecture.html already covers system diagrams, flow diagrams, file
structure, websocket API, key classes, and test results, so the prose
deep-dive in ARCHITECTURE.md was largely overlapping. Keep the bits
that weren't already in OVERVIEW.md and drop the rest:
- Startup sequence (host startup, sandbox process startup, host/sandbox
entity platform setup) as a new section after High-Level Flow.
- The RemoteLightEntity worked example plus the static/dynamic property
caching rationale, inside Entity Platform Architecture.
- Entity Method Compatibility (which domains already expose async
wrappers; the cover.toggle gap).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pointers to OVERVIEW.md, ARCHITECTURE.md, architecture.html, the
test driver scripts, and SANDBOX_COMPAT.md; quick-start for running
the sandbox client and the core test suites through it.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Move ARCHITECTURE.md, OVERVIEW.md, CLAUDE.md, the architecture HTML,
the test-runner scripts and TEST_RESULTS.csv into this directory next
to the hass_client subtree, so the entire sandbox project lives on the
sandbox branch of core (only the HA integration at
homeassistant/components/sandbox/ stays put for HA's loader).
Adjust the relative paths the moved files used to point at the old
sibling checkouts:
- hass_client/pyproject.toml: uv source homeassistant -> ../..
- run_all_sandbox_tests.{py,sh}: cd into ./hass_client and walk to
../../tests/components/ for the core test suites
- analyze_failures.py: write TEST_RESULTS.csv next to the script
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Restructure the sandbox websocket API around three commands instead of
a single event subscription: sandbox/register_service registers a
proxy service on the host that forwards calls into the sandbox,
sandbox/call_service lets the sandbox invoke a host service while
preserving its context, and sandbox/service_call_result returns the
sandbox's response back to the originating host caller.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
These 32 files (light.py, sensor.py, etc.) each only registered an
async_add_entities callback. Now that RemoteHostEntityPlatform adds
proxy entities directly to the EntityComponent, they are dead code.
Also removes the unused register_platform_callback and
AddEntitiesCallback from SandboxEntityManager.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace the async_forward_entry_setups + per-domain platform file
approach with RemoteHostEntityPlatform. This EntityPlatform subclass
is added directly to the domain's EntityComponent and manages proxy
entities without needing 32 identical platform files.
The platform is created on-demand when the first entity for a domain
is registered by the sandbox.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Config entries can now set options["sandbox"] = "group_name" to be
assigned to a named sandbox group. Entries sharing the same group
string run in the same sandbox process. The sandbox config entry
discovers group members via entry.data["group"].
The explicit entries list (entry.data["entries"]) still works for
test infrastructure compatibility.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Split the monolithic entity.py (1900 lines) into a per-platform
package structure under entity/. Each domain gets its own file,
making the codebase easier to navigate and extend.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Brings total supported platforms to 32. Device tracker supports
both TrackerEntity (GPS) and ScannerEntity (router/BLE).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implements sandbox proxy entities for: alarm_control_panel, button,
calendar, climate, cover, date, datetime, fan, humidifier, lawn_mower,
lock, media_player, notify, number, remote, select, siren, text, time,
update, vacuum, valve, water_heater, weather.
Total supported platforms: 30 (up from 6).
Each proxy class caches state from sandbox pushes and forwards service
calls back to the sandbox via the existing websocket command channel.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds SandboxBinarySensorEntity, SandboxSensorEntity, SandboxSwitchEntity,
SandboxSceneEntity, and SandboxEventEntity proxy classes. Also adds
device_class and state_class to entity registration.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implements the sandbox integration that manages config entries running
in isolated processes. Proxy entities on the host forward service calls
to sandbox processes via websocket and cache state pushed back.
Supports entity, device, and area targeting for service calls.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Move HybridServiceRegistry out of runtime.py into its own
sandbox_service_registry.py module, expand the websocket API error
translator to handle ServiceNotSupported and sandbox/call_service, and
extend conftest_sandbox with additional fixtures.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
These 32 files (light.py, sensor.py, etc.) each only registered an
async_add_entities callback. Now that RemoteHostEntityPlatform adds
proxy entities directly to the EntityComponent, they are dead code.
Also removes the unused register_platform_callback from
SandboxEntityManager.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
New class that wraps an EntityPlatform on the sandbox side to intercept
async_add_entities calls. When an integration adds entities, they are:
1. Added locally as normal
2. Registered with the host via sandbox/register_entity
3. State changes forwarded to the host
4. Method calls from the host dispatched to local entities
This replaces the post-setup iteration approach in SandboxEntityBridge
with a clean intercept at the async_add_entities boundary.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Three fixes:
- Stop host HA explicitly after tests to cancel lingering timers that
caused verify_cleanup teardown errors (scene, todo, etc.)
- Guard HybridServiceRegistry remote fallback: only try remote for
services that exist in the remote cache, preventing wrong
ServiceNotFound errors in nested service calls
- Remove manual INSTANCES.remove; let async_stop handle cleanup
31 of 33 integrations fully pass (878/880 tests, 99.8%).
The 2 remaining failures are pre-existing logbook platform issues.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Tests using pytest-freezer's `freezer.move_to()` hang when a live
websocket is active because time jumps break async heartbeat timers.
Detect the freezer fixture in pytest_runtest_setup and fall back to
the base plugin (no websocket) for those tests.
All 9 input helper integrations now pass (189/189 tests).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
New pytest plugin (hass_client.testing.conftest_sandbox) that boots a host
HA Core with websocket_api + sandbox integration, creates a sandbox auth
token, and connects a RemoteHomeAssistant to it via a live websocket. This
allows running the full HA Core input_boolean test suite (16/16 tests)
through a real sandbox round-trip.
Key pieces:
- conftest_sandbox.py: pytest plugin that patches async_test_home_assistant
to create host + sandbox HA instances with real TCP websocket
- conftest.py: adds core/tests to sys.path for test infrastructure imports
- pyproject.toml: point homeassistant dep at local core checkout, add test deps
Usage: pytest -p hass_client.testing.conftest_sandbox \
../core/tests/components/input_boolean/test_init.py
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
SandboxClient connects to HA Core via a sandbox token, fetches assigned
config entries, sets up input helper integrations locally, registers
entities back to the host, pushes state changes, and subscribes to
service call forwarding.
Three e2e tests validate: token/instance creation, state updates, and
unload cleanup.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add sandbox API methods to HomeAssistantAPI for communicating with HA Core's
sandbox integration: get_entries, update_entry, register/update/remove device,
register/update/remove entity, update_state, and subscribe_service_calls.
Override __new__ on RemoteHomeAssistant to accept extra keyword arguments,
since HomeAssistant.__new__ has a strict (config_dir: str) signature that
rejects the remote_config kwarg in Python 3.14.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@@ -24,6 +24,7 @@ The following platforms have extra guidelines:
## Entity platforms
- Ensure `async_added_to_hass()` and `async_will_remove_from_hass()` have symmetrical behavior. For example, if a subscription is created in `async_added_to_hass()`, it should be unsubscribed in `async_will_remove_from_hass()`. Also, if something is torn down in `async_will_remove_from_hass()`, it should be set up in `async_added_to_hass()`.
- Entity base class (e.g. `SensorEntity`, `TrackerEntity`) provide a stable API for child classes to inherit from. Do not suggest redeclaring or duplicating attributes, properties, or methods the base class already provides, and do not add guards against the parent's behavior changing — rely on the base class instead.
description: Reviews code changes and provides constructive feedback. Should be used when a review is requested to provide a consistent review behavior and output format. This skill can be used for code reviews in general, not just for GitHub pull requests.
---
# Review Code Changes
## Analyze the code changes for:
- Code quality and style consistency
- Potential bugs or issues
- Performance implications
- Security concerns
- Test coverage
- Documentation updates if needed
## Verification:
- After the review, run parallel subagents for each finding to double-check it.
- Spawn up to a maximum of 10 parallel subagents at a time.
- Gather the results from the subagents and summarize them in the final review comments.
## IMPORTANT:
- Just review. DO NOT make any changes.
- Be constructive and specific in your comments.
- Suggest improvements where appropriate.
- No need to run tests or linters, just review the code changes.
- No need to highlight things that are already good.
## Output format:
- List specific comments for each file/line that needs attention.
- In the end, summarize with an overall assessment (approve, request changes, or comment) and bullet point list of changes suggested, if any.
@@ -43,6 +43,7 @@ This repository contains the core of Home Assistant, a Python 3 based home autom
- Avoid using conditions/branching in tests. Instead, either split tests or adjust the test parametrization to cover all cases without branching.
- If multiple tests share most of their code, use `pytest.mark.parametrize` to merge them into a single parameterized test instead of duplicating the body. Use `pytest.param` with an `id` parameter to name the test cases clearly.
- We use Syrupy for snapshot testing. Leverage `.ambr` snapshots instead of repetitive and exhaustive generation of test data within Python code itself.
- Hardcoded `entity_id`s in tests are fine. If the same one is repeated, use a constant.
@@ -27,6 +27,7 @@ The following platforms have extra guidelines:
## Entity platforms
- Ensure `async_added_to_hass()` and `async_will_remove_from_hass()` have symmetrical behavior. For example, if a subscription is created in `async_added_to_hass()`, it should be unsubscribed in `async_will_remove_from_hass()`. Also, if something is torn down in `async_will_remove_from_hass()`, it should be set up in `async_added_to_hass()`.
- Entity base class (e.g. `SensorEntity`, `TrackerEntity`) provide a stable API for child classes to inherit from. Do not suggest redeclaring or duplicating attributes, properties, or methods the base class already provides, and do not add guards against the parent's behavior changing — rely on the base class instead.
@@ -33,6 +33,7 @@ This repository contains the core of Home Assistant, a Python 3 based home autom
- Avoid using conditions/branching in tests. Instead, either split tests or adjust the test parametrization to cover all cases without branching.
- If multiple tests share most of their code, use `pytest.mark.parametrize` to merge them into a single parameterized test instead of duplicating the body. Use `pytest.param` with an `id` parameter to name the test cases clearly.
- We use Syrupy for snapshot testing. Leverage `.ambr` snapshots instead of repetitive and exhaustive generation of test data within Python code itself.
- Hardcoded `entity_id`s in tests are fine. If the same one is repeated, use a constant.
"message":"Supervisor was not ready during setup, will retry"
}
},
"preview_features":{
"snapshots":{
"description":"We're creating the [Open Home Foundation Device Database](https://www.home-assistant.io/blog/2026/02/02/about-device-database/): a free, open source community-powered resource to help users find practical information about how smart home devices perform in real installations.\n\nYou can help us build it by opting in to share anonymized data about your devices. This data will only ever include device-specific details (like model or manufacturer) – never personally identifying information (like the names you assign).\n\nFind out how we process your data (should you choose to contribute) in our [Data Use Statement](https://www.openhomefoundation.org/device-database-data-use-statement).",
"description":"The scanner entity `{entity_id}` is associated with the zone `{zone}`, but that zone has been removed.\n\nTo fix this, reconfigure the scanner to use a different zone or recreate the missing zone.",
"title":"Scanner is associated with a removed zone"
}
},
"services":{
"see":{
"description":"Manually update the records of a seen legacy device tracker in the known_devices.yaml file.",
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.