[Bug Report] Compatibility mode legacy hook aliases do not preserve HookedTransformer hook semantics #1317

`TransformerBridge.enable_compatibility_mode()` appears to expose some legacy `HookedTransformer` hook names without preserving the same computation points.

In particular, the legacy aliases for attention input hooks seem to be semantically different from the legacy `HookedTransformer` hooks:

- `blocks.{layer}.hook_q_input`
- `blocks.{layer}.hook_k_input`
- `blocks.{layer}.hook_v_input`

My understanding is that in legacy `HookedTransformer`, these correspond to pre-LN residual-stream forks, while in current `TransformerBridge` compatibility mode they fire on post-LN tensors instead.
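To illustrate why the hook point matters even when names and shapes agree, here is a minimal pure-Python sketch. It is a toy model, not TransformerLens code: `layer_norm`, `attn_input`, and the two hook arguments are hypothetical names, and the LayerNorm has no learned scale or bias. It applies the same intervention (adding a constant) at the pre-LN and the post-LN point.

```python
import math


def layer_norm(x, eps=1e-5):
    """LayerNorm over one vector, without learned scale/bias (toy assumption)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def attn_input(resid, pre_ln_hook=None, post_ln_hook=None):
    """Toy path from the residual stream to what the Q/K/V projections consume.

    pre_ln_hook  : fires on the residual-stream fork (legacy semantics, as I understand them)
    post_ln_hook : fires on the normalized tensor (what the bridge alias appears to hook)
    """
    if pre_ln_hook is not None:
        resid = pre_ln_hook(resid)
    normed = layer_norm(resid)
    if post_ln_hook is not None:
        normed = post_ln_hook(normed)
    return normed


def shift(t):
    """Example intervention: add a constant to every element."""
    return [v + 3.0 for v in t]


resid = [1.0, 2.0, 4.0, 8.0]
baseline = attn_input(resid)
pre_patched = attn_input(resid, pre_ln_hook=shift)
post_patched = attn_input(resid, post_ln_hook=shift)

# Same shape in all three cases, but the intervention only survives post-LN:
# LayerNorm subtracts the mean, so a constant shift applied pre-LN cancels out.
print(max(abs(a - b) for a, b in zip(baseline, pre_patched)))
print(max(abs(a - b) for a, b in zip(baseline, post_patched)))
```

The same hook function therefore has a different effect depending on where the alias fires, which is why patching and gradient-based tooling written against the legacy names can silently change behavior.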

This means compatibility mode may currently provide:
- matching or near-matching logits,
- matching hook names,
- matching hook shapes,

while still not preserving the same hooked activations and backward gradients.

## Observed behavior

For GPT-2, the current behavior appears to be:

- logits are close,
- legacy hook aliases can be registered,
- hook shapes match,
- but Q/K/V input hook values do not match legacy `HookedTransformer`,
- and downstream attribution-style scores diverge substantially.

There may also be a similar issue for:

- `blocks.{layer}.hook_mlp_in`
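A check along the following lines separates "name and shape match" from "values match". This is a sketch under assumptions: `compare_caches` and its helpers are hypothetical, the caches are plain dicts of nested lists (cached tensors could be converted with `.tolist()`), and the numbers in the usage example are made-up placeholders, not measurements from GPT-2.

```python
def shape_of(x):
    """Shape of a rectangular nested list, e.g. [[1, 2], [3, 4]] -> (2, 2)."""
    shape = []
    while isinstance(x, list):
        shape.append(len(x))
        if not x:
            break
        x = x[0]
    return tuple(shape)


def flatten(x):
    """Yield the scalar leaves of a nested list in order."""
    if isinstance(x, list):
        for item in x:
            yield from flatten(item)
    else:
        yield x


def compare_caches(legacy, bridge, names, atol=1e-4):
    """Classify each hook name as missing, shape mismatch, value mismatch, or match."""
    report = {}
    for name in names:
        if name not in legacy or name not in bridge:
            report[name] = "missing"
            continue
        a, b = legacy[name], bridge[name]
        if shape_of(a) != shape_of(b):
            report[name] = "shape mismatch"
            continue
        max_diff = max(
            (abs(p - q) for p, q in zip(flatten(a), flatten(b))), default=0.0
        )
        report[name] = "match" if max_diff <= atol else "value mismatch"
    return report


# Placeholder data only: illustrates the shapes-match-but-values-differ case.
legacy_cache = {
    "blocks.0.hook_q_input": [[0.5, -1.0], [2.0, 0.0]],
    "blocks.0.hook_mlp_in": [[1.0, 1.0], [1.0, 1.0]],
}
bridge_cache = {
    "blocks.0.hook_q_input": [[0.1, -0.3], [0.9, 0.0]],
    "blocks.0.hook_mlp_in": [[1.0, 1.0], [1.0, 1.0]],
}
report = compare_caches(
    legacy_cache,
    bridge_cache,
    ["blocks.0.hook_q_input", "blocks.0.hook_k_input", "blocks.0.hook_mlp_in"],
)
print(report)
```

Running the same comparison on gradients captured at these hooks would cover the backward-pass half of the claim.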

## Checklist
- [x] I have checked that there is no similar issue

It seems to me like one of these should be made explicit:

1. Compatibility mode should preserve legacy hook semantics.
2. If exact semantic parity is not intended, the aliases/docs should say they are name/shape compatible rather than semantically equivalent.
3. Both should exist:
   - bridge-native canonical post-LN hooks,
   - legacy-compatible aliases for legacy pre-LN hook semantics.

My preference would be option 3, since it preserves bridge-native behavior while making compatibility mode meaningful for legacy tooling.
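As a rough picture of option 3, the toy class below exposes both computation points under separate names. Everything here is hypothetical: `ToyBlock` is not bridge code, `native_q_input_post_ln` is an invented placeholder for whatever the bridge-native name would be, and the LayerNorm omits learned parameters.

```python
import math


def layer_norm(x, eps=1e-5):
    """LayerNorm over one vector, without learned scale/bias (toy assumption)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


class ToyBlock:
    """Toy attention-input path with two distinct, separately named hook points."""

    LEGACY_PRE_LN = "hook_q_input"             # legacy alias, fires pre-LN
    NATIVE_POST_LN = "native_q_input_post_ln"  # hypothetical bridge-native name

    def __init__(self):
        self.hooks = {}

    def add_hook(self, name, fn):
        if name not in (self.LEGACY_PRE_LN, self.NATIVE_POST_LN):
            raise KeyError(name)
        self.hooks[name] = fn

    def _fire(self, name, value, cache):
        """Run the hook registered under `name` (if any) and record the result."""
        fn = self.hooks.get(name)
        if fn is not None:
            value = fn(value)
        cache[name] = list(value)
        return value

    def forward(self, resid):
        cache = {}
        resid = self._fire(self.LEGACY_PRE_LN, resid, cache)
        normed = layer_norm(resid)
        normed = self._fire(self.NATIVE_POST_LN, normed, cache)
        return normed, cache


block = ToyBlock()
out, cache = block.forward([1.0, 2.0, 4.0, 8.0])
print(cache[ToyBlock.LEGACY_PRE_LN])   # raw residual-stream values
print(cache[ToyBlock.NATIVE_POST_LN])  # normalized values
```

With this split, legacy tooling keeps the pre-LN semantics it was written against, while bridge-native code can still address the post-LN tensor explicitly.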

I’d be happy to work on this and open a PR if this direction sounds right to maintainers.
