[V1] Models - Guardrails: Latest Message Evaluation (Phase 2)

## Summary

Add `guardLatestUserMessage` guardrail evaluation support to the Bedrock model provider. This feature allows evaluating only the latest user message with guardrails instead of the entire conversation. This is Phase 2 of the guardrails implementation (parent issue: #484).

## Usage

```typescript
import { BedrockModel } from '@strands-agents/sdk/models'

const model = new BedrockModel({
  modelId: 'us.anthropic.claude-sonnet-4-20250514-v1:0',
  guardrailConfig: {
    guardrailIdentifier: 'my-guardrail-id',
    guardrailVersion: '1',
    trace: 'enabled',
    // Only evaluate the latest user message
    guardLatestUserMessage: true,
  },
})
```

## Background

When `guardLatestUserMessage` is enabled, only the most recent user message is sent to guardrails for evaluation instead of the entire conversation. This can:
- Improve performance in multi-turn conversations
- Reduce costs (fewer tokens evaluated)
- Avoid re-evaluating messages that have already been validated

The implementation wraps the latest user message content in `guardContent` blocks, which signals to Bedrock's guardrails to evaluate only that content.

---

## Implementation Requirements

> **Note:** The implementation approach has been corrected to match the Python SDK after review (see [sdk-python/src/strands/models/bedrock.py](https://github.com/strands-agents/sdk-python/blob/main/src/strands/models/bedrock.py) lines 368-446).

### 1. Extended GuardrailConfig

Add `guardLatestUserMessage` option to `BedrockGuardrailConfig` (in `bedrock.ts`):

```typescript
export interface BedrockGuardrailConfig {
  // ... existing options from Phase 1 ...
  
  /**
   * Only evaluate the latest user message with guardrails.
   * When true, wraps the latest user message's text/image content in guardContent blocks.
   * This can improve performance and reduce costs in multi-turn conversations.
   * 
   * @remarks
   * The implementation finds the last user message containing text or image content
   * (not just the last message), ensuring correct behavior during tool execution cycles
   * where toolResult messages may follow the user's actual input.
   * 
   * @defaultValue false
   */
  guardLatestUserMessage?: boolean
}
```

### 2. Helper Method: Find Last User Text/Image Message Index

Add a private helper method to find the correct message to wrap:

```typescript
/**
 * Find the index of the last user message containing text or image content.
 * 
 * This is used for guardLatestUserMessage guardrail evaluation to ensure that guardContent 
 * wrapping targets the correct message even when toolResult messages (role='user') follow 
 * the actual user text/image input during tool execution cycles.
 * 
 * @param messages - Array of messages to search
 * @returns Index of the last user message with text/image content, or undefined if not found
 */
private _findLastUserTextMessageIndex(messages: Message[]): number | undefined {
  for (let idx = messages.length - 1; idx >= 0; idx--) {
    const msg = messages[idx]
    if (
      msg.role === 'user' &&
      msg.content.some((block) => block.type === 'textBlock' || block.type === 'imageBlock')
    ) {
      return idx
    }
  }
  return undefined
}
```

### 3. Message Formatting for Latest Message

Update `_formatMessages` to wrap the latest user message in `guardContent` blocks when enabled:

```typescript
private _formatMessages(messages: Message[]): BedrockMessage[] {
  // Pre-compute the index of the last user message containing text/image content
  // This ensures guardContent wrapping is maintained across tool execution cycles
  const lastUserTextIdx = this._config.guardrailConfig?.guardLatestUserMessage
    ? this._findLastUserTextMessageIndex(messages)
    : undefined

  return messages.reduce<BedrockMessage[]>((acc, message, idx) => {
    const content = message.content
      .map((block) => {
        let formattedBlock = this._formatContentBlock(block)
        
        // Wrap in guardContent if this is the last user text/image message and guardLatestUserMessage is enabled
        if (idx === lastUserTextIdx && formattedBlock !== undefined) {
          if ('text' in formattedBlock) {
            formattedBlock = {
              guardContent: {
                text: {
                  text: formattedBlock.text as string,
                  qualifiers: [],
                },
              },
            }
          } else if ('image' in formattedBlock) {
            formattedBlock = {
              guardContent: {
                image: formattedBlock.image,
              },
            }
          }
          // Other content types (toolUse, toolResult, etc.) pass through unchanged
        }
        
        return formattedBlock
      })
      .filter((block) => block !== undefined)

    if (content.length > 0) {
      acc.push({ role: message.role, content })
    }

    return acc
  }, [])
}
```

### 4. Key Implementation Considerations

1. **Only `text` and `image` content blocks should be wrapped in `guardContent`**
   - `toolUse`, `toolResult`, `reasoningBlock`, `cachePointBlock`, etc. pass through unchanged

2. **The wrapping targets the last user message with text/image content**
   - NOT simply the last message in the array
   - This is critical for tool execution cycles where `toolResult` messages (role='user') follow the actual user input

3. **Existing `GuardContentBlock` in messages should be preserved as-is**
   - If a user explicitly provides `GuardContentBlock`, don't double-wrap

4. **Edge case: No user messages with text/image content**
   - If `_findLastUserTextMessageIndex` returns `undefined`, no wrapping occurs

### Files to Modify

1. **`src/models/bedrock.ts`**
   - Add `guardLatestUserMessage?: boolean` to `BedrockGuardrailConfig` interface
   - Add `_findLastUserTextMessageIndex()` private method
   - Update `_formatMessages()` to wrap latest user message content
   
2. **`src/models/__tests__/bedrock.test.ts`**
   - Test: guardLatestUserMessage wrapping text content
   - Test: guardLatestUserMessage wrapping image content
   - Test: guardLatestUserMessage with tool execution cycles (toolResult messages don't get wrapped)
   - Test: guardLatestUserMessage disabled (default behavior unchanged)
   - Test: Non-user messages not wrapped
   - Test: Multi-turn conversations
   - Test: Mixed content (text + toolResult in same conversation)
   - Test: Edge case - no user messages with text/image content

## Acceptance Criteria

- [ ] `guardLatestUserMessage` option added to `BedrockGuardrailConfig`
- [ ] When `guardLatestUserMessage: true`, latest user message text is wrapped in `guardContent`
- [ ] When `guardLatestUserMessage: true`, latest user message images are wrapped in `guardContent`
- [ ] `toolResult` messages are NOT wrapped (even though role='user')
- [ ] Non-user messages are not wrapped
- [ ] Only the correct "last user text/image message" is wrapped, not the last message
- [ ] Default behavior (`guardLatestUserMessage: false` or undefined) unchanged
- [ ] Existing explicit `GuardContentBlock` in messages preserved
- [ ] Unit tests cover all scenarios including tool execution cycles
- [ ] TSDoc comments updated with proper `@remarks` explaining the behavior

## Reference

- **Python SDK**: https://github.com/strands-agents/sdk-python/blob/main/src/strands/models/bedrock.py 
  - See `guardrail_latest_message` config option (lines 90, 116)
  - See `_find_last_user_text_message_index` method (lines 368-383)
  - See wrapping logic in `_format_bedrock_messages` (lines 413-445)
- Bedrock Guardrails: https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails.html
- Parent Issue: #484

## Dependencies

- **Requires Phase 1 completion**: ✅ [V1] Models - Guardrails: Configuration & Redaction (Phase 1) - COMPLETE

pFad - Phone/Frame/Anonymizer/Declutterfier! Saves Data!

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[V1] Models - Guardrails: Latest Message Evaluation (Phase 2) #565

Summary

Usage

Background

Implementation Requirements

1. Extended GuardrailConfig

2. Helper Method: Find Last User Text/Image Message Index

3. Message Formatting for Latest Message

4. Key Implementation Considerations

Files to Modify

Acceptance Criteria

Reference

Dependencies

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Pfad - The Proxy pFad © 2024 Your Company Name. All rights reserved.

pFad - Phone/Frame/Anonymizer/Declutterfier! Saves Data!

[V1] Models - Guardrails: Latest Message Evaluation (Phase 2) #565

Description

Summary

Usage

Background

Implementation Requirements

1. Extended GuardrailConfig

2. Helper Method: Find Last User Text/Image Message Index

3. Message Formatting for Latest Message

4. Key Implementation Considerations

Files to Modify

Acceptance Criteria

Reference

Dependencies

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Pfad - The Proxy pFad © 2024 Your Company Name. All rights reserved.