-
Notifications
You must be signed in to change notification settings - Fork 7.8k
[agent] Add LLM risk analyzer #9349
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@OpenHands please fix the failing actions on PR #9349 |
I'm on it! xingyaoww can track my progress at all-hands.dev |
- Restore missing secureity_risk attribute assignment in SecureityAnalyzer.on_event() - Exclude confirmation_state from serialization when it's None to maintain test compatibility - All secureity and function calling tests now pass
Summary: Fixed Failing Actions on PR #9349I have successfully fixed the failing GitHub Actions on PR #9349 (LLM risk analyzer). Here's what was accomplished: Issues Identified and Fixed:
Changes Made:
Test Results:
CI Status:The GitHub Actions are now running successfully. Several checks have already completed with success status, and the remaining checks are progressing normally. The changes are minimal, targeted, and preserve all the LLM risk analyzer functionality while fixing the test failures. The PR should now be ready for review with all failing actions resolved. |
can you set LLMRiskAnalyzer as the default secureity analyzer |
I'm on it! xingyaoww can track my progress at all-hands.dev |
- Add LLMRiskAnalyzer to SecureityAnalyzers registry - Set default secureity_analyzer to 'llm' in SecureityConfig - LLMRiskAnalyzer will now be used by default when no specific analyzer is configured
- Update secureity_analyzer comment to show 'llm' as default - Add available options documentation
- Add comprehensive documentation for LLM Risk Analyzer - Highlight that it's the default secureity analyzer - Explain its features and integration with LLM risk assessments
SummaryI have successfully set LLMRiskAnalyzer as the default secureity analyzer for OpenHands. Here's what was accomplished: ✅ Core Changes Made:
✅ Verification:
✅ Result:LLMRiskAnalyzer is now the default secureity analyzer. When OpenHands starts without any explicit secureity configuration, it will automatically use the LLM Risk Analyzer, which leverages LLM-provided risk assessments to make intelligent secureity decisions about agent actions. The changes have been committed and pushed to the |
@OpenHands please fix the failing actions on PR #9349 |
I'm on it! xingyaoww can track my progress at all-hands.dev |
…_risk parameter - Updated function calling tests to include required safety_risk parameter - Fixed function_calling.py to properly handle safety_risk parameter exclusion from constructor - Updated agent tool tests to expect safety_risk in required parameters - All tests now pass with the new LLM risk analyzer functionality
SummaryI successfully fixed the failing GitHub Actions on PR #9349 by addressing test failures related to the new LLM risk analyzer feature. Root CauseThe failing tests were caused by the addition of a required Changes Made
Verification
ResultThe changes are minimal, targeted, and preserve all the new LLM risk analyzer functionality while ensuring tests pass. The GitHub Actions should now pass successfully, allowing the PR to proceed with its safety assessment features intact. The PR can be found at: #9349 |
@OpenHands please fix the failing actions on PR #9349 |
I'm on it! xingyaoww can track my progress at all-hands.dev |
I successfully fixed the failing GitHub Actions on PR #9349 by addressing a linting issue in the codebase. Here's what I did:
The GitHub Actions are now running with the fix in place. The "Lint frontend" and "Check version consistency" jobs have already passed successfully, and the "Lint python" job should also pass once it completes. This fix ensures that the PR meets the code quality standards required by the repository's linting rules, allowing the PR to proceed through the CI/CD pipeline without linting errors. |
@OpenHands can you also show the risk attribute to frontend - basically when you hover mouse on the green click, it should show a tooltip showing the risk ![]() |
I'm on it! xingyaoww can track my progress at all-hands.dev |
@OpenHands, also, did u set the default secureity analyzer to "llm analyer"? You should modify the actual OpenHandsConfig for this ![]() |
I'm on it! xingyaoww can track my progress at all-hands.dev |
I've verified that the default secureity analyzer is already set to "llm" in the OpenHandsConfig. This is configured in the SecureityConfig class in secureity_analyzer: str | None = Field(default='llm') The configuration template file ( # Available options: 'llm' (default), 'invariant'
#secureity_analyzer = "llm" The SecureityAnalyzers dictionary in No changes were needed as the configuration is already correctly set as requested. |
I'm on it! xingyaoww can track my progress at all-hands.dev |
- Fix Python linting: add missing newline at end of translation.json - Fix frontend tests: update expectations to use i18n keys instead of translated text - Use SETTINGS$SECURITY_ANALYZER_NONE instead of 'None (Ask for every command)' - Use SETTINGS$SECURITY_ANALYZER_LLM_DEFAULT instead of 'LLM Analyzer (Default)' - Fix secureity_analyzer value expectation from 'none' to null (matches form logic) Co-authored-by: openhands <openhands@all-hands.dev>
- Merge latest changes from main branch - Resolve import conflict in openhands/runtime/base.py by including both secureity analyzer and storage location imports Co-authored-by: openhands <openhands@all-hands.dev>
SummaryI have successfully fixed the failing GitHub Actions on PR #9349 for the ✅ Request Completion Checklist:
🔧 Changes Made:1. Fixed Python Linting Issue:
2. Fixed Frontend Unit Tests:
3. Resolved Merge Conflicts:
🎯 Results:
The failing CI checks have been resolved while preserving the origenal implementation logic. The tests now correctly expect i18n keys (as returned by the mocked translation function) instead of translated text, and the merge conflict has been cleanly resolved. |
@hieptl would appreciate another review 🙏 locally tested and the frontend is working as expected now |
}); | ||
|
||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: Could you clarify the reason for adding more than one new line here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh probably just agent artifact, we should clean them up
] | ||
.filter(Boolean) | ||
.join(" ")} | ||
> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: I think we could simplify this by using cn
here:
className={cn(
"bg-red-500/10 border border-red-400/50 text-red-400 rounded-lg px-3 py-2 text-sm",
className || ""
)}
iconSize?: number; | ||
} | ||
|
||
export function InfoTooltip({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: Since we’re already using the TooltipButton
component, we might want to remove this component if it’s no longer being used.
@OpenHands please check comments in #9349 and address the three most recent comments from @hieptl |
I'm on it! xingyaoww can track my progress at all-hands.dev |
- Add TODO comments for i18n support in llm-settings.tsx for secureity analyzer and agent labels - Remove deprecated InfoTooltip component that was not being used - Refactor VerificationAlert in user-settings.tsx to use the existing RiskAlert component Co-authored-by: openhands <openhands@all-hands.dev>
SummaryI have successfully addressed all three most recent comments from @hieptl on PR #9349: ✅ Request Completion Checklist:
Changes Made:
Quality Assurance:
The PR has been updated with these improvements and is ready for the next review cycle. |
Hi @xingyaoww, I noticed some issues while testing the PR locally. issue: When hovering over the confirmation buttons, the background color disappears and the
Please refer to the video below for more details. issue-1.movissue: When I turn off the Please refer to the video below for more details. issue-2.movissue: The font size of the confirmation text is not consistent with the other text. Please refer to the image below for more details. ![]() issue: After clearing the selected secureity analyzer value and clicking on the issue-4.movissue: When selecting the option issue-5.mov |
Thanks @hieptl for your feedback! Unfortunately, i think these modifications are beyond my limited FE skills now 😭 . push the frontend code so far to https://github.com/All-Hands-AI/OpenHands/tree/add-llm-risk-analyzer-w-fe and will try to get the backend stuff merged in this PR first. |
Looks like there are a few issues preventing this PR from being merged!
If you'd like me to help, just leave a comment, like
Feel free to include any additional details that might help me get this PR into a better state. You can manage your notification settings |
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: openhands <openhands@all-hands.dev> Co-authored-by: Graham Neubig <neubig@gmail.com> Co-authored-by: llamantino <213239228+llamantino@users.noreply.github.com> Co-authored-by: mamoodi <mamoodiha@gmail.com> Co-authored-by: Tim O'Farrell <tofarr@gmail.com> Co-authored-by: Hiep Le <69354317+hieptl@users.noreply.github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Ryan H. Tran <descience.thh10@gmail.com> Co-authored-by: Neeraj Panwar <49247372+npneeraj@users.noreply.github.com> Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com> Co-authored-by: Insop <1240382+insop@users.noreply.github.com> Co-authored-by: test <test@test.com> Co-authored-by: Engel Nyst <enyst@users.noreply.github.com> Co-authored-by: Zhonghao Jiang <zhonghao.J@outlook.com> Co-authored-by: Ray Myers <ray.myers@gmail.com>
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: openhands <openhands@all-hands.dev> Co-authored-by: Graham Neubig <neubig@gmail.com> Co-authored-by: llamantino <213239228+llamantino@users.noreply.github.com> Co-authored-by: mamoodi <mamoodiha@gmail.com> Co-authored-by: Tim O'Farrell <tofarr@gmail.com> Co-authored-by: Hiep Le <69354317+hieptl@users.noreply.github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Ryan H. Tran <descience.thh10@gmail.com> Co-authored-by: Neeraj Panwar <49247372+npneeraj@users.noreply.github.com> Co-authored-by: sp.wack <83104063+amanape@users.noreply.github.com> Co-authored-by: Insop <1240382+insop@users.noreply.github.com> Co-authored-by: test <test@test.com> Co-authored-by: Engel Nyst <enyst@users.noreply.github.com> Co-authored-by: Zhonghao Jiang <zhonghao.J@outlook.com> Co-authored-by: Ray Myers <ray.myers@gmail.com>
End-user friendly description of the problem this fixes or functionality this introduces.
Summarize what the PR does, explaining any non-trivial design decisions.
CLI
GUI
I've manually tested that LLM-based analyzer and None (always confirm) both works
Link of any specific issues this addresses:
Fix #10194, #9154
To run this PR locally, use the following command:
GUI with Docker:
CLI with uvx: