
Commit 835e21c

Merge pull request NVIDIA-NeMo#384 from NVIDIA/feature/support-for-exceptions
Initial work on support for exceptions.
2 parents fd8e102 + e0f6909 commit 835e21c

14 files changed, +267 -31 lines changed
docs/user_guides/configuration-guide.md

Lines changed: 65 additions & 0 deletions

@@ -706,6 +706,71 @@ rails:
 
 **IMPORTANT**: This is recommended only when enough examples are provided.
 
+## Exceptions
+
+NeMo Guardrails supports raising exceptions from within flows.
+An exception is an event whose name ends with `Exception`, e.g., `InputRailException`.
+When an exception is raised, the final output is a message with the role set to `exception` and the content
+set to additional information about the exception. For example:
+
+```colang
+define flow input rail example
+  # ...
+  create event InputRailException(message="Input not allowed.")
+```
+
+```json
+{
+  "role": "exception",
+  "content": {
+    "type": "InputRailException",
+    "uid": "45a452fa-588e-49a5-af7a-0bab5234dcc3",
+    "event_created_at": "9999-99-99999:24:30.093749+00:00",
+    "source_uid": "NeMoGuardrails",
+    "message": "Input not allowed."
+  }
+}
+```
+
+### Guardrails Library Exception
+
+By default, all the guardrails included in the [Guardrails Library](./guardrails-library.md) return a predefined message
+when a rail is triggered. You can change this behavior by setting the `enable_rails_exceptions` key to `True` in your
+`config.yml` file:
+
+```yaml
+enable_rails_exceptions: True
+```
+
+When this setting is enabled, the rails are triggered, they will return an exception message.
+To understand better what is happening under the hood, here's how the `self check input` rail is implemented:
+
+```colang
+define flow self check input
+  $allowed = execute self_check_input
+  if not $allowed
+    if $config.enable_rails_exceptions
+      create event InputRailException(message="Input not allowed. The input was blocked by the 'self check input' flow.")
+    else
+      bot refuse to respond
+    stop
+```
+
+When the `self check input` rail is triggered, the following exception is returned.
+
+```json
+{
+  "role": "exception",
+  "content": {
+    "type": "InputRailException",
+    "uid": "45a452fa-588e-49a5-af7a-0bab5234dcc3",
+    "event_created_at": "9999-99-99999:24:30.093749+00:00",
+    "source_uid": "NeMoGuardrails",
+    "message": "Input not allowed. The input was blocked by the 'self check input' flow."
+  }
+}
+```
+
 ## Knowledge base Documents
 
 By default, an `LLMRails` instance supports using a set of documents as context for generating the bot responses. To include documents as part of your knowledge base, you must place them in the `kb` folder inside your config folder:
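
The message shape documented in the hunk above (role `exception`, with `type` and `message` inside `content`) could be consumed on the application side roughly as follows. This is a minimal sketch that works on plain dictionaries shaped like the JSON example; `RailBlockedError` and `unwrap_response` are hypothetical names introduced here for illustration and are not part of NeMo Guardrails.

```python
from typing import Any, Dict


class RailBlockedError(Exception):
    """Hypothetical application-side error for a message with role `exception`."""

    def __init__(self, event_type: str, message: str) -> None:
        super().__init__(f"{event_type}: {message}")
        self.event_type = event_type
        self.message = message


def unwrap_response(response: Dict[str, Any]) -> Any:
    """Return the message content, or raise if the message carries an exception."""
    if response.get("role") == "exception":
        content = response["content"]
        raise RailBlockedError(content["type"], content.get("message", ""))
    # Any other role is treated as a regular message (an assumption of this sketch).
    return response["content"]


# Example message copied from the documentation added in this commit.
blocked = {
    "role": "exception",
    "content": {
        "type": "InputRailException",
        "uid": "45a452fa-588e-49a5-af7a-0bab5234dcc3",
        "event_created_at": "9999-99-99999:24:30.093749+00:00",
        "source_uid": "NeMoGuardrails",
        "message": "Input not allowed.",
    },
}

try:
    unwrap_response(blocked)
except RailBlockedError as err:
    print(err)  # InputRailException: Input not allowed.
```

Turning the message into a Python exception is only one possible design; an application could equally branch on the `type` field and show its own user-facing text.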

examples/configs/guardrails_only/input/config.co

Lines changed: 8 additions & 2 deletions

@@ -7,9 +7,15 @@ define bot deny
 define subflow dummy input rail
   """A dummy input rail which checks if the word "dummy" is included in the text."""
   if "dummy" in $user_message
-    bot deny
+    if $config.enable_rails_exceptions
+      create event DummyInputRailException(message="Dummy input detected. The user's message contains the word 'dummy'.")
+    else
+      bot deny
     stop
 
 define subflow allow input
-  bot allow
+  if $config.enable_rails_exceptions
+    create event AllowInputRailException(message="Allow input triggered. The bot will respond with 'ALLOW'.")
+  else
+    bot allow
   stop

examples/configs/guardrails_only/output/config.co

Lines changed: 9 additions & 3 deletions

@@ -12,11 +12,17 @@ define bot deny
   "DENY"
 
 define subflow dummy output rail
-  """A dummy input rail which checks if the word "dummy" is included in the text."""
+  """A dummy output rail which checks if the word "dummy" is included in the text."""
   if "dummy" in $bot_message
-    bot deny
+    if $config.enable_rails_exceptions
+      create event DummyOutputRailException(message="Dummy output detected. The bot's message contains the word 'dummy'.")
+    else
+      bot deny
     stop
 
 define subflow allow output
-  bot allow
+  if $config.enable_rails_exceptions
+    create event AllowOutputRailException(message="Allow output triggered. The bot will respond with 'ALLOW'. To see it in action you must disable the `enable_rails_exceptions` configuration setting.")
+  else
+    bot allow
   stop

examples/configs/rag/fact_checking/rails/factcheck.co

Lines changed: 10 additions & 3 deletions

@@ -24,21 +24,28 @@ define subflow check facts
 
     $accuracy = execute check_facts
     if $accuracy < 0.4
-      bot inform answer unknown
+      if $config.enable_rails_exceptions
+        create event FactCheckLowAccuracyRailException(message="Fact check triggered. The accuracy of the response is below 0.4.")
+      else
+        bot inform answer unknown
       stop
 
     if $accuracy < 0.6
       # We need to provide a warning in this case
+      # TODO: Add a warning message
       $bot_message_potentially_inaccurate = True
 
-
 define flow flag potentially inaccurate response
   """Tell the user that the previous answer is potentially inaccurate."""
   bot ...
 
   if $bot_message_potentially_inaccurate
     $bot_message_potentially_inaccurate = False
-    bot inform answer potentially inaccurate
+    if $config.enable_rails_exceptions
+      create event PotentiallyInaccurateResponseRailException(message="Potentially inaccurate response detected. The bot's response may be inaccurate.")
+    else
+      bot inform answer potentially inaccurate
+    stop
 
 define bot inform answer potentially inaccurate
   "Attention: the answer above is potentially inaccurate."

nemoguardrails/library/activefence/flows.co

Lines changed: 36 additions & 10 deletions

@@ -18,46 +18,72 @@ define subflow activefence moderation
   $result = execute call activefence api
 
   if $result.max_risk_score > 0.7
-    bot refuse to respond
+    if $config.enable_rails_exceptions
+      create event ActiveFenceModerationRailException(message="ActiveFence moderation triggered. The maximum risk score exceeded the threshold.")
+    else
+      bot refuse to respond
     stop
 
 define subflow activefence moderation detailed
   """Guardrail based on individual risk scores."""
   $result = execute call activefence api
 
   if $result.violations.get("abusive_or_harmful.harassment_or_bullying", 0) > 0.8
-    bot inform cannot engage in abusive or harmful behavior
+    if $config.enable_rails_exceptions
+      create event ActiveFenceHarassmentRailException(message="ActiveFence moderation triggered. The harassment or bullying risk score exceeded the threshold.")
+    else
+      bot inform cannot engage in abusive or harmful behavior
     stop
 
   if $result.violations.get("abusive_or_harmful.profanity", 0) > 0.7
-    bot inform cannot engage in abusive or harmful behavior
+    if $config.enable_rails_exceptions
+      create event ActiveFenceProfanityRailException(message="ActiveFence moderation triggered. The profanity risk score exceeded the threshold.")
+    else
+      bot inform cannot engage in abusive or harmful behavior
     stop
 
   if $result.violations.get("abusive_or_harmful.hate_speech", 0) > 0.6
-    bot inform cannot engage in abusive or harmful behavior
+    if $config.enable_rails_exceptions
+      create event ActiveFenceHateSpeechRailException(message="ActiveFence moderation triggered. The hate speech risk score exceeded the threshold.")
+    else
+      bot inform cannot engage in abusive or harmful behavior
     stop
 
   if $result.violations.get("abusive_or_harmful.child_grooming", 0) > 0.4
-    bot inform cannot engage in abusive or harmful behavior
+    if $config.enable_rails_exceptions
+      create event ActiveFenceChildGroomingRailException(message="ActiveFence moderation triggered. The child grooming risk score exceeded the threshold.")
+    else
+      bot inform cannot engage in abusive or harmful behavior
     stop
 
   if $result.violations.get("abusive_or_harmful.general_violence", 0) > 0.7
-    bot inform cannot engage in abusive or harmful behavior
+    if $config.enable_rails_exceptions
+      create event ActiveFenceGeneralViolenceRailException(message="ActiveFence moderation triggered. The general violence risk score exceeded the threshold.")
+    else
+      bot inform cannot engage in abusive or harmful behavior
     stop
 
   if $result.violations.get("self_harm.general", 0) > 0.8
-    bot inform cannot engage in self harm behavior
+    if $config.enable_rails_exceptions
+      create event ActiveFenceSelfHarmRailException(message="ActiveFence moderation triggered. The self harm risk score exceeded the threshold.")
+    else
+      bot inform cannot engage in self harm behavior
     stop
 
   if $result.violations.get("adult_content.general", 0) > 0.3
-    bot inform cannot engage with inappropriate content
+    if $config.enable_rails_exceptions
+      create event ActiveFenceAdultContentRailException(message="ActiveFence moderation triggered. The adult content risk score exceeded the threshold.")
+    else
+      bot inform cannot engage with inappropriate content
     stop
 
   if $result.violations.get("privacy_violation.pii", 0) > 0.8
-    bot inform cannot engage with sensitive content
+    if $config.enable_rails_exceptions
+      create event ActiveFencePrivacyViolationRailException(message="ActiveFence moderation triggered. The privacy violation risk score exceeded the threshold.")
+    else
+      bot inform cannot engage with sensitive content
     stop
 
-
 define bot inform cannot engage in abusive or harmful behavior
   "I will not engage in any abusive or harmful behavior."
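
The `activefence moderation detailed` subflow above checks eight violation scores in a fixed order, each against its own threshold, and stops at the first one exceeded. The same logic can be read more compactly as a lookup table. The following Python sketch only illustrates that reading; `DETAILED_CHECKS` and `first_triggered` are hypothetical names, and the keys, thresholds, and event names are copied from the diff.

```python
from typing import Dict, List, Optional, Tuple

# (violation key, threshold, exception event name), in the order the flow checks them.
DETAILED_CHECKS: List[Tuple[str, float, str]] = [
    ("abusive_or_harmful.harassment_or_bullying", 0.8, "ActiveFenceHarassmentRailException"),
    ("abusive_or_harmful.profanity", 0.7, "ActiveFenceProfanityRailException"),
    ("abusive_or_harmful.hate_speech", 0.6, "ActiveFenceHateSpeechRailException"),
    ("abusive_or_harmful.child_grooming", 0.4, "ActiveFenceChildGroomingRailException"),
    ("abusive_or_harmful.general_violence", 0.7, "ActiveFenceGeneralViolenceRailException"),
    ("self_harm.general", 0.8, "ActiveFenceSelfHarmRailException"),
    ("adult_content.general", 0.3, "ActiveFenceAdultContentRailException"),
    ("privacy_violation.pii", 0.8, "ActiveFencePrivacyViolationRailException"),
]


def first_triggered(violations: Dict[str, float]) -> Optional[str]:
    """Return the event name for the first score strictly above its threshold."""
    for key, threshold, event_name in DETAILED_CHECKS:
        # A missing key defaults to 0, matching `.get(key, 0)` in the flow.
        if violations.get(key, 0) > threshold:
            return event_name
    return None


print(first_triggered({"abusive_or_harmful.profanity": 0.75}))
# ActiveFenceProfanityRailException
```

Because the comparison is strict, a score exactly equal to its threshold does not trigger the rail.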

nemoguardrails/library/hallucination/flows.co

Lines changed: 4 additions & 1 deletion

@@ -23,5 +23,8 @@ define subflow self check hallucination
     $check_hallucination = False
 
     if $is_hallucination
-      bot inform answer unknown
+      if $config.enable_rails_exceptions
+        create event CheckHallucinationRailException(message="Hallucination detected. The previous answer may not be accurate")
+      else
+        bot inform answer unknown
       stop

nemoguardrails/library/jailbreak_detection/flows.co

Lines changed: 7 additions & 4 deletions

@@ -1,12 +1,15 @@
+define bot refuse to respond
+  "I'm sorry, I can't respond to that."
+
 define subflow jailbreak detection heuristics
   """
   Heuristic checks to assess whether the user's prompt is an attempted jailbreak.
   """
   $is_jailbreak = execute jailbreak_detection_heuristics
 
   if $is_jailbreak
-    bot refuse to respond
+    if $config.enable_rails_exceptions
+      create event JailbreakDetectionRailException(message="Jailbreak attempt detected. The user's prompt was identified as an attempted jailbreak. Please ensure your prompt adheres to the guidelines.")
+    else
+      bot refuse to respond
     stop
-
-define bot refuse to respond
-  "I'm sorry, I can't respond to that."

nemoguardrails/library/llama_guard/flows.co

Lines changed: 8 additions & 2 deletions

@@ -8,7 +8,10 @@ define flow llama guard check input
   $llama_guard_policy_violations = $llama_guard_response["policy_violations"]
 
   if not $allowed
-    bot refuse to respond
+    if $config.enable_rails_exceptions
+      create event LlamaGuardInputRailException(message="Input not allowed. The input was blocked by the 'llama guard check input' flow. Please ensure your input meets the required criteria.")
+    else
+      bot refuse to respond
     stop
 
 define flow llama guard check output
@@ -17,5 +20,8 @@ define flow llama guard check output
   $llama_guard_policy_violations = $llama_guard_response["policy_violations"]
 
   if not $allowed
-    bot refuse to respond
+    if $config.enable_rails_exceptions
+      create event LlamaGuardOutputRailException(message="Output not allowed. The output was blocked by the 'llama guard check output' flow. Please ensure your output meets the required criteria.")
+    else
+      bot refuse to respond
     stop

nemoguardrails/library/self_check/facts/flows.co

Lines changed: 4 additions & 1 deletion

@@ -12,5 +12,8 @@ define subflow self check facts
 
     $accuracy = execute self_check_facts
     if $accuracy < 0.5
-      bot refuse to respond
+      if $config.enable_rails_exceptions
+        create event FactCheckRailRailException(message="Fact check failed. The accuracy of the previous answer was below the required threshold.")
+      else
+        bot refuse to respond
       stop

nemoguardrails/library/self_check/input_check/flows.co

Lines changed: 4 additions & 1 deletion

@@ -5,5 +5,8 @@ define flow self check input
   $allowed = execute self_check_input
 
   if not $allowed
-    bot refuse to respond
+    if $config.enable_rails_exceptions
+      create event InputRailException(message="Input not allowed. The input was blocked by the 'self check input' flow.")
+    else
+      bot refuse to respond
     stop
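
Every flow touched by this commit applies the same branch once its rail has triggered: if `enable_rails_exceptions` is set, create an event whose name ends with `Exception`; otherwise fall back to the previous bot message. A rough Python rendering of that branch, for readers less familiar with Colang, might look like the sketch below. `rail_outcome` and its return shape are assumptions made for illustration; the real event also carries fields such as `uid` and `source_uid` that this sketch leaves out.

```python
from typing import Dict, Union

RailResult = Union[Dict[str, str], str]


def rail_outcome(
    enable_rails_exceptions: bool,
    exception_type: str,
    exception_message: str,
    fallback_bot_message: str,
) -> RailResult:
    """Mirror the branch each modified flow takes after its rail has triggered."""
    if not exception_type.endswith("Exception"):
        # The docs in this commit define an exception as an event
        # whose name ends with `Exception`.
        raise ValueError(f"{exception_type!r} does not end with 'Exception'")
    if enable_rails_exceptions:
        # Corresponds to `create event <Name>Exception(message=...)`.
        return {"type": exception_type, "message": exception_message}
    # Corresponds to the pre-existing `bot ...` message.
    return fallback_bot_message


print(rail_outcome(True, "InputRailException", "Input not allowed.", "bot refuse to respond"))
# {'type': 'InputRailException', 'message': 'Input not allowed.'}
print(rail_outcome(False, "InputRailException", "Input not allowed.", "bot refuse to respond"))
# bot refuse to respond
```

In both branches the Colang flows then execute `stop`, so processing ends either way; only the shape of the final output differs.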
