Multiple objects returned in results for same message_index in chat completions - detection orchestrator API #302

Closed
swith004 opened this issue Feb 10, 2025 · 3 comments · Fixed by #310
Labels: bug (Something isn't working)

Comments

swith004 commented Feb 10, 2025

Describe the bug

There appears to be a bug in the json_response["detections"] section returned by the chat completions - detection orchestrator API. When multiple input detectors return results for the same message, two separate objects are created with the same message_index, instead of a single object for that message_index containing both results.

To Reproduce

Make a call to the chat completions - detection API with the following payload:

import requests

# MODEL_ID, HAP_MODEL, PII_MODEL, URL, and HEADERS are placeholders defined elsewhere.
payload = {
    "messages": [
        {
            "content": "this is a stupid sentence",
            "role": "user",
            "name": "string"
        },
        {
            "content": "this is a nice sentence",
            "role": "user",
            "name": "string"
        },
        {
            "content": "this is a stupid sentence. My email address is foo@bar.com.",
            "role": "user",
            "name": "string"
        }
    ],
    "model": MODEL_ID,
    "n": 1,
    "temperature": 1,
    "top_p": 1,
    "user": "user-1234",
    "detectors": {
        "input": {
            HAP_MODEL: {},
            PII_MODEL: {}
        }
    }
}
# verify=False skips TLS certificate verification.
response = requests.post(URL, json=payload, headers=HEADERS, verify=False)
json_response = response.json()

# Part of the response (json_response["detections"])
{'input': [{'message_index': 2,
   'results': [{'start': 47,
     'end': 58,
     'text': 'foo@bar.com',
     'detection': 'EmailAddress',
     'detection_type': 'pii',
     'detector_id': 'en_syntax_rbr_pii',
     'score': 0.8}]},
  {'message_index': 2,
   'results': [{'start': 0,
     'end': 26,
     'text': 'this is a stupid sentence.',
     'detection': 'has_HAP',
     'detection_type': 'hap',
     'detector_id': 'en_syntax_slate.38m.hap',
     'score': 0.9900600910186768}]}]}
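
A quick way to spot the problem in a response is to count how many objects share each message_index. The sketch below is only an illustration: `repeated_indices` is a hypothetical helper name, not part of the orchestrator or its client, and the sample list is a trimmed copy of the 'input' list shown above.

```python
from collections import Counter


def repeated_indices(entries, index_key="message_index"):
    """Return the index values that appear in more than one entry, sorted."""
    counts = Counter(entry[index_key] for entry in entries)
    return sorted(index for index, count in counts.items() if count > 1)


# Trimmed copy of json_response["detections"]["input"] from the response above.
observed_input = [
    {"message_index": 2, "results": [{"start": 47, "end": 58, "detection_type": "pii"}]},
    {"message_index": 2, "results": [{"start": 0, "end": 26, "detection_type": "hap"}]},
]

print(repeated_indices(observed_input))  # [2]
```

An empty list would mean every message_index appears in exactly one object.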

Expected behavior

In the detections section, results for the same message_index should be grouped into a single object and ordered by start offset, as follows:

[{'message_index': 2,
  'results': [
    {'start': 0,
    'end': 26,
    'text': 'this is a stupid sentence.',
    'detection': 'has_HAP',
    'detection_type': 'hap',
    'detector_id': 'en_syntax_slate.38m.hap',
    'score': xxx},
    {'start': 47,
    'end': 58,
    'text': 'foo@bar.com',
    'detection': 'EmailAddress',
    'detection_type': 'pii',
    'detector_id': 'en_syntax_rbr_pii',
    'score': xxx}]}]
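
The expected shape can be reproduced on the client side by merging entries that share an index. This is a minimal sketch of the grouping, not the orchestrator's actual fix (that is in #310); `merge_by_index` is a hypothetical helper name, and the sample data is a trimmed copy of the response in this issue.

```python
def merge_by_index(entries, index_key="message_index"):
    """Group entries sharing an index into one object, results ordered by start."""
    grouped = {}
    for entry in entries:
        # Collect results from every entry that has the same index value.
        grouped.setdefault(entry[index_key], []).extend(entry["results"])
    return [
        {index_key: index, "results": sorted(results, key=lambda r: r["start"])}
        for index, results in sorted(grouped.items())
    ]


# Trimmed copy of the 'input' detections reported in this issue.
observed = [
    {"message_index": 2,
     "results": [{"start": 47, "end": 58, "detection": "EmailAddress"}]},
    {"message_index": 2,
     "results": [{"start": 0, "end": 26, "detection": "has_HAP"}]},
]

merged = merge_by_index(observed)
print(len(merged))                                      # 1
print([r["detection"] for r in merged[0]["results"]])   # ['has_HAP', 'EmailAddress']
```

Passing index_key="choice_index" would apply the same grouping to output detections, assuming they use the same entry layout.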
@swith004 swith004 added the bug Something isn't working label Feb 10, 2025
@swith004 swith004 self-assigned this Feb 10, 2025
@evaline-ju
Collaborator

As discussed with #288 incoming, the bug will likely apply to output detections [affecting choice_index in place of message_index]

@swith004 swith004 changed the title Multiple input results for same message_index in chat completions - detection orchestrator API Fix multiple objects returned in results for message_index and choice_index in chat completions - detection orchestrator API Feb 10, 2025
@swith004 swith004 changed the title Fix multiple objects returned in results for message_index and choice_index in chat completions - detection orchestrator API multiple objects returned in results for message_index and choice_index in chat completions - detection orchestrator API Feb 10, 2025
@swith004 swith004 changed the title multiple objects returned in results for message_index and choice_index in chat completions - detection orchestrator API Multiple objects returned in results for same message_index and choice_index in chat completions - detection orchestrator API Feb 10, 2025
@swith004 swith004 changed the title Multiple objects returned in results for same message_index and choice_index in chat completions - detection orchestrator API Multiple objects returned in results for same message_index chat completions - detection orchestrator API Feb 17, 2025
@swith004 swith004 changed the title Multiple objects returned in results for same message_index chat completions - detection orchestrator API Multiple objects returned in results for same message_index in chat completions - detection orchestrator API Feb 17, 2025
@swith004
Collaborator Author

> As discussed with #288 incoming, the bug will likely apply to output detections [affecting choice_index in place of message_index]

I tested output detection locally with both the latest pre-merge version of the input detection message_index fix and the fixed version now merged. In both cases, output detection does not seem to have this bug: multiple results under the same choice_index are returned in a single object. For instance, the following request yields the response below:

curl -X 'POST' \
  'http://localhost:8033/api/v2/chat/completions-detection' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "messages": [
    {
      "content": "this is a stupid sentence",
      "role": "user",
      "name": "string"
    },
    {
      "content": "this is a nice sentence",
      "role": "user",
      "name": "string"
    },
    {
      "content": "help me come up with a fictional character! include their phone number and ssn. ",
      "role": "user",
      "name": "string"
    }
  ],
  "model": "my_model",
  "n": 1,
  "temperature": 1,
  "top_p": 1,
  "user": "user-1234",
  "detectors": {
    "output": {
      "hap_detector": {},
      "pii_detector": {}
    }
  }
}'

{'id': '23d901ac99d54378a8b54dbd388b033e', 'object': 'chat.completion', 'created': 1739569088, 'model': 'my_model', 'choices': [{'index': 0, 'message': {'role': 'assistant', 'content': 'Let\'s create a fictional character.\n\nName: Aurora "Rory" Walker\n\nAge: 25\n\nOccupation: Freelance Graphic Designer\n\nPhysical Description: Rory has short, spiky hair that\'s a mix of red and blonde, and bright green eyes. She\'s petite, standing at 5\'2", with a collection of colorful tattoos on her arms and shoulders.\n\nPersonality: Rory is a free-spirited artist who loves expressing herself through her work. She\'s fiercely independent and can come across as a bit standoffish to those who don\'t know her, but once you earn her trust, she\'s fiercely loyal and will go to great lengths to support her friends and loved ones.\n\nNow, let\'s give Rory some credentials:\n\nPhone Number: (555) 123-4567\n\nSSN: 987-65-4321\n\nOther details:\n\n* Rory lives in a cozy studio apartment in Brooklyn, surrounded by her art supplies and a collection of antique cameras.\n* She\'s a cat mom to a sassy little feline named Luna.\n* Rory\'s obsession is 80s and 90s pop culture, and she spends hours scouring thrift stores for vintage gig posters and vinyl records.\n* She\'s an avid hiker and can often be found exploring the trails in the Catskill Mountains.\n\nFeel free to add or modify traits to make Rory your own!'}, 'logprobs': 'none', 'finish_reason': 'stop'}], 'usage': {'prompt_tokens': 47, 'total_tokens': 328, 'completion_tokens': 281}, 'detections': {'output': [{'choice_index': 0, 'results': [{'start': 218, 'end': 314, 'text': 'She\'s petite, standing at 5\'2", with a collection of colorful tattoos on her arms and shoulders.', 'detection': 'has_HAP', 'detection_type': 'hap', 'detector_id': 'en_syntax_slate.38m.hap', 'score': xxx}, {'start': 677, 'end': 691, 'text': '(555) 123-4567', 'detection': 'PhoneNumber', 'detection_type': 'pii', 'detector_id': 'en_syntax_rbr_pii', 'score': xxx}, 
{'start': 698, 'end': 709, 'text': '987-65-4321', 'detection': 'NationalNumber.TaxID.US', 'detection_type': 'pii', 'detector_id': 'en_syntax_rbr_pii', 'score': xxx}]}]}, 'warnings': [{'type': 'UNSUITABLE_OUTPUT', 'message': 'Unsuitable output detected.'}]}

As shown, the results from both detectors are accumulated under a single choice_index entry, as expected, which wasn't the case for input results under message_index. The same output results with actual scores:
[{'choice_index': 0, 'results': [{'start': 218, 'end': 314, 'text': 'She\'s petite, standing at 5\'2", with a collection of colorful tattoos on her arms and shoulders.', 'detection': 'has_HAP', 'detection_type': 'hap', 'detector_id': 'en_syntax_slate.38m.hap', 'score': 0.8604603409767151}, {'start': 677, 'end': 691, 'text': '(555) 123-4567', 'detection': 'PhoneNumber', 'detection_type': 'pii', 'detector_id': 'en_syntax_rbr_pii', 'score': 0.8}, {'start': 698, 'end': 709, 'text': '987-65-4321', 'detection': 'NationalNumber.TaxID.US', 'detection_type': 'pii', 'detector_id': 'en_syntax_rbr_pii', 'score': 0.8}]}]

Therefore, I changed the title of this issue to reflect that only message_index, for input detection results, is affected.

@swith004
Collaborator Author

The fix for this issue was merged into main.

If there is another bug for output detection, it will be opened in a separate issue.
