Azure AI Content Safety
How Does Azure AI Content Safety Work?
Azure AI Content Safety is designed to work with text, images, and AI-generated content. It can identify and moderate inappropriate material. The visual capabilities of Content Safety are driven by Microsoft's Florence foundation model, which has been trained on billions of text-image pairs.
Text analysis employs natural language processing methods to better understand subtlety and context. Azure AI Content Safety supports multiple languages and can recognize harmful content in both short and long formats. It is currently available in English, German, Spanish, French, Portuguese, Italian, and Chinese.
Azure AI Content Safety features include:
Safeguarding Text Content
- Moderate text scans text across four categories: violence, hate speech, sexual content, and self-harm. A severity level from 0 to 6 is returned for each category, which helps prioritize what needs immediate human attention, and how urgently. You can also create a blocklist to scan for terms specific to your situation.
- Prompt shields is a unified API that identifies and blocks jailbreak attacks on inputs to LLMs, covering both user prompts and documents. These attacks are prompts that attempt to bypass the model's built-in safety features. User prompts are tested to ensure the input to the LLM is safe, and documents are tested to ensure they don't contain unsafe instructions embedded in the text.
- Protected material detection checks AI-generated text for protected material such as recipes, copyrighted song lyrics, or other original content.
- Groundedness detection protects against inaccurate responses in AI-generated text from LLMs. Public LLMs use data available at the time they were trained, but new data can be introduced after the original training, or the model may be built on private data. A grounded response is one where the model's output is based on the source information; an ungrounded response is one where the output deviates from the source information. Groundedness detection includes a reasoning option in the API response, which adds a reasoning field that explains any detected ungroundedness. However, reasoning increases processing time and cost.
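To make the severity levels concrete, here is a minimal Python sketch of how the per-category scores returned by Moderate text might be turned into actions. The `triage` helper and its thresholds are illustrative choices, not part of the service:

```python
def triage(category_severities, review_at=2, block_at=4):
    """Map per-category severity scores (0-6) to an action.

    The thresholds are illustrative, not service defaults: scores at or
    above `block_at` are blocked outright, scores at or above `review_at`
    are routed to a human reviewer, and the rest are allowed.
    """
    actions = {}
    for category, severity in category_severities.items():
        if severity >= block_at:
            actions[category] = "block"
        elif severity >= review_at:
            actions[category] = "review"
        else:
            actions[category] = "allow"
    return actions

# Example: hypothetical scores for a single piece of text
scores = {"Hate": 0, "Violence": 5, "Sexual": 1, "SelfHarm": 2}
print(triage(scores))
```

In practice you would tune the two thresholds per category to match your own tolerance for false positives versus missed content.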
Safeguarding Image Content
- Moderate images scans for inappropriate content across four categories: violence, self-harm, sexual, and hate. A severity level is returned: safe, low, or high. You then set a threshold level of low, medium, or high. The combination of the severity and threshold level determines whether the image is allowed or blocked for each category.
- Moderate multimodal content scans both images and text, including text extracted from an image using Optical Character Recognition (OCR). Content is analyzed across four categories: violence, hate speech, sexual content, and self-harm.
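The severity/threshold combination for image moderation can be sketched as a small decision function. The numeric ordering of the levels and the `image_decision` helper are illustrative assumptions, not the service's implementation:

```python
# Illustrative ordering of the levels; the service itself reports
# image severity as safe, low, or high.
LEVEL = {"safe": 0, "low": 1, "medium": 2, "high": 3}

def image_decision(severity: str, threshold: str) -> str:
    """Block a category when its returned severity meets or exceeds
    the threshold configured for that category; otherwise allow it."""
    return "block" if LEVEL[severity] >= LEVEL[threshold] else "allow"

print(image_decision("low", "medium"))  # a low-severity hit under a medium threshold passes
print(image_decision("high", "low"))    # a high-severity hit under a low threshold is blocked
```

A lower threshold therefore makes moderation stricter: with a threshold of low, any non-safe severity is blocked.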
Custom Safety Solutions
- Custom categories enables you to create your own categories by providing positive and negative examples and training the model. Content can then be scanned according to your own category definitions.
- Safety system message helps you write effective prompts to guide an AI system's behavior.
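As an illustration, a safety system message might look like the following. The wording is a hypothetical example, not Microsoft's template:

```
You are a helpful assistant. Do not generate content that could cause
physical or emotional harm. If a request asks for hateful, violent,
sexual, or self-harm content, politely decline and explain that you
cannot help with that request.
```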
Limitations
Because Azure AI Content Safety relies on AI algorithms and machine learning to identify problematic language, it may not consistently catch inappropriate language, and it may occasionally block acceptable language.
It is essential to test and assess Azure AI Content Safety with real data before deployment. After deployment, keep monitoring the system to evaluate how accurately it performs.
Conclusion
In this article, we explored the features and limitations of Azure AI Content Safety.