recipe: promptable content moderation (#249)
* recipe: promptable content moderation

* typo

* readme

* readme

* readme

* readme

* remove unused imports

* black format

* updated readme

* remove hf space logic

* black format
parsakhaz authored Feb 20, 2025
1 parent f102859 commit e64a7c5
Showing 10 changed files with 2,803 additions and 0 deletions.
52 changes: 52 additions & 0 deletions recipes/promptable-content-moderation/.gitignore
@@ -0,0 +1,52 @@
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
*.dll

# Virtual Environment
venv/
env/
ENV/
.venv/

# IDE
.idea/
.vscode/
*.swp
*.swo

# Project specific
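# Ignore directory contents but keep the directories themselves via .gitkeep.
# (A bare `inputs/` or `outputs/` rule would exclude the whole directory and
# defeat the .gitkeep exceptions, so none is listed here.)
inputs/*
outputs/*
!inputs/.gitkeep
!outputs/.gitkeep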

# Model files
*.pth
*.onnx
*.pt

# Logs
*.log

certificate.pem
159 changes: 159 additions & 0 deletions recipes/promptable-content-moderation/README.md
@@ -0,0 +1,159 @@
# Promptable Content Moderation with Moondream

This recipe uses Moondream 2B, a lightweight vision-language model, to detect and moderate video content from natural language prompts.

[Try it now.](https://huggingface.co/spaces/moondream/content-moderation)

## Features

- Content moderation through natural language prompts
- Multiple visualization styles
- Intelligent scene detection and tracking:
  - DeepSORT tracking with scene-aware reset
  - Persistent moderation across frames
  - Smart tracker reset at scene boundaries
- Optional grid-based detection for improved accuracy on complex scenes
- Frame-by-frame processing with IoU-based merging
- Web-compatible output format
- Test mode (processes only the first X seconds)
- Advanced moderation analysis with multiple visualization plots

## Examples

| Prompt | Output |
|--------|-----------------|
| "white cigarette" | ![Demo](https://github.com/parsakhaz/promptable-content-moderation/raw/main/examples/clip-cig.gif) |
| "gun" | ![Demo](https://github.com/parsakhaz/promptable-content-moderation/raw/main/examples/clip-gu.gif) |
| "confederate flag" | ![Demo](https://github.com/parsakhaz/promptable-content-moderation/raw/main/examples/clip-conflag.gif) |

## Requirements

### Python Dependencies

For Windows users, before installing other requirements, first install PyTorch with CUDA support:

```bash
pip install torch==2.5.1+cu121 torchvision==0.20.1+cu121 --index-url https://download.pytorch.org/whl/cu121
```

Then install the remaining dependencies:

```bash
pip install -r requirements.txt
```

### System Requirements

- FFmpeg (required for video processing)
- libvips (required for image processing)

Installation by platform:

- Ubuntu/Debian: `sudo apt-get install ffmpeg libvips`
- macOS: `brew install ffmpeg libvips`
- Windows:
  - Download FFmpeg from [ffmpeg.org](https://ffmpeg.org/download.html)
  - Follow the [libvips Windows installation guide](https://docs.moondream.ai/quick-start)

## Installation

1. Clone this repository and create a new virtual environment:

```bash
git clone https://github.com/vikhyat/moondream.git
cd moondream/recipes/promptable-content-moderation
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
```

2. Install Python dependencies:

```bash
pip install -r requirements.txt
```

3. Install ffmpeg and libvips:
   - On Ubuntu/Debian: `sudo apt-get install ffmpeg libvips`
   - On macOS: `brew install ffmpeg libvips`
   - On Windows: Download FFmpeg from [ffmpeg.org](https://ffmpeg.org/download.html)

> Installing libvips on Windows requires some additional steps; see the [Moondream quick-start guide](https://docs.moondream.ai/quick-start).

## Usage

The easiest way to use this tool is through its web interface, which provides a user-friendly experience for video content moderation.

### Web Interface

1. Start the web interface:

```bash
python app.py
```

2. Open the provided URL in your browser (typically <http://localhost:7860>)

3. Use the interface to:
   - Upload your video file
   - Specify content to moderate (e.g., "face", "cigarette", "gun")
   - Choose redaction style (default: obfuscated-pixel)
   - OPTIONAL: Configure advanced settings
     - Processing speed/quality
     - Grid size for detection
     - Test mode for quick validation (default: on, 3 seconds)
   - Process the video and download results
   - Analyze detection patterns with visualization tools
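
Test mode limits processing to the first few seconds of a video. The following is a minimal sketch of how such a cap could translate into a frame count, assuming the frame rate and total frame count are known. The function name `frames_to_process` and its parameters are hypothetical and are not taken from `app.py`.

```python
from typing import Optional


def frames_to_process(total_frames: int, fps: float,
                      test_seconds: Optional[float] = 3.0) -> int:
    """Return how many frames to process; test_seconds=None means the full video."""
    if test_seconds is None:
        return total_frames
    # Cap at the clip length so videos shorter than the limit are processed in full.
    return min(total_frames, int(round(fps * test_seconds)))
```

With the default 3-second limit, a 30 fps video would be cut to its first 90 frames, while a clip shorter than 3 seconds would be processed in full.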

## Output Files

The tool generates two types of output files in the `outputs` directory:

1. Processed Videos:
   - Format: `[style]_[content_type]_[original_filename].mp4`
   - Example: `censor_inappropriate_video.mp4`

2. Detection Data:
   - Format: `[style]_[content_type]_[original_filename]_detections.json`
   - Contains frame-by-frame detection information
   - Used for visualization and analysis
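
The naming scheme above can be sketched as a small helper. This is an illustration under stated assumptions, not the recipe's own code: `output_paths` is a hypothetical function, `[original_filename]` is taken to mean the input name without its extension (as the example suggests), and replacing spaces in the prompt with underscores is an assumed sanitising step.

```python
from pathlib import Path
from typing import Tuple


def output_paths(style: str, content_type: str, input_video: str,
                 output_dir: str = "outputs") -> Tuple[Path, Path]:
    """Build the (video, detections) output paths for one processed input."""
    # Assumption: spaces in a prompt such as "white cigarette" become underscores.
    safe_type = content_type.strip().replace(" ", "_")
    stem = f"{style}_{safe_type}_{Path(input_video).stem}"
    out = Path(output_dir)
    return out / f"{stem}.mp4", out / f"{stem}_detections.json"


video_path, detections_path = output_paths("censor", "inappropriate", "inputs/video.mp4")
print(video_path.name)       # censor_inappropriate_video.mp4
print(detections_path.name)  # censor_inappropriate_video_detections.json
```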

## Technical Details

### Scene Detection and Tracking

The tool combines scene detection with object tracking:

1. Scene Detection:
   - Powered by PySceneDetect's ContentDetector
   - Automatically identifies scene changes in videos
   - Configurable detection threshold (default: 30.0)
   - Helps maintain tracking accuracy across scene boundaries

2. Object Tracking:
   - DeepSORT tracking for consistent object identification
   - Automatic tracker reset at scene changes
   - Maintains object identity within scenes
   - Prevents tracking errors across scene boundaries

3. Integration Benefits:
   - More accurate object tracking
   - Better handling of scene transitions
   - Reduced false positives in tracking
   - Improved tracking consistency
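
The reset-at-scene-boundary idea can be illustrated without either library. The sketch below is a toy, not the recipe's implementation: `ToyTracker` is a hypothetical stand-in for a DeepSORT tracker, and `scene_starts` is assumed to be a set of frame indices at which new scenes begin (in practice such boundaries could come from a scene detector such as PySceneDetect's `ContentDetector`).

```python
from typing import Dict, Iterable, List, Set


class ToyTracker:
    """Stand-in for a real tracker: assigns one ID per distinct label."""

    def __init__(self) -> None:
        self._ids: Dict[str, int] = {}

    def update(self, labels: Iterable[str]) -> List[int]:
        # Reuse the ID of a label seen before, otherwise allocate the next one.
        return [self._ids.setdefault(label, len(self._ids) + 1) for label in labels]


def track_with_scene_reset(frames: List[List[str]],
                           scene_starts: Set[int]) -> List[List[int]]:
    """Run the tracker over all frames, replacing it at each scene start."""
    tracker = ToyTracker()
    results: List[List[int]] = []
    for index, labels in enumerate(frames):
        if index > 0 and index in scene_starts:
            # New scene: discard old tracks so identities do not leak across the cut.
            tracker = ToyTracker()
        results.append(tracker.update(labels))
    return results
```

Creating a fresh tracker at each boundary is the simplest way to guarantee that no state from the previous scene survives. A real tracker may instead expose its own way to clear tracks.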

## Best Practices

- Use test mode for initial configuration
- Enable grid-based detection for complex scenes
- Choose appropriate redaction style based on content type:
  - Censor: Complete content blocking
  - Blur styles: Less intrusive moderation
  - Bounding Box: Content review and analysis
- Monitor system resources during processing
- Use appropriate processing quality settings based on your needs

## Notes

- Processing time depends on video length, resolution, GPU availability, and chosen settings
- A GPU is strongly recommended for faster processing
- Grid-based detection increases accuracy but requires more processing time (each grid cell is processed independently)
- Test mode processes only the first X seconds (default: 3 seconds) for quick validation
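
Because each grid cell is processed independently, cost grows with the number of cells, and detections of the same object may overlap and need merging by IoU. The following is a minimal sketch of both steps, assuming boxes are `(x1, y1, x2, y2)` tuples in pixel coordinates. The helpers `grid_cells`, `iou`, and `merge_boxes` are hypothetical and the 0.5 threshold is illustrative; none of this is taken from the recipe's source.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def grid_cells(width: int, height: int, rows: int, cols: int) -> List[Box]:
    """Split a frame into rows x cols cells; each cell would be detected separately."""
    cell_w, cell_h = width / cols, height / rows
    return [
        (c * cell_w, r * cell_h, (c + 1) * cell_w, (r + 1) * cell_h)
        for r in range(rows)
        for c in range(cols)
    ]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes (0.0 when they do not overlap)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def merge_boxes(boxes: List[Box], threshold: float = 0.5) -> List[Box]:
    """Greedily merge each box into the first kept box it overlaps above the threshold."""
    merged: List[Box] = []
    for box in boxes:
        for i, kept in enumerate(merged):
            if iou(box, kept) > threshold:
                # Replace the kept box with the bounding union of the two.
                merged[i] = (min(box[0], kept[0]), min(box[1], kept[1]),
                             max(box[2], kept[2]), max(box[3], kept[3]))
                break
        else:
            merged.append(box)
    return merged
```

A 2x2 grid yields four cells and therefore four detector calls per frame, which is where the extra processing time comes from.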