Awesome Computer Vision Prompting

Description

This repo explores how to apply prompt engineering to computer vision tasks.

Use state-of-the-art models, such as diffusion models or other baseline models, to generate, inpaint, and paint images.

streamlit run basic_app.py --server.port 5555 --server.enableCORS false

User interaction with Segment Anything Model

streamlit run sam_app.py --server.port 5555 --server.enableCORS false

Inpaint via user-interaction with Diffusion and SAM

streamlit run sam_inpaint_app.py --server.port 5555 --server.enableCORS false

Image assets/van.jpg

Prompts:
  1. [Van] A Volkswagen California van, parked on a beach, with a surfboard on the roof.
  2. [Ground] Big grassland, a lot grass, green or grey grass.
  3. [Between sky and ground] Endless grassland.
  4. [Sky] Clear night sky with stars and full moon.
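
The region-tagged prompts above can also be kept as structured data rather than free text, which allows a few basic checks before anything is sent to a model. The following is a minimal sketch and not code from this repo: `RegionPrompt`, `validate`, and the `max_chars` limit are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegionPrompt:
    """One inpainting prompt tied to a user-selected region (hypothetical helper)."""
    region: str  # label of the masked region, e.g. "Van"
    text: str    # prompt text intended for the diffusion model


def validate(prompts, max_chars=300):
    """Return a list of problems found; an empty list means nothing was flagged.

    max_chars is an arbitrary illustrative limit, not a model requirement.
    """
    problems = []
    seen = set()
    for p in prompts:
        if not p.region.strip():
            problems.append("empty region label")
        if not p.text.strip():
            problems.append(f"[{p.region}] empty prompt text")
        if len(p.text) > max_chars:
            problems.append(f"[{p.region}] prompt longer than {max_chars} characters")
        if p.region in seen:
            problems.append(f"[{p.region}] duplicate region")
        seen.add(p.region)
    return problems


# The four prompts listed above for assets/van.jpg, copied verbatim.
VAN_PROMPTS = [
    RegionPrompt("Van", "A Volkswagen California van, parked on a beach, with a surfboard on the roof."),
    RegionPrompt("Ground", "Big grassland, a lot grass, green or grey grass."),
    RegionPrompt("Between sky and ground", "Endless grassland."),
    RegionPrompt("Sky", "Clear night sky with stars and full moon."),
]

if __name__ == "__main__":
    print(validate(VAN_PROMPTS))
```

Checks like these only catch structural mistakes (missing text, duplicated regions); they say nothing about whether the model will interpret a prompt as intended.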

Few-shot object tracking in video via SAM

Use SAM to get a mask of the object, then use that mask to track the object through all frames.

streamlit run sam_tracker.py --server.port 5556 --server.enableCORS false

Video src: https://dl.dropbox.com/s/0lalmh95tylyw4s/sculpture.mp4
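
The idea of carrying a first-frame mask through later frames can be illustrated with a toy example. This is a sketch under stated assumptions, not the method used in `sam_tracker.py`, which may work quite differently: the mask is assumed to come from SAM on the first frame (SAM is not called here), frames are plain 2D lists of grayscale values, and the swapped-in technique is simple template matching by sum of absolute differences. All function names are hypothetical.

```python
def mask_to_bbox(mask):
    """Return (top, left, height, width) of the nonzero area of a 2D mask."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    if not rows:
        raise ValueError("mask is empty")
    top, left = rows[0], cols[0]
    return top, left, rows[-1] - top + 1, cols[-1] - left + 1


def crop(frame, top, left, h, w):
    """Cut an h-by-w patch out of a frame."""
    return [row[left:left + w] for row in frame[top:top + h]]


def sad(a, b):
    """Sum of absolute differences between two equally sized patches."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def track(frames, mask, search=2):
    """Follow the masked object; returns one (top, left) box corner per frame."""
    top, left, h, w = mask_to_bbox(mask)
    template = crop(frames[0], top, left, h, w)  # appearance taken from frame 0
    height, width = len(frames[0]), len(frames[0][0])
    positions = [(top, left)]
    for frame in frames[1:]:
        best = None
        # Search a small window around the previous position.
        for t in range(max(0, top - search), min(height - h, top + search) + 1):
            for l in range(max(0, left - search), min(width - w, left + search) + 1):
                score = sad(template, crop(frame, t, l, h, w))
                if best is None or score < best[0]:
                    best = (score, t, l)
        _, top, left = best
        positions.append((top, left))
    return positions


def make_frame(top, left, size=6):
    """Synthetic frame: a bright 2x2 square on a black background."""
    frame = [[0] * size for _ in range(size)]
    for r in range(top, top + 2):
        for c in range(left, left + 2):
            frame[r][c] = 255
    return frame


frames = [make_frame(2, 1), make_frame(2, 2), make_frame(3, 3)]
mask = [[1 if v else 0 for v in row] for row in frames[0]]  # stands in for a SAM mask
print(track(frames, mask))  # [(2, 1), (2, 2), (3, 3)]
```

A fixed template like this breaks down when the object changes appearance or moves farther than the search window between frames, so it is only meant to show how a single mask can seed tracking.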

Working comments

I personally think that prompting is a new programming approach. Don’t assume that guiding models with natural language is easy. On the contrary, I believe it’s quite the opposite. Natural language programming lacks the syntax of traditional programming languages, which means there are no type checks or any protective mechanisms in place. If the model (AI) receives an inappropriate prompt, the generated results can be completely different from what was expected.

Here is a prompt I have used with a diffusion model for a computer vision task. Although the results brought some surprises, this is not actually my ultimate goal.

AI new trend, prompt engineering

Mouse interactive prompt engineering

Install via Docker

# setup
docker build --no-cache --tag cv-prompt-engineering -f Dockerfile .

# run
docker run --gpus all -v /home/ubuntu/work/cv-prompt-engineering/:/workspace/ -p 5555:5555 --rm -it --shm-size=55gb -d cv-prompt-engineering tail -f /dev/null

Run

streamlit run basic_app.py --server.port 5555 --server.enableCORS false