fluid/stable/: fluid-vision-1.0.3 metadata and description

Contains utilities for computer vision tasks

author	Ameya Kirtane
author_email	ameya.kirtane@fluidanalytics.ai
classifiers
	Programming Language :: Python :: 3
	Programming Language :: Python :: 3.10
	Programming Language :: Python :: 3.11
	Programming Language :: Python :: 3.12
	Programming Language :: Python :: 3.13
description_content_type	text/markdown
metadata_version	2.4
requires_dist
	boto3 (>=1.42.36,<2.0.0)
	google-genai (>=1.60.0,<2.0.0)
	imagehash (>=4.3.2,<5.0.0)
	opencv-python (>=4.13.0.90,<5.0.0.0)
	pillow (>=12.1.0,<13.0.0)
	pydantic (>=2.12.5,<3.0.0)
	python-dotenv (>=1.2.1,<2.0.0)
	torch (>=2.10.0,<3.0.0) ; extra == "embeddings"
	torchvision (>=0.25.0,<0.26.0) ; extra == "embeddings"
	tqdm (>=4.67.1,<5.0.0)
requires_python	>=3.10, <3.14


File	Size	Type	Tox results	History
fluid_vision-1.0.3-py3-none-any.whl	9 KB	Python Wheel (Python 3)		Uploaded to fluid/stable by fluid 2026-02-09 08:54:42
fluid_vision-1.0.3.tar.gz	7 KB	Source		Uploaded to fluid/stable by fluid 2026-02-09 08:54:42

Fluid Vision

Directory Structure

fluid_vision/
├── .coverage
├── poetry.lock
├── pyproject.toml
├── usage.txt
├── notebooks/
├── src/
│   └── fluid_vision/
│       ├── __init__.py
│       ├── data/
│       │   ├── __init__.py
│       │   └── frame_extractor/
│       │       ├── __init__.py
│       │       └── frame_extractor.py
│       ├── embeddings/
│       │   └── frame_embedder.py
│       ├── encoder/
│       │   └── encoder.py
│       ├── models/
│       │   ├── __init__.py
│       │   ├── embedding_models_enum.py
│       │   └── model_loader/
│       │       └── model_loader.py
│       └── utils/
│           ├── cropping.py
│           └── load_files.py
├── tests/
│   ├── __init__.py
│   ├── conftest.py
│   ├── test_cropping.py
│   ├── test_dino_embedding.py
│   ├── test_embedding_large.py
│   ├── test_embeddings_base.py
│   ├── test_encoding.py
│   ├── test_frame_extraction_phash.py
│   └── test_frame_fps.py
└── .pytest_cache/
    ├── .gitignore
    ├── CACHEDIR.TAG
    ├── README.md
    └── v/
        └── cache/
            └── nodeids

Module Usage

1. Frame Extraction

Extract frames at a target FPS:

from fluid_vision.data.frame_extractor.frame_extractor import FrameExtractor

extractor = FrameExtractor(target_fps=5)
result = extractor.extract_frames_fps("path/to/video.mp4")
frames_dict = result["frames_dict"]  # {frame_number: PIL.Image}
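
To illustrate what sampling at a target FPS involves, here is a minimal sketch of choosing which source frame numbers to keep. It is not taken from `FrameExtractor`: the helper name `select_frame_indices`, its parameters, and its truncation rule are assumptions made for illustration only.

```python
def select_frame_indices(total_frames: int, source_fps: float, target_fps: float) -> list[int]:
    """Return the source frame numbers to keep when downsampling to target_fps.

    Hypothetical helper: not part of fluid_vision.
    """
    if source_fps <= 0 or target_fps <= 0:
        raise ValueError("fps values must be positive")
    # Never sample faster than the source provides.
    step = max(source_fps / target_fps, 1.0)
    indices = []
    i = 0
    # Multiply instead of accumulating to avoid floating-point drift.
    while int(i * step) < total_frames:
        indices.append(int(i * step))
        i += 1
    return indices


# A 30 fps clip with 60 frames sampled at 5 fps keeps every 6th frame.
kept = select_frame_indices(total_frames=60, source_fps=30, target_fps=5)
print(kept)  # [0, 6, 12, 18, 24, 30, 36, 42, 48, 54]
```

Clamping the step to at least 1 means a target FPS above the source rate simply keeps every frame instead of repeating any.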

Extract frames by perceptual hash (skip duplicates):

extractor = FrameExtractor(frame_skip=10)
result = extractor.extract_frames_by_hash("path/to/video.mp4")
frames_dict = result["frames_dict"]
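
Hash-based duplicate skipping compares compact fingerprints of frames rather than raw pixels. The sketch below illustrates the idea with a simple average hash over small grayscale grids in plain Python. The package depends on `imagehash`, and its actual hashing method and threshold are not shown in this README, so `average_hash`, `hamming_distance`, `drop_near_duplicates`, and `max_distance` are hypothetical names used only for this example.

```python
def average_hash(gray: list[list[int]]) -> int:
    """Pack one bit per pixel: 1 if the pixel is brighter than the mean."""
    pixels = [value for row in gray for value in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for value in pixels:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two hashes differ."""
    return bin(a ^ b).count("1")


def drop_near_duplicates(
    frames: dict[int, list[list[int]]], max_distance: int = 0
) -> dict[int, list[list[int]]]:
    """Keep a frame only if its hash differs enough from the last kept frame."""
    kept = {}
    last_hash = None
    for number in sorted(frames):
        current = average_hash(frames[number])
        if last_hash is None or hamming_distance(current, last_hash) > max_distance:
            kept[number] = frames[number]
            last_hash = current
    return kept


# Tiny 2x2 "frames": frames 0 and 10 are identical, frame 20 is inverted.
frames = {
    0: [[10, 200], [10, 200]],
    10: [[10, 200], [10, 200]],
    20: [[200, 10], [200, 10]],
}
print(list(drop_near_duplicates(frames)))  # [0, 20]
```

Raising `max_distance` in this sketch would also drop frames that are similar but not identical.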

Extract frames with cropping:

extractor = FrameExtractor(target_fps=5, cropping={"quadrant": 2}, skip_similar=0.5)
result = extractor.extract_frames_fps("path/to/video.mp4")
cropped_frames = result["frames_dict"]

# Optionally filter the extracted frames with an average hash.
# By default, all duplicate frames are skipped.
filtered_frames_dict = extractor.filter_frames_by_hash(cropped_frames)["filtered_frames_dict"]
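
The `cropping={"quadrant": 2}` option above selects one quarter of each frame. How the library numbers its quadrants is not stated in this README, so the sketch below assumes reading order (1 = top-left, 2 = top-right, 3 = bottom-left, 4 = bottom-right). The helper `quadrant_box` is hypothetical and is not part of `fluid_vision`. It returns a `(left, upper, right, lower)` box, which is the format Pillow's `Image.crop` accepts.

```python
def quadrant_box(width: int, height: int, quadrant: int) -> tuple[int, int, int, int]:
    """Return (left, upper, right, lower) for one quadrant of an image.

    Assumed numbering (not confirmed by the library):
    1 = top-left, 2 = top-right, 3 = bottom-left, 4 = bottom-right.
    """
    if quadrant not in (1, 2, 3, 4):
        raise ValueError("quadrant must be 1, 2, 3, or 4")
    half_w, half_h = width // 2, height // 2
    left = 0 if quadrant in (1, 3) else half_w
    upper = 0 if quadrant in (1, 2) else half_h
    right = half_w if quadrant in (1, 3) else width
    lower = half_h if quadrant in (1, 2) else height
    return (left, upper, right, lower)


print(quadrant_box(1920, 1080, 2))  # (960, 0, 1920, 540)
```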

2. Loading Images and Pickle Files

Load a dictionary from a pickle file:

from fluid_vision.utils.load_files import load_dict_from_pickle

data = load_dict_from_pickle("path/to/file.pkl")
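
For readers unfamiliar with pickle files, the following is a minimal standard-library sketch of loading a dictionary and checking its type. It does not reproduce the implementation of `load_dict_from_pickle`; the name `load_dict_checked` and the type check are assumptions for illustration. Only unpickle files from sources you trust, since unpickling can execute arbitrary code.

```python
import pickle
import tempfile
from pathlib import Path


def load_dict_checked(path: str) -> dict:
    """Unpickle a file and verify that the result is a dictionary."""
    with open(path, "rb") as handle:
        data = pickle.load(handle)
    if not isinstance(data, dict):
        raise TypeError(f"expected a dict, got {type(data).__name__}")
    return data


# Round trip through a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    pkl_path = Path(tmp) / "frames.pkl"
    pkl_path.write_bytes(pickle.dumps({"frame_count": 3}))
    loaded = load_dict_checked(str(pkl_path))

print(loaded)  # {'frame_count': 3}
```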

Load all images from a folder:

from fluid_vision.utils.load_files import load_image_list

images = load_image_list("path/to/folder")
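
This README does not say which file types `load_image_list` accepts or in what order it returns them. As a rough sketch of the folder-scanning step only, the example below collects image paths by extension and sorts them for a stable order. `IMAGE_SUFFIXES` and `list_image_paths` are hypothetical names; each resulting path could then be opened with Pillow's `Image.open`.

```python
import tempfile
from pathlib import Path

# Assumed set of extensions; the library's own list may differ.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def list_image_paths(folder: str) -> list[Path]:
    """Return image files in a folder, sorted by path for a stable order."""
    return sorted(
        path
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    )


# Demonstrate on a temporary folder with two images and one text file.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("b.PNG", "a.jpg", "notes.txt"):
        (Path(tmp) / name).touch()
    names = [path.name for path in list_image_paths(tmp)]

print(names)  # ['a.jpg', 'b.PNG']
```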

3. Embedding Models

Load a model and processor:

from fluid_vision.models.model_loader.model_loader import ModelLoader

model_loader = ModelLoader(model_name="openai/clip-vit-base-patch16")
model, processor = model_loader.load_model()

4. Frame Embedding

Get embedding for a single frame:

from fluid_vision.embeddings.frame_embedder import FrameEmbedder

embedder = FrameEmbedder(model_name="openai/clip-vit-base-patch16", model=model, processor=processor, device="cpu")
embedding = embedder.get_frame_embedding(frame)  # frame is a PIL.Image

Get embeddings for a batch of frames:

result = embedder.get_batch_embeddings([frame1, frame2, ...])
embeddings = result["embeddings"]  # numpy.ndarray
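
A common way to compare two frame embeddings is cosine similarity. The sketch below is plain Python and independent of `FrameEmbedder`; the function name `cosine_similarity` is an assumption, and in practice the same computation would usually be done on the NumPy array directly.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Parallel vectors score close to 1, orthogonal vectors score 0.
same = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
print(round(same, 6), round(orthogonal, 6))  # 1.0 0.0
```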

5. Encoding Embeddings as Images

Create an encoding matrix image from embeddings:

from fluid_vision.encoder.encoder import EmbeddingEncoder

encoder = EmbeddingEncoder()
result = encoder.create_encoding_matrix(embeddings)  # embeddings: np.ndarray
encoding_image = result["encoding_image"]  # numpy.ndarray (image)
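
How `EmbeddingEncoder` maps embedding values to pixels is not described here. One simple possibility, shown purely as an assumption, is min-max scaling of the embedding matrix to 8-bit grayscale values. The helper `to_grayscale_matrix` is hypothetical and works on nested lists so that it runs without NumPy.

```python
def to_grayscale_matrix(embeddings: list[list[float]]) -> list[list[int]]:
    """Min-max scale a 2-D embedding matrix to integer pixel values in 0..255."""
    values = [value for row in embeddings for value in row]
    low, high = min(values), max(values)
    span = high - low
    if span == 0:
        # A constant matrix carries no contrast; map every cell to 0.
        return [[0 for _ in row] for row in embeddings]
    return [[round((value - low) / span * 255) for value in row] for row in embeddings]


pixels = to_grayscale_matrix([[0.0, 0.2], [0.4, 1.0]])
print(pixels)  # [[0, 51], [102, 255]]
```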

Build and Install with Poetry

Install Poetry (if not already installed):
```
pip install poetry
```
Install dependencies:
```
poetry install
```
Build the package:
```
poetry build
```
Run tests:
```
poetry run pytest
```
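Activate the virtual environment (on Poetry 2.x, `poetry shell` requires the `poetry-plugin-shell` plugin; `poetry env activate` prints the activation command instead):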
```
poetry shell
```

Remote Installation via devpi server:

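Add the following settings to your pip configuration to enable installation from the devpi server. They are shown in the form printed by `pip config list`; each one can be set with `pip config set <name> <value>`: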

global.extra-index-url='https://pypi.org/simple'
global.index-url='https://devpi.fluidanalytics.ai/fluid_vision/dev/+simple/'
global.trusted-host='devpi.fluidanalytics.ai'