Technical Spec
Fish City Visual Catalog – Catalog v1 Specification
1. Purpose
Catalog v1 defines the minimal integration between the Jon Sarkin Omeka-S catalog (https://catalog.jonsarkin.com) and the Qdrant vector database on hyphae. The goal is to enable GPU-backed visual similarity search over artworks using only:
- One image per Omeka item
- Title and identifier metadata
2. Scope
- Omeka is the canonical data store.
- Qdrant holds CLIP image embeddings plus minimal metadata.
- No inferred themes, OCR, or rich text metadata (reserved for Catalog v2+).
3. Components
- Omeka-S (Artwork resource template, class VisualArtwork).
- Qdrant – collection omeka_items.
- omeka-sync job – periodically syncs Omeka → Qdrant.
4. Omeka Requirements
Each item included in Catalog v1 MUST satisfy:
- Resource template: Artwork (Jon Sarkin), class VisualArtwork.
- Fields:
  - Title (string)
  - At least one image media marked as the primary representation.
5. Qdrant Collection Schema
Collection name: omeka_items
- Point ID: omeka_item_id (integer, from the Omeka API).
- Vectors:
  - visual_vec – CLIP image embedding, 512 dimensions, cosine distance.
- Payload fields (JSON object):
  - omeka_item_id (int)
  - title (string)
  - omeka_url (string; public item URL)
  - thumb_url (string; chosen thumbnail/derivative URL)
  - catalog_version (int; constant 1 in this spec)
Additional experimental fields (e.g. year, subjects, ocr_text, text_blob, dominant_color, curator_notes) MAY be present in payloads but are not part of the Catalog v1 contract.
Example payload
{
"omeka_item_id": 1234,
"title": "Untitled blue head",
"omeka_url": "https://catalog.jonsarkin.com/s/item1234",
"thumb_url": "https://catalog.jonsarkin.com/files/large/1234.jpg",
"catalog_version": 1
}
6. Sync Job Behaviour (omeka-sync)
- Fetch all Omeka items using the Artwork template.
- For each item:
  - Resolve omeka_item_id, Title, and the main image URL.
  - Skip items with no valid image.
  - Download the image and run it through the CLIP encoder → 512-dim visual_vec.
  - Upsert into Qdrant:
    - point_id = omeka_item_id
    - Set visual_vec and the payload as above.
- Optionally: mark deleted/hidden Omeka items as soft-deleted.
Design Decisions: Omeka–Qdrant Integration
This section records the current architectural decisions for the Fish City visual catalog integration between Omeka-S and Qdrant.
ADR-001: Point identity and versioning
- Decision: The Qdrant point ID for collection omeka_items is always the Omeka item ID.
  - Qdrant id = Omeka internal item ID (e.g. 1810).
  - Payload duplicates this as omeka_item_id (or item_id) for readability and future migrations.
- Decision: URLs stored in Qdrant are opaque and owned by Omeka.
  - omeka_url and thumb_url are treated purely as references; Qdrant does not host images.
  - Any changes to URL structure are handled by re-syncing from Omeka, not by editing Qdrant directly.
- Decision: catalog_version is monotone.
  - Version values: 1 for Catalog v1 (image-only), 2 for Catalog v2 (human metadata + text embeddings), higher integers for future versions.
  - Items may move from version 1 → 2 → 3, but are never downgraded to a lower version number.
Clients SHOULD rely only on the fields listed in the Catalog v1/v2 specifications; other payload keys are considered internal and may change.
These decisions ensure that Qdrant remains a derived, replaceable index whose records can always be regenerated from Omeka-S as the canonical source.
Fish City Visual Catalog – Catalog v2 Specification
1. Purpose
Catalog v2 extends Catalog v1 by incorporating a small, stable set of human-entered metadata fields from Omeka-S and adding CLIP text embeddings. The goal is to support semantic text queries (e.g., “blue abstract on cardboard”) and simple faceted filters, while keeping metadata requirements realistic for day-to-day cataloging.
2. Scope
- Omeka remains the canonical record.
- Qdrant stores:
  - Image embeddings (visual_vec)
  - Text embeddings (text_vec_clip)
  - A limited, normalized subset of Omeka metadata.
- No LLM-generated themes or OCR yet (reserved for Catalog v3).
3. Additional Omeka Requirements
For items included in Catalog v2, the following fields SHOULD be populated where known:
- Description (free text)
- Subject (keywords / concepts)
- Medium or artMedium
- dateCreated (at least year)
- height and width (with a consistent unit, e.g., cm)
4. Qdrant Collection Schema Changes
Same collection: omeka_items. Existing points from v1 are upgraded in place.
- Vectors:
  - visual_vec – unchanged.
  - text_vec_clip – CLIP text embedding, 512 dimensions, cosine distance.
- Payload additions:
  - year (int, derived from dateCreated where possible)
  - subjects (array of strings; Omeka Subject labels)
  - mediums (array of strings; from Medium / artMedium)
  - dimensions_cm (object, e.g. {"height": 30.5, "width": 22.9})
  - catalog_version (int; set to 2 once enriched)
Example payload (v2)
{
"omeka_item_id": 1234,
"inventory_id": "JS-2025-001",
"title": "Untitled blue head",
"omeka_url": "https://catalog.jonsarkin.com/s/item1234",
"thumb_url": "https://catalog.jonsarkin.com/files/large/1234.jpg",
"year": 1994,
"subjects": ["portrait", "blue", "abstract"],
"mediums": ["acrylic", "cardboard"],
"dimensions_cm": {"height": 30.5, "width": 22.9},
"catalog_version": 2
}
5. Text Embedding Construction
For each item, text_vec_clip is computed from a single concatenated string:
TEXT_INPUT =
Title + ". " +
Description + " " +
"Subjects: " + Subject-list + ". " +
"Medium: " + Medium-list + ". " +
"Year: " + Year
6. Sync / Upgrade Behaviour
- Run the Catalog v1 sync if needed (ensuring all items exist in Qdrant).
- For each eligible item:
  - Fetch the extended metadata fields from Omeka.
  - Normalize year and dimensions into the v2 payload format.
  - Build TEXT_INPUT as above and encode it via the CLIP text encoder.
  - Upsert into Qdrant:
    - Update the payload with the new fields and set catalog_version = 2.
    - Set text_vec_clip for that point.
7. Query Semantics
- Text query: encode the user text with the CLIP text encoder and search text_vec_clip (cosine) in omeka_items.
- Image query: unchanged from Catalog v1 (search visual_vec).
- Optional filters:
  - By year or year ranges
  - By mediums (exact-match string filters)
  - By subjects (exact-match string filters)
Catalog v2: CLIP Search API
Catalog v2 introduces a GPU-backed CLIP search sidecar (clip-api) that provides read-only semantic search over the Omeka-S catalog using the Qdrant omeka_items collection.
Base URL
All endpoints are served from the internal CLIP API service and reverse-proxied into the Omeka site:
BASE_URL = https://catalog.jonsarkin.com/clip-api
Authentication
The API is read-only and intended to be consumed by Omeka theme/plugins and internal tools. Public clients should call Omeka endpoints that wrap this API, not clip-api directly.
GET /healthz
Lightweight health check for the CLIP API and Qdrant.
GET ${BASE_URL}/healthz
Response 200 OK
{
"status": "ok",
"qdrant": "ok",
"model": "ViT-B-32 laion2b_s34b_b79k"
}
POST /search/text
Semantic text search over artworks. The query string is encoded with the CLIP text encoder and searched against the text_vec_clip vector in omeka_items, per the Catalog v2 query semantics.
POST ${BASE_URL}/search/text
Content-Type: application/json
Request body
{
"query": "blue abstract",
"limit": 20,
"filters": {
"year": { "gte": 1990, "lte": 2025 },
"subjects": ["cactus", "portrait"]
}
}
- query (string, required): free-form text.
- limit (int, optional; default 20, max 100).
- filters (object, optional): maps directly to Qdrant payload filters over fields such as year, subjects, collection, etc.
Response 200 OK
{
"results": [
{
"omeka_item_id": 71,
"score": 0.923,
"payload": {
"title": "Super artist 33",
"thumb_url": "https://catalog.jonsarkin.com/files/original/...",
"collection": "omeka",
"year": 2025
}
}
]
}
- score: similarity score in [0, 1]; higher = more similar.
- payload: a subset of the Qdrant payload, returned for convenience; Omeka remains the canonical source of metadata.
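One plausible translation of the filters object into a Qdrant filter body is sketched below. The actual clip-api mapping is not specified here, so treat the rules (dict value → range condition, list → match-any, scalar → exact match) as illustrative.

```python
def filters_to_qdrant(filters: dict) -> dict:
    """Translate the API 'filters' object into a Qdrant filter JSON body.

    {"gte": ..., "lte": ...} dicts become range conditions, lists become
    match-any conditions, and scalars become exact-match conditions.
    """
    must = []
    for key, value in filters.items():
        if isinstance(value, dict):
            must.append({"key": key, "range": value})
        elif isinstance(value, list):
            must.append({"key": key, "match": {"any": value}})
        else:
            must.append({"key": key, "match": {"value": value}})
    return {"must": must}
```

For the request body shown above, this yields a year range condition plus a match-any condition over subjects.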
POST /search/similar
“More like this” search using an existing artwork as the anchor. Uses Qdrant recommend on visual_vec.
POST ${BASE_URL}/search/similar
Content-Type: application/json
Request body
{
"omeka_item_id": 71,
"limit": 20,
"filters": {
"collection":