METADATA GALLERY
AI-Powered Image Gallery with Vision Metadata Extraction
The Problem Statement
Image galleries traditionally store files as static assets with manually entered metadata. That manual step becomes a bottleneck as the collection grows, and it produces inconsistent, incomplete tagging. MetaData Gallery solves this by wiring OpenAI's GPT-4 Vision API directly into the upload pipeline, automatically extracting captions, semantic tags, detected objects, and dominant colour palettes from every uploaded image. It sits between a raw file store and a fully featured digital asset management system, with MongoDB GridFS handling binary storage at scale.
The Architecture Layout
The backend is a Node.js Express server written in TypeScript. Multer handles multipart uploads and streams the binary data into MongoDB GridFS through the MongoDB driver's GridFSBucket API, which Mongoose exposes. After storage, the server base64-encodes a thumbnail and sends it to the OpenAI GPT-4 Vision endpoint with a structured metadata extraction prompt. The response is normalised into a metadata document (caption, tags array, objects array, hex colour palette) and persisted alongside the GridFS file reference.

The React 18 TypeScript frontend uses Ant Design components and React Query for all server state. An upload adds the image to the gallery optimistically while its metadata is still being processed, and the gallery refreshes automatically when the query is invalidated.

Vitest covers frontend component rendering; Jest and Supertest cover backend route and service logic.
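The normalisation step described above could look roughly like the sketch below. It is illustrative only and not the project's actual code: the ImageMetadata shape, the field names (caption, tags, objects, palette), and the helper names are assumptions, as is the premise that the prompt asks the model to reply with a JSON object. The idea is to treat the model's reply as untrusted input and coerce it into a predictable document before persisting it.

```typescript
// Hypothetical shape of the persisted metadata document (field names are assumptions).
interface ImageMetadata {
  caption: string;
  tags: string[];
  objects: string[];
  palette: string[]; // six-digit hex colours, e.g. "#a1b2c3"
}

const HEX_COLOUR = /^#[0-9a-f]{6}$/;

// Keep only strings; trim, lower-case, drop empties, and de-duplicate.
function cleanList(value: unknown): string[] {
  if (!Array.isArray(value)) return [];
  return value
    .filter((v): v is string => typeof v === "string")
    .map((v) => v.trim().toLowerCase())
    .filter((v) => v.length > 0)
    .filter((v, i, all) => all.indexOf(v) === i);
}

// Coerce the raw text returned by the vision model into a metadata document.
// Anything missing or malformed falls back to an empty default.
function normaliseMetadata(rawResponse: string): ImageMetadata {
  let parsed: Record<string, unknown> = {};
  try {
    const value: unknown = JSON.parse(rawResponse);
    if (typeof value === "object" && value !== null && !Array.isArray(value)) {
      parsed = value as Record<string, unknown>;
    }
  } catch {
    // Reply was not valid JSON: keep the empty object so defaults apply.
  }

  const caption = parsed["caption"];
  return {
    caption: typeof caption === "string" ? caption.trim() : "",
    tags: cleanList(parsed["tags"]),
    objects: cleanList(parsed["objects"]),
    palette: cleanList(parsed["palette"]).filter((c) => HEX_COLOUR.test(c)),
  };
}
```

With this approach a malformed or partial reply still yields a valid document, so the upload record can be saved and the gallery can render the image even when some fields come back empty.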
Architecture Design Diagram
