Clarity TagFlow
The intelligent, local-first image tagging and AI dataset curation powerhouse — now rewritten in Rust.
Download the newest build, 5.3.2, on GitHub.
New: LoRA thumbnails are now automatically downloaded and filtered!
Coming soon: the downloader will be updated to support Anima tags.
Clarity TagFlow is a modern desktop application designed to streamline the process of tagging images for Machine Learning datasets, Stable Diffusion training, and digital asset management. Built with a strict focus on privacy, speed, and optimization, it runs state-of-the-art AI models entirely on your local machine.
🚀 Project Status: The Rust rewrite is here! The Grand Blueprint became reality — Clarity TagFlow is now a fully native application. No JVM, no 2 GB heap, no garbage collection pauses. Just one small, blazing-fast binary built from ~22,000 lines of pure Rust.
💬 We Need Your Feedback! Do you miss an old feature that hasn't been ported yet? Is there something you don't like, or an improvement you're dying to see? Leave a comment and let us know!
🛠️ System Requirements
OS: Windows (Inno Setup installer), macOS (.dmg, Apple Silicon), Linux (.tar.gz)
Runtime: None! Native binary — no Java, no JVM, nothing to install first
Install size: A few tens of MB (down from 300–400 MB)
Startup: Near-instant, with a skippable animated splash screen
Optional: VLC for video playback (the app runs fine without it and will politely offer an install link)
🌟 Key Features
🤖 Local AI Powerhouse
Privacy First: No images or data are ever uploaded to the cloud. All processing happens 100% offline.
Multi-Model Support: Seamlessly switch between JoyTag, PixAI v0.9, and the WD14 v3 family (ConvNext, SwinV2, Eva02) — now with a built-in Model Manager that downloads models with progress bars and auto-discovers ones you already have.
Smart Thresholding: Fine-tune confidence thresholds to control exactly how strict the AI is when applying tags.
Buttery-Smooth Inference: ONNX Runtime ships with the app and runs at Level-3 graph optimization on a background thread — the UI never stutters while tagging.
🎨 AI Creative Suite (NEW)
In-App Image Generation: The app installs and manages its own ComfyUI backend, downloads GGUF-quantized Flux.1 and Z-Image Turbo models, and gives you full prompt/steps/guidance/seed controls with live logs — zero manual setup.
AI Background Removal: Right-click → Remove Background. BiRefNet computes a saliency matte and saves a transparent PNG next to the original.
Pixal3D — Image to 3D: Turn a single image into a textured GLB 3D model and inspect it in the built-in 3D viewer (orbit, zoom, PBR lighting).
Spatial Scene: An Apple-Photos-style depth parallax effect — Depth Anything V2 estimates depth, and your photo subtly shifts in 3D as you move the mouse.
📷 Pro Image Support (NEW)
Camera RAW: Full pure-Rust develop pipeline for DNG, Sony ARW, Canon CR2, and Nikon NEF — colors matched against the camera's embedded preview.
Radiance HDR with tone mapping, plus PNG, JPEG, GIF, BMP, WebP, ICO, TIFF, AVIF, and HEIC — all decoded in pure Rust, identical on every OS, no codec DLLs.
SD Metadata Done Right: A real parser for A1111, ComfyUI, and Civitai generation data — found in any container (PNG, JPEG, WebP, AVIF), not just PNG text chunks.
⚡ Accelerated Workflow
Batch Processing: Auto-tag entire folders of images in minutes. Choose to Append new tags or Overwrite existing ones completely.
Smart Autocomplete: Type faster with a context-aware autocomplete system that learns from your current dataset and standard tag libraries.
Sidecar Compatibility: Reads and writes standard .txt sidecar files, ensuring seamless compatibility with Kohya_ss, OneTrainer, and other major training tools.
Deep Scan ("Find Issues"): Decode-verify every image to catch corruption, and find exact duplicates via SHA-256 — with one-click cleanup.
Smarter Booru Downloader: The Gelbooru downloader now writes tag-role sidecars (artist/character/copyright/general), with blacklist support, dedup logging, and built-in good-citizen rate limiting.
🎨 Modern Experience
GPU-Rendered UI: The entire interface is drawn on the GPU — scrolling a wall of 10,000 thumbnails is fundamentally smoother than ever before.
Gallery Layout: A gorgeous Pinterest-style masonry grid with lazy thumbnail loading, a click-to-open detail popup, and a floating draggable search pill.
5 Themes, 3 Brand New: Including Space (animated starfield), Aurora (drifting pastel blobs), and Glass (frosted translucent panels).
Visual Feedback: The AI status orb got a full rewrite — a 3D particle sphere that breathes while thinking and morphs through shapes during long jobs.
Polish Everywhere: Color emoji, full CJK font fallback (no more tofu), movable popups that remember their position, HD thumbnails for high-DPI displays, live CPU/RAM graphs, and a crop tool.
🛡️ Control & Safety
Global Blacklist: Automatically filter out unwanted tags across your entire dataset.
Session Memory: Newly added AI tags are highlighted in Cyan, making it incredibly easy to review changes before committing to a save.
Hardened Backups: AES-256 encrypted zips with pre-flight corruption checks, dated filenames, and live progress.
Encrypted Secrets: Civitai API keys, Hugging Face tokens, and rate-limit counters are all stored DPAPI-encrypted on Windows.
Crash Containment: Background workers are panic-isolated — a failure shows a clean error message instead of killing the app.
🚀 What's New & Improved
Native Rust Core: No JVM, no GC pauses, no pre-allocated heap. Typical memory use is a few hundred MB with explicitly bounded caches.
Dual Image Cache: Separate browser and viewer caches with decode-permit gating, so opening a huge image never starves thumbnail loading.
Civitai Integration, Leveled Up: Resolve models, LoRAs, LyCORIS, VAEs, and embeddings by version ID, hash, or name — with preview cards, trigger words, and a live online indicator.
Cross-Platform CI Releases: Every release tag automatically builds a Windows installer, macOS .dmg, and Linux tarball.
Video Quality of Life: Off-thread poster frames, loop playback, pure-Rust MP4/MOV metadata reading, and VLC is now fully optional.
⚠️ Not Yet Ported from the Java Version
We're being honest: a few features from the Java version haven't made the jump yet. They're all on the porting list:
LLM chat / role-play assistant (Ollama / llama.cpp) and text-to-speech
SFTP/FTP remote browsing
Danbooru and Pexels downloaders (Gelbooru is in)
EXIF GPS / Geo location panel
Live folder watching (re-open the folder to refresh for now)
Browsing encrypted archives as a library
Animated WebP playback (animated GIFs work; WebP shows the first frame)
🔮 The Road Ahead
With the Rust foundation in place, the next frontiers from the Grand Blueprint:
Built-in Model & LoRA Training with a live visual training preview.
Native Video Generation alongside image generation.
Zero-Dependency Embedded LLM: Eliminating the need for external tools like Ollama or LM Studio.
In-App Civitai Browsing: Search and download Civitai resources without leaving the app.
Interactive VR Anime Companions: A built-in VR character for interactive roleplaying — dynamically controlled by the AI, expressing real emotions in real time.
External Engine Integration: Broadcasting the embedded LLM to external 3D applications and game engines.
Description
================================================================
Clarity TagFlow — UPDATES & FIXES
Session date: 2026-05-23
================================================================
This file lists the features added and bugs fixed during the
development session, grouped by area.
----------------------------------------------------------------
LEFT BROWSER PANEL / THUMBNAILS
----------------------------------------------------------------
- GIF badge: GIF tiles now show a "gif.svg" badge in the top-left
corner of the thumbnail (ThumbnailToggleButton).
- Video badge: video tiles show a "video.svg" badge in the
top-left corner. Removed the old centered play-button overlay
and deleted playbutton.svg (badge replaces it).
- Video thumbnails fixed: video previews no longer get stuck on a
blank "loading" tile after deleting an image / refreshing.
* Root cause: thumbnail requests for an already-in-flight
image dropped their callback, so rebuilt tiles never
received the image. Reworked ThumbnailService to COALESCE
callbacks (one decode notifies all waiters) and to stop
clearing in-flight loads on clearCache().
* Also: failed video grabs are no longer cached as blanks
(so they retry), and the video-decode permit wait was
raised so a burst of videos loads instead of timing out.
- Scroll-gap bug fixed: the spacing between tiles now stays a
consistent 16px while scrolling. Previously, lazily appended
chunks used 6px, so images got tighter further down the list.
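
The callback-coalescing fix for video thumbnails can be illustrated with a minimal Java sketch. This is not the actual ThumbnailService code: the class name CoalescingLoader, its method names, and the use of a String in place of a decoded image are assumptions made for illustration. It shows a second request for an in-flight key joining the waiter list instead of dropping its callback, failed results (null) not being cached, and clearCache() leaving in-flight loads alone.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical stand-in for a thumbnail service; a String stands in for a decoded image. */
class CoalescingLoader {
    private final Map<String, List<Consumer<String>>> inFlight = new HashMap<>();
    private final Map<String, String> cache = new HashMap<>();

    /** Returns true when the caller should start a decode for this key. */
    synchronized boolean request(String key, Consumer<String> callback) {
        String cached = cache.get(key);
        if (cached != null) {
            callback.accept(cached); // cache hit: answer immediately
            return false;
        }
        List<Consumer<String>> waiters = inFlight.get(key);
        if (waiters != null) {
            waiters.add(callback); // join the decode that is already running
            return false;
        }
        waiters = new ArrayList<>();
        waiters.add(callback);
        inFlight.put(key, waiters);
        return true; // first request: exactly one decode per key
    }

    /** Called once by the decoder; a null result means the grab failed. */
    void complete(String key, String result) {
        List<Consumer<String>> waiters;
        synchronized (this) {
            waiters = inFlight.remove(key);
            if (result != null) {
                cache.put(key, result); // failures are not cached, so they retry
            }
        }
        if (waiters != null) {
            for (Consumer<String> waiter : waiters) {
                waiter.accept(result); // one decode notifies all waiters
            }
        }
    }

    /** Clears finished results only; in-flight loads keep their waiters. */
    synchronized void clearCache() {
        cache.clear();
    }
}
```

In this sketch a tile rebuilt after a delete or refresh simply calls request() again; if the decode is still running, the new tile is notified by the same complete() call as the old one.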
----------------------------------------------------------------
VIDEO PLAYER (VLC)
----------------------------------------------------------------
- Modern seek bar: replaced the plain slider with a modern
scrubber (thin rounded track, accent-coloured played portion,
round white knob that grows on hover, click-to-seek).
- Play/Pause icons: now use playbutton.svg / pausebutton.svg
(fixed the "3 dots" that appeared from missing text glyphs).
- Fullscreen: added a full-screen toggle button
(full_screen.svg / close_fullscreen.svg); Esc exits.
- Video Film Strip: when enabled in Settings, the seek bar
becomes a YouTube-style strip of frame thumbnails sampled
across the video, with a playhead and a played-progress bar.
Click/drag the strip to seek. (New VideoFilmStrip component +
VideoThumbnailer.snapshotAt for grabbing frames at a position.)
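
As a rough illustration of the film-strip arithmetic, the Java sketch below computes where frames might be sampled and how a click on the strip maps to a seek position. The class FilmStripMath and both method names are hypothetical, and sampling at the centre of each slot is an assumption; the notes above only say that frames are sampled across the video.

```java
/** Hypothetical helper; not the actual VideoFilmStrip code. */
class FilmStripMath {
    /** Fractional positions in (0,1), one at the centre of each of `count` equal slots. */
    static double[] samplePositions(int count) {
        double[] positions = new double[count];
        for (int i = 0; i < count; i++) {
            positions[i] = (i + 0.5) / count;
        }
        return positions;
    }

    /** Maps a mouse x coordinate on the strip to a seek fraction clamped to [0,1]. */
    static double seekFraction(int mouseX, int stripWidth) {
        if (stripWidth <= 0) {
            return 0.0;
        }
        double fraction = (double) mouseX / stripWidth;
        return Math.max(0.0, Math.min(1.0, fraction));
    }
}
```

Each sampled fraction would then be handed to whatever grabs a frame at a position, and seekFraction covers drags that leave the strip on either side.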
----------------------------------------------------------------
VIEWER PANEL
----------------------------------------------------------------
- Right-click "Crop Image": right-click an image -> Crop Image ->
drag to select an area -> release to crop. Saves a COPY
(<name>_crop.png) next to the original; the original is never
changed. Crops from the full-resolution original. Esc or
right-click cancels.
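
Two small pieces of the crop flow lend themselves to a sketch: deriving the <name>_crop.png path next to the original, and converting a selection drawn on the scaled-down view into full-resolution pixels. The class CropSketch and its methods are hypothetical names, and the single scale factor (displayed size divided by original size) is an assumption about how the viewer scales images.

```java
import java.awt.Rectangle;
import java.nio.file.Path;

/** Hypothetical helper; illustrates the naming and coordinate mapping only. */
class CropSketch {
    /** photos/cat.jpg -> photos/cat_crop.png (the original file is never touched). */
    static Path cropTarget(Path original) {
        String file = original.getFileName().toString();
        int dot = file.lastIndexOf('.');
        String stem = dot > 0 ? file.substring(0, dot) : file;
        return original.resolveSibling(stem + "_crop.png");
    }

    /**
     * Converts a selection in view coordinates to original-image coordinates.
     * `scale` is displayed size / original size (assumed uniform on both axes).
     */
    static Rectangle toOriginal(Rectangle selection, double scale, int origWidth, int origHeight) {
        Rectangle scaled = new Rectangle(
                (int) Math.round(selection.x / scale),
                (int) Math.round(selection.y / scale),
                (int) Math.round(selection.width / scale),
                (int) Math.round(selection.height / scale));
        // Clamp to the image so a drag past the edge cannot produce an invalid crop.
        return scaled.intersection(new Rectangle(0, 0, origWidth, origHeight));
    }
}
```

The resulting rectangle could be passed to BufferedImage.getSubimage on the full-resolution image before writing the PNG copy.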
----------------------------------------------------------------
RIGHT DETAILS PANEL
----------------------------------------------------------------
- Tags / SD Metadata / Caption switch: a window_switch.svg button
in the top-right of the content box cycles between the views
that exist (only shows when more than one is available).
- Caption support: shows the image's .caption file; the Edit
button edits the .caption when in Caption view.
----------------------------------------------------------------
TAG MANAGER PANEL
----------------------------------------------------------------
- Long captions/tags now WRAP and stay fully visible (switched
the list cell renderer to a wrapping text area; the list tracks
the viewport width). Content scrolls inside the box.
- Fixed a regression where the tag-list box became tiny (a
max-height cap was removed so the box fills the panel again).
----------------------------------------------------------------
AI / LLM
----------------------------------------------------------------
- Captioning vs tagging: the AI now tells the two apart.
* "tag this" -> comma tags saved to <name>.txt
* "caption this" -> a prose description saved to <name>.caption
(New [CAPTION] block in the assistant; captions stored in a
separate .caption sidecar so tags and captions coexist.)
- LLM Right panel (.caption aware): if an image has no .txt but
has a .caption it loads the caption; if it has both it loads
.txt first. A window-switch button next to "LLM Suggestions /
Tags:" toggles between .txt and .caption (only shows when both
exist). Long captions wrap.
- LLM Left panel: thumbnail gallery now uses aspect-correct,
image-hugging tiles (like the main browser) instead of fixed
squares.
- Clear chat: the Clear button now also clears the AI's
conversation memory (both the normal and role-play assistants),
not just the visible messages.
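
The .txt / .caption loading rule for the LLM right panel can be written down as a short Java sketch. The class SidecarSketch and its methods are hypothetical, not the app's Captions class; the existence check is passed in as a predicate so the rule can be exercised without touching the disk (in real use it could be Files::exists).

```java
import java.nio.file.Path;
import java.util.Optional;
import java.util.function.Predicate;

/** Hypothetical helper illustrating which sidecar is shown first. */
class SidecarSketch {
    /** cat.png + ".caption" -> cat.caption in the same folder. */
    static Path sidecar(Path image, String extension) {
        String file = image.getFileName().toString();
        int dot = file.lastIndexOf('.');
        String stem = dot > 0 ? file.substring(0, dot) : file;
        return image.resolveSibling(stem + extension);
    }

    /** .txt wins when both exist; .caption is used when it is the only sidecar. */
    static Optional<Path> initialView(Path image, Predicate<Path> exists) {
        Path tags = sidecar(image, ".txt");
        Path caption = sidecar(image, ".caption");
        if (exists.test(tags)) {
            return Optional.of(tags);
        }
        if (exists.test(caption)) {
            return Optional.of(caption);
        }
        return Optional.empty();
    }

    /** The switch button is only shown when both sidecars exist. */
    static boolean showSwitch(Path image, Predicate<Path> exists) {
        return exists.test(sidecar(image, ".txt")) && exists.test(sidecar(image, ".caption"));
    }
}
```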
----------------------------------------------------------------
DRAG & DROP + PROJECTS
----------------------------------------------------------------
- Drag & drop import: drop images / GIFs / videos (or folders of
them) anywhere in the app.
* If a folder is open, files are copied into it.
* If no folder is open, a new date-stamped project folder is
created (under data/projects) and opened.
- Live folder updates: the open folder is watched; new/changed
files refresh the browser automatically, and the currently
selected image stays selected (no need to re-click it).
- Projects (saved in the app): auto-created folders live under
data/projects. A "Projects" entry in the folder menu lists them;
selecting one opens it. The currently-open project is
highlighted.
- New project: right-click the folder icon (or "New Project..."
in the menu) to create a named project folder.
- Folder menu header: now shows which folder is currently
selected.
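
The drop-target rule for drag & drop import (copy into the open folder, otherwise create a date-stamped project) might look like the following Java sketch. The class ProjectNaming, the "project_" prefix, and the yyyy-MM-dd_HHmmss stamp are assumptions; the notes above only say the folder is date-stamped and lives under data/projects.

```java
import java.nio.file.Path;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper; the real folder naming scheme is not documented here. */
class ProjectNaming {
    // Assumed stamp format, chosen so folder names sort chronologically.
    static final DateTimeFormatter STAMP = DateTimeFormatter.ofPattern("yyyy-MM-dd_HHmmss");

    /**
     * Returns the folder that dropped files should be copied into.
     * `openFolder` is null when no folder is currently open.
     */
    static Path importTarget(Path openFolder, Path projectsRoot, LocalDateTime now) {
        if (openFolder != null) {
            return openFolder; // a folder is open: files are copied into it
        }
        return projectsRoot.resolve("project_" + STAMP.format(now));
    }
}
```

The caller would still need to create the returned directory (for example with Files.createDirectories) and open it before copying files.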
----------------------------------------------------------------
ENCRYPTED FOLDERS (AES-256)
----------------------------------------------------------------
- Startup folder + encryption (Settings > General, top):
* Set a startup folder and "Load this folder when the app
opens".
* "Encrypt..." packs the folder into a password-protected
AES-256 .zip (offers to delete the unencrypted originals).
* On launch, an encrypted startup folder prompts for the
password and mounts it.
- Add to an encrypted folder: dropping files into an open
encrypted archive re-encrypts it (extract -> add -> re-zip ->
re-mount), keeping the current selection; offers to delete the
dropped originals so no plaintext copy remains.
- Delete / Edit tags in an encrypted folder: now work via the
same re-encrypt mechanism (Right Details panel + advanced Tag
Manager).
----------------------------------------------------------------
SETTINGS
----------------------------------------------------------------
- New "Info" tab (next to General): lists what the app can do and
a reference of keyboard & mouse controls. Scrolls inside the tab
so it doesn't make the dialog too tall.
- Fixed: couldn't change Thumbnail Width / Height (and other
spinners). The custom editor wrapper was breaking the spinner's
value binding, so typed values never committed. Now the spinner
paints its own rounded background and keeps the default editor.
- Default thumbnail size changed to Width 230 / Height 400.
----------------------------------------------------------------
CIVITAI INFO
----------------------------------------------------------------
- Added an "API Status" pill in the top-right corner (next to the
Civitai logo), like Danbooru's. Polls every 5s and shows
Checking / Online / Offline.
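
A status pill of this kind is commonly driven by a scheduled background probe. The Java sketch below is one way to do that with the standard ScheduledExecutorService; the class ApiStatusPoller, the Status enum, and the reachability check passed in as a BooleanSupplier are all hypothetical and not taken from the app.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;
import java.util.function.Consumer;

/** Hypothetical poller; the real check against the Civitai API is not shown. */
class ApiStatusPoller {
    enum Status { CHECKING, ONLINE, OFFLINE }

    /** Runs one probe; any runtime failure counts as OFFLINE. */
    static Status probe(BooleanSupplier reachable) {
        try {
            return reachable.getAsBoolean() ? Status.ONLINE : Status.OFFLINE;
        } catch (RuntimeException e) {
            return Status.OFFLINE;
        }
    }

    /** Reports CHECKING immediately, then probes now and every 5 seconds after each probe ends. */
    static ScheduledExecutorService start(BooleanSupplier reachable, Consumer<Status> onStatus) {
        onStatus.accept(Status.CHECKING);
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(task -> {
            Thread thread = new Thread(task, "api-status");
            thread.setDaemon(true); // never keep the app alive just for polling
            return thread;
        });
        scheduler.scheduleWithFixedDelay(
                () -> onStatus.accept(probe(reachable)), 0, 5, TimeUnit.SECONDS);
        return scheduler;
    }
}
```

In a Swing UI the onStatus consumer would typically wrap the pill update in SwingUtilities.invokeLater, and the returned scheduler would be shut down when the panel closes.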
----------------------------------------------------------------
NEW FILES ADDED THIS SESSION
----------------------------------------------------------------
- InfoPanel.java (Settings "Info" tab)
- MediaImporter.java (drag & drop import)
- FolderWatcher.java (live folder updates)
- ProjectStore.java (projects under data/projects)
- FolderEncryptor.java (AES-256 folder encryption)
- Captions.java (.caption sidecar read/write)
- ModernSliderUI.java (modern video scrubber)
- VideoFilmStrip.java (video frame-strip scrubber)
- Captions / caption support across panels
(Icons used: gif.svg, video.svg, playbutton.svg, pausebutton.svg,
full_screen.svg, close_fullscreen.svg, window_switch.svg)
----------------------------------------------------------------
BUNDLED VLC (libVLC) + LICENSE NOTE
----------------------------------------------------------------
- The video player now bundles libVLC so users don't need to
install VLC:
src/main/resources/tools/vlc/
libvlc.dll
libvlccore.dll
plugins/ (~133 MB)
Loaded via the new VlcBundle.configure(), which points vlcj/JNA
at this folder before NativeDiscovery runs. Falls back to an
installed VLC if the bundle isn't present.
Note: native DLLs can't be loaded from inside a packaged .jar —
this works when running from the IDE / exploded classes; for a
fat-jar build, the tools/vlc folder must be extracted to disk at
startup first.
- LICENSE (important when distributing):
This app bundles and uses libVLC and VLC plugins from the
VLC media player project (c) the VideoLAN organization.
* libVLC is licensed under LGPL-2.1+.
* Some bundled VLC plugins are licensed under the GPL.
* The Java binding is vlcj (c) Caprica Software (LGPL/GPL).
When distributing the application you must include these license
notices. A summary was added to the Settings > Licenses tab, and
the full license / attribution texts now ship in the bundle folder:
src/main/resources/tools/vlc/
COPYING.txt (GNU GPL, as shipped with VLC)
AUTHORS.txt (VLC authors / contributors)
THANKS.txt (acknowledgements)
README.txt (VLC readme)
NOTICE.txt (our LGPL/GPL split + reference links)
References:
https://www.videolan.org/legal.html
https://www.gnu.org/licenses/lgpl-2.1.html
https://github.com/caprica/vlcj
================================================================
SESSION 2 — MORE FIXES & FEATURES
================================================================
----------------------------------------------------------------
AI ORB (replaces the static AI.svg everywhere)
----------------------------------------------------------------
- New AiOrb component: a "living" ring of glowing particles that
gently breathes when idle and speeds up / brightens / ripples
when the assistant is THINKING or TALKING. A lightweight Swing
stand-in for a WebGL particle ring (no 3D, same feel).
* One shared 60fps ticker drives every visible orb (cheap even
with many on screen, e.g. one per chat message); off-screen
orbs are skipped and the ticker stops when none are alive.
* Theme-accent coloured; states IDLE / THINKING / TALKING.
- Replaced the AI.svg icon in all 5 spots: LLM chat assistant
avatar (thinks pre-stream, "talks" while streaming), LLM tagging
panel header, Tag Manager header, Generate controls header, and
the image-viewer header (the two generation headers tint the orb
to the live status colour).
- Orb size tuned (36px) and the four panel headers were pinned to a
fixed height so the orb sits inside them without resizing them.
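
The shared-ticker idea (one timer for every orb, off-screen orbs skipped, ticker stopped when no orbs remain) can be sketched in plain Java. SharedTicker and its Animated interface are hypothetical names, not the AiOrb code; in a Swing app, a javax.swing.Timer firing roughly every 16 ms could call tick().

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical registry illustrating a single shared animation ticker. */
class SharedTicker {
    interface Animated {
        boolean isVisible();
        void step(double dtSeconds);
    }

    private final List<Animated> orbs = new ArrayList<>();
    private boolean running = false;

    void register(Animated orb) {
        orbs.add(orb);
        running = true; // the first registered orb starts the ticker
    }

    void unregister(Animated orb) {
        orbs.remove(orb);
        if (orbs.isEmpty()) {
            running = false; // nothing left alive: stop ticking
        }
    }

    boolean isRunning() {
        return running;
    }

    /** Advances one frame and returns how many orbs were actually stepped. */
    int tick(double dtSeconds) {
        if (!running) {
            return 0;
        }
        int stepped = 0;
        for (Animated orb : orbs) {
            if (orb.isVisible()) { // off-screen orbs cost nothing
                orb.step(dtSeconds);
                stepped++;
            }
        }
        return stepped;
    }
}
```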
----------------------------------------------------------------
TEXT-TO-SPEECH (smoother + faster to start)
----------------------------------------------------------------
- Fixed the squeaky/"chipmunk" voice: removed the resampling pitch
shift (was +18%, which moved the formants up). Pitch is now 1.0
(natural timbre); voice is a warm blend (af_heart + a little
af_bella) at a calmer 0.96 speed.
- Much lower start-up delay: speech now STREAMS by sentence — the
first sentence plays while the next is still being synthesised,
instead of waiting for the whole reply to render. The model also
warms up the moment TTS is enabled, so the first line is prompt.
- Clean cancellation across the new streaming pipeline (interrupting
or starting a new line cuts off immediately, no orphaned temp WAVs).
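
Streaming speech by sentence implies cutting the incoming reply into sentences as it arrives. A minimal Java sketch of such a splitter follows; the class SentenceStreamer is hypothetical, and the rule it uses (a '.', '!' or '?' followed by whitespace ends a sentence) is a simplifying assumption that would mis-split abbreviations such as "e.g. this".

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical splitter: feed text chunks in, get completed sentences out. */
class SentenceStreamer {
    private final StringBuilder pending = new StringBuilder();

    /** Appends a chunk and returns every sentence that is now complete. */
    List<String> feed(String chunk) {
        pending.append(chunk);
        List<String> done = new ArrayList<>();
        int start = 0;
        for (int i = 0; i + 1 < pending.length(); i++) {
            char c = pending.charAt(i);
            boolean terminator = c == '.' || c == '!' || c == '?';
            if (terminator && Character.isWhitespace(pending.charAt(i + 1))) {
                String sentence = pending.substring(start, i + 1).trim();
                if (!sentence.isEmpty()) {
                    done.add(sentence);
                }
                start = i + 1;
            }
        }
        pending.delete(0, start); // keep only the unfinished tail
        return done;
    }

    /** Returns whatever is left once the reply has ended. */
    String flush() {
        String rest = pending.toString().trim();
        pending.setLength(0);
        return rest;
    }
}
```

Each sentence returned by feed() could be queued for synthesis while the previous one is still playing; cancelling would then amount to clearing that queue and discarding the pending tail.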
----------------------------------------------------------------
TAG MANAGER — NEW MODEL: PixAI Tagger v0.9
----------------------------------------------------------------
- Added PixAI Tagger v0.9 (EVA02, ~13.4k tags, great character /
series recognition) to "Get Models" and the model dropdown
(Tag Manager + LLM panel). Uses the DeepGHS ONNX export.
* New PixaiTagger class with the model's exact preprocessing
(448x448, RGB, mean/std 0.5) and per-category thresholds
(general from the spinner, character held to >= 0.85).
* Fixed initially-wrong tags: the ONNX has 3 outputs
(embedding / logits / prediction); we now read 'prediction'
by NAME instead of index 0 (which was the embedding).
- Model dropdown now lists ONLY downloaded models (plus the
"Select AI..." placeholder), and refreshes after Get Models
closes — no more phantom models implying they're all installed.
Applied to both the Tag Manager and the LLM panel.
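
The three PixAI details listed above (mean/std 0.5 normalisation, reading the 'prediction' output by name, and per-category thresholds) can be sketched in plain Java without the ONNX Runtime API. PixaiSketch and its methods are hypothetical; the Map stands in for the model's named outputs, scaling pixels to [0,1] before applying mean/std is an assumption, and so is treating the general threshold as inclusive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical post-processing helper; not the app's PixaiTagger class. */
class PixaiSketch {
    static final float CHARACTER_MIN = 0.85f; // character tags are held to >= 0.85

    /** Maps an 8-bit channel value to [-1, 1] using mean 0.5 and std 0.5. */
    static float normalize(int channel) {
        return (channel / 255f - 0.5f) / 0.5f;
    }

    /** Looks the scores up by output NAME, never by position. */
    static float[] predictionOutput(Map<String, float[]> outputs) {
        float[] prediction = outputs.get("prediction");
        if (prediction == null) {
            throw new IllegalStateException("model has no 'prediction' output");
        }
        return prediction;
    }

    /** Keeps tags whose score meets the threshold for their category. */
    static List<String> selectTags(String[] names, String[] categories,
                                   float[] scores, float generalThreshold) {
        List<String> kept = new ArrayList<>();
        for (int i = 0; i < names.length; i++) {
            float min = "character".equals(categories[i]) ? CHARACTER_MIN : generalThreshold;
            if (scores[i] >= min) {
                kept.add(names[i]);
            }
        }
        return kept;
    }
}
```

Looking outputs up by name keeps the tagger correct even if a different export orders embedding, logits, and prediction differently.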
----------------------------------------------------------------
MODELS / DATA STORAGE (everything under src/main/resources)
----------------------------------------------------------------
- AI tagger models downloaded via "Get Models" now always save into
src/main/resources/tools/<model>/ (the old code fell back to a
project-root tools/ folder on first download).
- App runtime data (settings, projects, hearts, memory, SFTP host
key, emoji cache, Pexels logs) now lives under
src/main/resources/data and src/main/resources/logs, consistent
with config/, logs/ and tools/.
- pom.xml excludes the large downloadable tagger models
(tools/joytag, tools/wd14-*, tools/pixai-*) from the resource copy
so they don't bloat builds (users download them at runtime).
----------------------------------------------------------------
GENERATE vs LOCAL LLM (mutually exclusive)
----------------------------------------------------------------
- Generate (Stable Diffusion) and the Local LLM can't run at once.
Enabling either while the other is on now shows a "turn the other
off first" message and reverts the toggle — guarded in BOTH
directions (Generate tab and Local LLM tab).
----------------------------------------------------------------
CIVITAI INFO
----------------------------------------------------------------
- Hearting an image no longer refreshes the Civitai Info panel
(it was re-hitting the Civitai API on every heart). The favorite
indicator still updates live via HeartManager's change listener.
----------------------------------------------------------------
UI POLISH — SMOOTH ROUNDED CORNERS (no more setShape)
----------------------------------------------------------------
- Replaced hard-clipped setShape() rounding (which looked blurry /
jagged on the right & bottom) with antialiased rounded panels on
transparent windows across the app:
* Tag Manager settings dialog (also fixed a border that looked
"cut off" on the right/bottom, an off-by-one in RoundedBorder).
* Backup dialogs (progress + New Backup).
* Folder picker, Embedding / LoRA selection dialogs, tooltips,
and the LLM emoji / tools popups.
* The folder menu + its Projects submenu are now rounded AND
smooth (custom rounded popup border + transparent popup window).
There is no longer any setShape() rounding anywhere in the app.
- Tag Manager settings: the Tag Separator dropdown and Default
Confidence Threshold now sit on the LEFT next to their labels
(instead of pushed to the far right). Same for the LLM panel's
Threshold control.


