V1.1 fully reworks the dataset tags. The V1.0 tags were too generic, so V1.1 now distinguishes between season 1 through season 5.
This LoRA model is dedicated to reproducing the distinctive visual style of the classic HBO series The Wire. It learns the show's documentary-like cinematography, low-saturation color palette, street-scene composition, and lighting atmosphere. The model captures Baltimore's iconic red brick row houses, dilapidated alleyways, and street scenes filled with a gritty urban atmosphere.
The preview images were generated with specific prompts, which are a good reference for achieving similar results. You can also refer to the txt files in the training data.
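If you want a quick overview of which tags appear in those txt files before writing your own prompts, something like the following may help. It is only a sketch: it assumes each caption file is a plain-text, comma-separated list of tags, and the folder name `training_data` and the function name `count_caption_tags` are hypothetical, not part of this release.

```python
from collections import Counter
from pathlib import Path


def count_caption_tags(folder):
    """Count how often each tag appears across all .txt caption files in folder.

    Assumes (not confirmed by the model card) that captions are
    comma-separated tags, one caption per file.
    """
    counts = Counter()
    for txt_file in sorted(Path(folder).glob("*.txt")):
        caption = txt_file.read_text(encoding="utf-8")
        # Split on commas and drop surrounding whitespace and empty pieces.
        tags = [tag.strip() for tag in caption.split(",") if tag.strip()]
        counts.update(tags)
    return counts


if __name__ == "__main__":
    folder = Path("training_data")  # hypothetical path to the extracted dataset
    if folder.is_dir():
        for tag, n in count_caption_tags(folder).most_common(20):
            print(f"{n:4d}  {tag}")
```

The most frequent tags are usually the safest ones to reuse in a prompt, since the model saw them most often during training.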
Challenges in Training this LoRA Model
Dataset Selection: I deliberately avoided major characters when selecting images. But people are inseparable from the story of Baltimore, so the images that do contain people are mostly background crowds. The focus is on the interaction between individuals and the old city's architecture, which is what gives The Wire its distinctive cinematic, narrative feel.
Training Resolution: I initially struggled to get 140 images with a consistent style but disparate subjects to converge. My first attempt, at 1024 resolution, learned only basic architectural styles. Searching GitHub, I found a similar case ("Flux Lora training seems not to converge with big dataset(140 images)") in which another user needed 4x4x3500 steps to achieve good convergence at 1024 resolution. In the end I chose to train at 512 resolution, for approximately 10,000 steps, to achieve the desired results.
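To put the two training budgets side by side, here is a rough back-of-the-envelope sketch. It rests on assumptions that the text above does not confirm: that "4x4x3500" means batch size 4 x gradient accumulation 4 x 3500 optimizer steps, that the referenced case also used about 140 images, and that my 10,000 steps used an effective batch size of 1. The function name `training_budget` is made up for illustration.

```python
def training_budget(num_images, steps, batch_size=1, grad_accum=1, resolution=1024):
    """Estimate how much data a training run processes.

    samples: total images seen (steps x batch size x gradient accumulation)
    epochs:  average number of passes over the dataset
    pixels:  total pixels processed, a crude proxy for compute cost
    """
    samples = steps * batch_size * grad_accum
    epochs = samples / num_images
    pixels = samples * resolution * resolution
    return {"samples": samples, "epochs": epochs, "pixels": pixels}


# Referenced GitHub case, ASSUMING 4x4x3500 = batch 4 x accumulation 4 x 3500 steps.
reference = training_budget(140, 3500, batch_size=4, grad_accum=4, resolution=1024)

# This LoRA, ASSUMING an effective batch size of 1 (not stated in the model card).
this_lora = training_budget(140, 10_000, batch_size=1, grad_accum=1, resolution=512)

print(reference["samples"], reference["epochs"])
print(this_lora["samples"], round(this_lora["epochs"], 1))
print(reference["pixels"] / this_lora["pixels"])
```

Whatever the real batch settings were, one part of this is certain: a 512x512 image has a quarter of the pixels of a 1024x1024 image, so each training sample is far cheaper at 512, which is what made a long run of about 10,000 steps practical.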