Anime Consistency v1 – e470 – PDXL

The plan here is simple: I want something as consistent as Novel AI on my home PC, and I want to share it with others. The aim is not to hurt Novel AI, but to supplement what it generates and to open up a series of new low-cost training options for all users.

~~Goal 1; Introduce a series of core concepts meant to “fix” many simple topics that simply do not work.~~
Goal 2; A reusable LOHA that can simply be activated on any core model the user wishes, and that works as a supplement alongside existing LORA and LOHA.
Goal 3; Provide a simple merge of PDXL Autism and this LOHA, intended to reduce training time and give other models consistency in fewer steps.
Goal 4; Provide a very simple process for folder organization and training data to bring orderly training to any of the PDXL-based models.

The majority of models I’ve seen are FAIRLY consistent, until you hit a complexity too high for the AI to understand.

So I asked myself a simple question: how exactly do we create order from chaos?

After a series of small-scale tests using a simple folder pattern and specific target images meant for specific controllers, I scaled the test up into my first PDXL consistency generation LOHA, and the outcome was more than interesting.

This was genuinely not meant to generate NSFW themes, but it most definitely will. It is intentionally built so that the user's positive and negative prompting can push it toward either over-sexualization or under-sexualization. This will be more pronounced in future versions. This version leans towards NSFW, so be aware of that when using the core folder tags, and also be aware that it is conceptually built to provide substance to the core model, which is already very NSFW oriented.

Initial research;

The majority of articles and papers I’ve researched implied that LOHA folder structure and internal tagging are crucial. Everything MUST be orderly and correctly tagged with specifics, otherwise you get bleed-over.
Naturally, I set aside the opinion side of everyone else’s words because of my own small-scale findings. Their claims did not line up with what I actually saw, so I had to find a hypothesis that melded their opinions with the outcome of the actual generated data, and I believe I found such a middle ground here.
My initial generational system is one that intentionally bleeds LOHA concept to LOHA concept and is meant to saturate concepts contained within PDXL itself.
Layered bodies, layered topics, layered concepts, layered poses, and layered coloration.
The goal of this LOHA experiment is to bring order to chaos on a minimal scale. The images uploaded, the information provided, and the outcome should speak for themselves. Try it yourself.
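
As a rough illustration of the folder-per-concept idea, the sketch below walks a dataset directory and reports how many images each concept folder holds and how many of them lack a caption file. The layout (one subfolder per concept, a `.txt` caption next to each image) and the name `summarize_dataset` are assumptions made for illustration; the exact folder pattern used for this LOHA is not spelled out here.

```python
from pathlib import Path

# Assumed image extensions; adjust to whatever the dataset actually contains.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def summarize_dataset(root):
    """Return {folder_name: (image_count, images_missing_a_caption)}.

    Assumes one subfolder per concept and a same-named .txt caption
    beside each image (a hypothetical layout, not the author's).
    """
    summary = {}
    for folder in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        images = [f for f in folder.iterdir() if f.suffix.lower() in IMAGE_SUFFIXES]
        missing = [f for f in images if not f.with_suffix(".txt").exists()]
        summary[folder.name] = (len(images), len(missing))
    return summary
```

Running this before training is a cheap way to spot a concept folder that is empty or only partly tagged.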

I created generalized imposing concepts based on 1girl, using interpolation between Novel AI and PDXL. Thus, the mannequin concept was born.

Concept 1; Backgrounds are too chaotic when using simple prompts.
1. Hypothesis; The backgrounds themselves are causing serious image consistency damage.
2. Process; All images in the dataset must contain simple backgrounds, but not be tagged as simple backgrounds.
  - By default I tagged all images simple background, grey background or gradient blue/white that seemed appealing to me based on the image theme itself. The theory here was simple, remove the clutter.
  - I took every source image from my mannequin library and removed any existing backgrounds as I regenerated the images using NovelAI for anime style consistency.
3. Analyzed Result; The outcome yields a considerably more consistent and pose-strong form of the human body using the default tags.
  1. Success; The simple background can be reinforced using the simple background tag, or more complicated backgrounds can be introduced by defining scenes directly.
  2. Original;
  3. Outcome;
  4. Drawbacks;
    1. Scene defining is a bit more of a chore, but background imposition using img2img is arbitrary, so a simple series of backgrounds to play with can provide context to anything that you need.
    2. Harder to implement more complex concepts; more training data is required for more complex interactions in more complex scenes.
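
A minimal sketch for checking how background tags show up in caption text, assuming comma-separated captions. The tag set is only an example (`gradient background` in particular is an assumed tag name), and the function names are hypothetical. It lists the background tags a caption carries and can strip them, so either tagging choice can be applied consistently across a folder.

```python
# Example tag set; edit to match the tags actually used in the dataset.
BACKGROUND_TAGS = {"simple background", "grey background", "gradient background"}


def split_tags(caption):
    """Split a comma-separated caption into clean tag strings."""
    return [t.strip() for t in caption.split(",") if t.strip()]


def background_tags_in(caption, background_tags=BACKGROUND_TAGS):
    """Return the background tags present in a caption, in order."""
    return [t for t in split_tags(caption) if t in background_tags]


def strip_background_tags(caption, background_tags=BACKGROUND_TAGS):
    """Return the caption with all background tags removed."""
    kept = [t for t in split_tags(caption) if t not in background_tags]
    return ", ".join(kept)
```

For example, `strip_background_tags("1girl, simple background, standing")` leaves only the subject and pose tags.
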
Concept 2; Pose and concept tags are bled together too much by default.
1. Hypothesis; By burning new implications into the base pose pool, the outcome provided will be more consistent.
2. Process; Small scale tests showed this to be highly reliable when it comes to using the replaced pose tags vs attempting to use the more complex mixture.
  1. Step 1, generate images using NAI.
  2. Step 2, generate interpolated images between PDXL Autism and NAI’s generated image stock.
  3. Step 3, inpaint the core differences and find a middle-ground.
3. Analyzed Result;
  - Success; The outcome was phenomenal. Everything based on this theory was sound, and the simplistic information including the specific poses with generic tags supplemented the original, burning the unnecessary details and introducing the necessary tested details that I imposed into the system.
  - Original;
  - Outcome;
  - Drawbacks;
    - multiple girls – There are some generalized inconsistencies with 2girls, 3girls, 4girls, and so on. They work, somewhat, and they can be shifted, posed, and so on, but they are more unreliable with the LOHA ON than OFF due to the lack of actual training data for multiple girls.
    - boys – Due to the nature of layering and the complete lack of any sort of male imposition, it quite literally just wants girls unless you juice up the tags. You CAN generate boys with the girls if you want, but they aren’t going to be anywhere near as consistent for v1. This is a female consistency generator, not male, so keep that in mind.
    - It produces pose artifacts, and it will until the necessary information is provided. There are MANY poses, and the combinations of those poses have to be handcrafted to be specific to those particular generalized tag usages.
Concept 3; Layer and coloration imposing.
1. Hypothesis; Providing uniform mannequin models from only one angle, while abusing direct overlapping LOHA tag activation, allows the AI to understand and change itself rapidly. Sections are intentionally layered to be bad or below average, and those weaker traits will be ignored by the AI when using the score tags.
  - I hypothesized that the AI would simply fix imperfections in my LOHA layers and provide overlapping concepts.
2. Process; Each folder with a specific concept designed for layering was introduced with only minimal details on the less important topics. One thing I did focus on was making sure the hands themselves didn’t look absolutely god-awful. Along with the minimal information in their tags, some images still had bad hands, all had simple body coloration without detail, some had bad eyes, and so on. They were full of imperfections.
3. Outcome; Semi-Successful – It works until a certain point of overfitting.
  - Conceptually this idea was interesting in theory, and the color saturation for the body parts definitely cuts through in this LOHA. More experimentation is necessary to determine the legitimate applications for this one, but so far the outcome is very promising.
  - Original;
  - Outcome;
    score_9, score_8_up, score_7_up, score_6_up,
    1girl, mature female, full body, standing, from below, from side, light smile, long hair, red hair, blue eyes, looking at viewer,
    dress, blue dress, latex, white latex, bikini, green bikini
  - As you can see, the outcome speaks for itself. The concept layers overlap swimmingly, the body parts correct and position themselves, and the clothing is understood from a full 360 degrees, even though I only gave it a single angle.
  - The eyes most definitely suffer. This is due to an oversight on my part: I assumed the AI would impose the eyes on its own. For the next iteration I plan to include a folder of eyes already inpainted with ADetailer. That will provide much needed clarity to the eyes without needing a third-party tool to repair them.
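
The layered prompt above can be put together programmatically. This is a small sketch rather than part of the release: the grouping into score, subject, and clothing layers and the helper names are assumptions, while the tags themselves are copied from the example prompt.

```python
SCORE_TAGS = ["score_9", "score_8_up", "score_7_up", "score_6_up"]
SUBJECT_TAGS = [
    "1girl", "mature female", "full body", "standing", "from below",
    "from side", "light smile", "long hair", "red hair", "blue eyes",
    "looking at viewer",
]
# Each layer is (garment, color); stacking several is the overlap being tested.
CLOTHING_LAYERS = [("dress", "blue"), ("latex", "white"), ("bikini", "green")]


def layer_tags(layers):
    """Expand (garment, color) pairs into 'garment, color garment' tags."""
    tags = []
    for garment, color in layers:
        tags.extend([garment, f"{color} {garment}"])
    return tags


def build_prompt(score, subject, layers):
    """Join score, subject, and layered clothing tags into one prompt string."""
    return ", ".join(score + subject + layer_tags(layers))


prompt = build_prompt(SCORE_TAGS, SUBJECT_TAGS, CLOTHING_LAYERS)
```

Swapping entries in `CLOTHING_LAYERS` makes it easy to test which layer combinations overlap cleanly and which start to overfit.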

Overall conclusion;

This was highly successful overall. The system not only imposes the anime style from NAI, but by layering information in this way, it also lets the core system’s characters re-assert themselves with fewer clashing styles.

This is a character builder if you so wish it, as each seed will provide a very similar response to the others.

Original;

Anime Consistency v1 - e470 - PDXL Outcome;