Is a kid just a 60% reduction by volume of an adult? And these are generative algorithms… nobody really understands how they perceive the world and word relations.
It understands young and old. That means it knows a kid is not just a 60% reduction by volume of an adult.
We know it understands these sorts of things because of the very things this whole kerfuffle is about - it’s able to generate images of things that weren’t explicitly in its training set.
But it doesn’t fully understand “young”, and a “naked young person” isn’t just a scaled-down “naked adult”. People go through physiological changes during puberty, which is why “It understands young vs. old” is a clearly vapid and low-effort comment. Yours has more meaning behind it, so I’d clarify that having a vague understanding of young and old doesn’t mean it can generate CSAM.
Do you actually know that, or are you just assuming it?
Personally, I’m basing my assertions on experience with related situations, where I’ve asked image AIs to generate images of things that I’m quite sure weren’t in their training sets and that require conceptual understanding to create “hybrids.” They’ve done a decent job of those, so I’m assuming that they can figure out this specific situation as well, since most of these models have a lot of examples of naked people and young people in their training sets. But I haven’t actually asked any AIs to generate images of naked young people to test this one specific case.
My opinion here is that “naked young person” isn’t as simple as other compound concepts because there are physiological changes we go through during puberty that an AI can’t reverse engineer. Something like “Italian samurai” involves concepts that occur at a surface level that it can easily understand while “naked young person” involves some components that can’t be derived simply from applying “young” to “naked person” or “naked” to “young person”.
Someone did have a valid counter argument in this subthread though: https://sh.itjust.works/comment/11713795
Well, I haven’t gone to any of my image AIs and actually asked them to generate naked pictures of young people. So unless you want to go there, this will necessarily stay somewhat theoretical.
However, according to the article it’s possible to generate this stuff with Stable Diffusion models, and Stable Diffusion models have a negligible amount of CSAM in the training set. So short of actually doing the experiment that would seem to settle it.
I think a lot of people don’t appreciate just how surprisingly sophisticated the “world model” that these image AIs have learned is. There was a paper a while back where some researchers were trying to analyze how image generators work internally, and they discovered that if you ask one to make a picture of a bicycle, for example, it will first come up with a depth map of the image before it starts doing anything to the visual output. That shows that the AI has figured out what the three-dimensional form of a bicycle is based entirely on a pile of two-dimensional training images, with no other clues telling it that the third dimension even exists in the first place.