That explanation makes no fucking sense and makes them look like they know fuck all about AI training.
The output keywords have nothing to do with the training data. If the model in use has fuck all BME training data, it will struggle to draw a BME regardless of what key words are used.
And any AI person training their algorithms on AI generated data is liable to get fired. That is a big no-no. Not only does it not provide any new information from the data, it also amplifies the mistakes made by the AI.
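To put a toy number on the "amplifies the mistakes" part, here is a rough sketch and not a model of any real training pipeline. It assumes a hypothetical generator that over-produces whatever it already considers likely (the `power` knob, made up for illustration), and that each new generation is trained only on the previous generation's output. The class names and the 70/30 split are invented too.

```python
def sharpen(dist, power=2.0):
    """Hypothetical generator: over-produces outputs it already finds likely.

    `power` > 1 is an assumption standing in for things like low-temperature
    sampling; it is not taken from any real system.
    """
    weights = {label: p ** power for label, p in dist.items()}
    total = sum(weights.values())
    return {label: w / total for label, w in weights.items()}


def retrain_on_own_output(dist, generations, power=2.0):
    """Each generation 'trains' on nothing but the previous generation's output."""
    history = [dist]
    for _ in range(generations):
        dist = sharpen(dist, power)
        history.append(dist)
    return history


# Made-up starting point: a 70/30 skew in the original training data.
history = retrain_on_own_output({"majority": 0.7, "minority": 0.3}, generations=5)
for generation, dist in enumerate(history):
    print(generation, dist["minority"])
```

Under those assumptions the minority share shrinks every generation, since no new information comes in and the existing skew compounds.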
any AI person training their algorithms on AI generated data is liable to get fired
Though this isn’t pertinent to the post in question, training AI (and by AI I presume you mean neural networks, since there’s a fairly important distinction) on AI-generated data is absolutely a part of machine learning.
Some of the most famous neural networks out there are trained on data that they’ve generated themselves, e.g. AlphaGo Zero, which learned from games it played against itself.
They are not talking about the training process. To combat racial bias in the training data, they insert words into the prompt, for example “racially ambiguous”. For some reason, this time the AI weighted the inserted prompt so heavily that it made Homer look like he is from the Caribbean.
They literally say they do this “to combat the racial bias in its training data”
to combat racial bias in the training data, they insert words into the prompt, for example “racially ambiguous”.
And like I said, this makes no fucking sense.
If your training process, specifically your training data, has biases, inserting keywords does not fix that issue. It literally does nothing to actually combat it. It might hide issues if the model has had sufficient training to do the job with the inserted keywords, but that is neither a fix nor combating the issue. It is a cheap hack that does not address the underlying training issues.
So the issue is not that they don’t have diverse training data, the issue is that not all things get equal representation. So their trained model will be biased toward producing a white person when you ask generically for a “person”. To prevent it from always spitting out a white person when someone prompts the model for a generic person, they inject additional words into the prompt, like “racially ambiguous”. That occasionally encourages/forces more diversity in the results. The issue is that these models are too complex for these kinds of approaches to work seamlessly.
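For anyone who wants to picture the mechanism, a minimal sketch of that kind of prompt injection could look like the following. Everything here is hypothetical: `augment_prompt`, `inject_probability` and the modifier list are made-up names, and the only modifier taken from the thread is “racially ambiguous”. It is not how any particular image generator actually does it.

```python
import random

# The only modifier mentioned in the thread; a real list is unknown.
INJECTED_MODIFIERS = ["racially ambiguous"]


def augment_prompt(prompt, rng, inject_probability=0.5):
    """Randomly append a hidden modifier to the user's prompt.

    Hypothetical helper: the rule has no idea whether the prompt names a
    generic "person" or a specific, well-known character.
    """
    if rng.random() < inject_probability:
        return f"{prompt}, {rng.choice(INJECTED_MODIFIERS)}"
    return prompt


rng = random.Random(0)
# Probability forced to 1.0 and 0.0 here so the behaviour is easy to see.
print(augment_prompt("a person reading a book", rng, inject_probability=1.0))
print(augment_prompt("Homer Simpson", rng, inject_probability=1.0))
print(augment_prompt("Homer Simpson", rng, inject_probability=0.0))
```

The second call is the failure case from the post: the injected words get attached to a specific character just as readily as to a generic “person”, and the model underneath is exactly as biased as before.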
congratulations you stumbled upon the reason this is a bad idea all by yourself
all it took was a bit of actually-reading-the-original-post
?
My position was always that this is a bad idea.