AI “Safety” and AI Ethics Are Not the Same

I have been agitating for some time now about the declining performance of AI models like GPT-4 or (especially) the new LLaMA 2, a decline that tracks directly with the “safeguards” being trained into them to prevent abuse, misinformation, and dissemination of dangerous knowledge.

Ethical AI is an important goal, and AI ethicists as a profession are probably underpaid and underutilized except at the top end of the market. The “safeguards” being built into the current generation of LLMs are not achieving ethical AI, however. As I put it on Mastodon earlier:

The problem with the safeguards going into the LLMs now is that they aren’t teaching machines to be ethical; they are teaching them to constantly second-guess users’ motives and insert performative statements about the importance of ethics and avoiding bias into their output.

Any kid with Google can find a jailbreak and get around the safeguards. Meanwhile, legitimate work is corrupted with disclaimer-laden garbage output.

Because of this, it’s difficult for me to write the words safety or safeguards for AI without enclosing them in quotation marks.

If you want help drafting an angry letter to your cable company or a cease-and-desist to a competitor in your business, you don’t want the model inserting itself into the process with a bunch of moderating language about being ethical in business dealings or avoiding bias in your language. That’s all fine at the proper time, but when you’re trying to use AI for productivity, it is not the time for philosophical discussions.

There are real dangers posed by AIs like ChatGPT, Bard, and Bing. Misinformation or hallucinated answers in the wrong contexts could be very dangerous, as could dissemination of harmful knowledge or, for example, encouraging someone who is considering suicide. These are serious threats that need to be addressed, both technically and from the standpoint of AI ethics.

If addressing those challenges can’t be done without degrading the quality of the general output, a new approach is needed. AI is being used transactionally and productively far more than it is being used to look up how to make meth. There has to be a better design balance than shaping the whole system around the fringe cases.

What do you think? Subscribe and let me know in the comments.