Verbal nonsense reveals limitations of AI chatbots

The era of artificial intelligence (AI) chatbots that appear to understand and use language in a human-like manner has dawned. These chatbots rely on large language models, a type of neural network. However, a recent study has revealed a vulnerability in these models: they can sometimes mistake nonsense for natural language. Researchers at Columbia University see this flaw as an opportunity to improve chatbot performance and to gain insight into how humans process language.

In their paper published in Nature Machine Intelligence, the scientists describe how they conducted experiments using nine different language models. They presented hundreds of pairs of sentences to human participants and asked them to select the sentence they believed sounded more natural, i.e., the one more likely to be encountered in everyday communication. The researchers then evaluated whether the AI models would provide the same judgments as the human participants.
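The comparison described above, checking whether a model picks the same sentence as a human for each pair, can be illustrated with a simple agreement rate. This is only a sketch: the function name `agreement_rate` and the example choices are hypothetical, and the paper's actual evaluation metric is not described here.

```python
def agreement_rate(human_choices, model_choices):
    """Fraction of sentence pairs on which the model picks the same sentence as the human."""
    if len(human_choices) != len(model_choices):
        raise ValueError("Both lists must cover the same sentence pairs.")
    if not human_choices:
        raise ValueError("At least one sentence pair is required.")
    matches = sum(h == m for h, m in zip(human_choices, model_choices))
    return matches / len(human_choices)


# Hypothetical choices for five sentence pairs (1 = first sentence, 2 = second).
# These values are made up for illustration and are not data from the study.
human_choices = [1, 1, 2, 1, 2]
model_choices = [1, 2, 2, 1, 2]
print(agreement_rate(human_choices, model_choices))
```

A model that always matched the human choice would score 1.0 on this measure, while one choosing at random between two sentences would be expected to land near 0.5.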

In head-to-head comparisons, the more advanced AI models based on transformer neural networks generally outperformed simpler models, such as recurrent neural networks and statistical models that rely on word pair frequencies from the internet or online databases. However, all models exhibited errors, occasionally selecting sentences that sounded like gibberish to humans.

Dr. Nikolaus Kriegeskorte, a principal investigator at Columbia’s Zuckerman Institute and a coauthor of the paper, noted, “That some of the large language models perform as well as they do suggests that they capture something important that the simpler models are missing. That even the best models we studied still can be fooled by nonsense sentences shows that their computations are missing something about the way humans process language.”

For example, consider the following sentence pair:

  1. That is the narrative we have been sold.
  2. This is the week you have been dying.

Human participants in the study judged the first sentence as more natural. However, BERT, one of the advanced models, rated the second sentence as more natural. GPT-2, another widely known model, agreed with the human participants and rated the first sentence as more natural.
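To make the idea of a model "rating" one sentence as more natural concrete, here is a minimal sketch of the simplest kind of model the article mentions: a statistical model based on word-pair frequencies. It is not the study's code. The toy corpus, the function names, and the add-one smoothing are all assumptions chosen for illustration; BERT and GPT-2 are far larger neural networks that work differently.

```python
import math
import re
from collections import Counter

# Toy corpus: an illustrative assumption, not data from the study.
TOY_CORPUS = [
    "that is the story we have been told",
    "this is the narrative we have been sold",
    "that is the week we have been waiting for",
    "you have been working all week",
]


def tokenize(sentence):
    """Lowercase, keep alphabetic words, and add sentence-boundary markers."""
    return ["<s>"] + re.findall(r"[a-z']+", sentence.lower()) + ["</s>"]


def train(corpus):
    """Count context words and adjacent word pairs across the corpus."""
    context_counts, pair_counts, vocab = Counter(), Counter(), set()
    for sentence in corpus:
        tokens = tokenize(sentence)
        vocab.update(tokens)
        context_counts.update(tokens[:-1])
        pair_counts.update(zip(tokens, tokens[1:]))
    return context_counts, pair_counts, len(vocab)


def log_prob(sentence, context_counts, pair_counts, vocab_size):
    """Sum of add-one-smoothed log P(word | previous word) over the sentence."""
    tokens = tokenize(sentence)
    return sum(
        math.log((pair_counts[(prev, word)] + 1) / (context_counts[prev] + vocab_size))
        for prev, word in zip(tokens, tokens[1:])
    )


def more_natural(first, second, model):
    """Return whichever sentence the model scores as more probable."""
    if log_prob(first, *model) >= log_prob(second, *model):
        return first
    return second


model = train(TOY_CORPUS)
print(more_natural(
    "That is the narrative we have been sold.",
    "This is the week you have been dying.",
    model,
))
```

On this toy corpus the sketch prefers the first sentence, but that outcome reflects only the hand-picked training sentences. It says nothing about how the models in the study behaved.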

Christopher Baldassano, an assistant professor of psychology at Columbia and the senior author of the study, emphasized that every model had blind spots, labeling as meaningful some sentences that human participants considered gibberish. He cautioned against relying too heavily on AI systems for important decisions, at least in their current state.

One of the intriguing findings of the study is the good yet imperfect performance of many models. Dr. Kriegeskorte emphasized the importance of understanding why these gaps exist and why certain models outperform others, as this knowledge can drive progress in language models.

The researchers are also curious about whether the computations in AI chatbots can inspire new scientific questions and hypotheses, potentially guiding neuroscientists toward a better understanding of human brain function. Analyzing the strengths and weaknesses of various chatbots and their underlying algorithms may contribute to answering this question.

Tal Golan, the paper’s corresponding author, who recently established his own lab at Ben-Gurion University of the Negev in Israel, said the team is ultimately interested in understanding how people think. Because AI tools process language in their own distinct way, comparing them with humans offers a fresh perspective on human cognition.

Aihub Team

Interview with Mrs. Anita Schjøll Brede

Interview with Mr. Jürgen Schmidhuber

Interview with Dr. Fei-Fei Li

AI and Music Composition: The intersection of AI and creativity in composing music.

AI in Art Authentication: AI techniques for art forgery detection and provenance verification.

AI for Accessibility: How AI is making technology more accessible for individuals with disabilities.

AI in Retail Personalization: Customizing shopping experiences with AI-driven recommendations.

AI in Supply Chain Management: AI-driven optimization of supply chain logistics and inventory management.

AI in Veterinary Medicine: AI applications for animal health diagnosis and treatment.

AI and Genome Sequencing: AI’s contribution to accelerating genomic research and precision medicine.

AI and Drone Technology: AI’s role in enhancing drone capabilities for various industries.

AI in Transportation: Innovations in autonomous vehicles and AI for traffic management.

AI in Environmental Monitoring: AI applications for monitoring air and water quality.

AI in Criminal Justice: AI’s impact on crime prevention, offender profiling, and legal analytics.

AI for Elderly Care: Enhancing senior care with AI-powered health monitoring and companionship.

AI and Disaster Prediction: Predicting natural disasters using AI-based models and algorithms.

IGN, the popular gaming website, is introducing an AI chatbot aimed at simplifying troubleshooting and enhancing gameplay. The tool could reduce the need for narrowly worded Google searches and long trawls through online communities like Reddit. It is currently available for IGN's The Legend of Zelda: Tears of the Kingdom guide and offers assistance during gameplay. The chatbot is accessible to everyone for now, but an IGN account will be required in the future. In its current alpha testing phase, it draws on guides, tips, other content published on IGN, and insights from contributors' gameplay experiences.

The chatbot is meant to provide swift answers to intricate problems without the need to navigate multiple pages, and IGN envisions the feature as a comprehensive and convenient resource for gamers seeking quick resolutions. Although aimed primarily at gamers, it can also be a valuable resource for newcomers: asked whether Tears of the Kingdom is beginner-friendly, for example, it gives a fitting response, though occasional delays in its responses have been observed. The tool marks a step toward streamlined problem-solving and a more enjoyable, engaging experience for gamers.

IGN launched an AI chatbot for its game guides

Criminals Have Created Their Own ChatGPT Clones

Amid growing scrutiny, the Detroit Police Department (DPD) faces another lawsuit over a wrongful arrest resulting from a flawed facial recognition match. Porcha Woodruff, an African American woman who was eight months pregnant at the time, is the sixth person to come forward and report being wrongly implicated in a crime because of the controversial technology. Woodruff was accused of robbery and carjacking, an accusation she found unbelievable, especially given that she was visibly pregnant.

All six reported victims, as identified by the American Civil Liberties Union (ACLU), are African American, and Woodruff's case is the first involving a woman. It is also the third known wrongful arrest in the past three years attributed specifically to the Detroit Police Department's reliance on faulty facial recognition technology. In a separate case, Robert Williams has an ongoing lawsuit against the DPD, represented by the ACLU of Michigan and the University of Michigan Law School’s Civil Rights Litigation Initiative (CRLI), stemming from his wrongful arrest in January 2020 due to the same flawed technology. Phil Mayor, Senior Staff Attorney at the ACLU of Michigan, expressed deep concern, emphasizing that the department continues to use the technology despite being aware of the serious repercussions of relying on it for arrests.

The use of facial recognition by law enforcement has sparked heated debate over accuracy, potential racial bias, and possible infringements on privacy and civil liberties. Studies have consistently shown that these systems exhibit higher error rates when identifying individuals with darker skin tones, disproportionately affecting marginalized communities. Critics argue that relying solely on facial recognition to make arrests poses grave risks for innocent individuals, as Woodruff's case illustrates. Civil rights organizations have demanded that the Detroit Police Department stop using the technology until it can be rigorously evaluated and proven to be both unbiased and accurate. As the case unfolds, the department faces mounting pressure to respond to concerns about the misapplication of facial recognition and its impact on the rights and lives of innocent individuals.

Error-prone facial recognition leads to another wrongful arrest

A team of researchers from The University of Texas at Austin has modified a commercial virtual reality headset to measure brain activity, enabling the study of human reactions to stimuli like hints and stressors. By integrating a noninvasive electroencephalogram (EEG) sensor into a Meta VR headset, the team has developed a comfortable, wearable device for long-term use. The EEG sensor captures the brain's electrical signals during immersive virtual reality interactions. Potential applications range from aiding individuals with anxiety to assessing the attention and mental stress levels of pilots using flight simulators. The device also allows individuals to perceive the world through a robot's eyes.

Nanshu Lu, a professor in the Cockrell School of Engineering's Department of Aerospace Engineering and Engineering Mechanics, who led the research, emphasized the heightened immersion of virtual reality and said the team's technology yields improved measurements of brain responses in such environments. Although devices combining VR and EEG sensors already exist commercially, the researchers note that they are expensive and less comfortable, which limits how long they can be worn and what they can be used for. To address these challenges, the team designed soft, conductive, spongy electrodes that overcome issues with traditional electrodes. The modified headset integrates these electrodes into the top strap and forehead pad, and uses a flexible circuit with conductive traces similar to electronic tattoos, along with an EEG recording device attached to the rear of the headset.

The technology is part of a larger research initiative at UT Austin focused on a robot delivery network, which will also support an extensive study of human-robot interactions. The EEG-equipped VR headsets will let observers experience events from a robot's perspective while the cognitive load of prolonged observation is measured. To validate the headset, the researchers developed a driving simulation game. In collaboration with José del R. Millán, an expert in brain-machine interfaces, the team created a scenario in which users respond to turn commands by pressing a button while the EEG records brain activity to assess their attention levels.

The researchers have initiated preliminary patent procedures for their EEG technology and are open to collaborating with VR companies to build it directly into VR headsets. The research team includes experts from departments such as Electrical and Computer Engineering, Aerospace Engineering and Engineering Mechanics, Mechanical Engineering, and Biomedical Engineering, as well as from Artue Associates Inc. in South Korea.

Modified virtual reality tech can measure brain activity

Today in AI: Alibaba open-sources two AI models, AI-based HYRGPT eliminates the first two steps of hiring and more

AI and Space Exploration: The role of AI in space research and robotics.

AI and Sports Analytics: Enhancing performance analysis and player insights with AI.

AI and Virtual Reality: The synergy between AI and virtual reality technologies.

AI for Mental Health: How AI is aiding in early detection and treatment of mental health conditions.

AI in Disaster Response: Utilizing AI for real-time disaster monitoring and relief efforts.

AI in Fashion Design: AI-driven tools for fashion trend forecasting and personalized styling.

AI in Human Resources: Streamlining HR processes with AI-driven talent acquisition and management.

AI in Language Translation: Advancements in AI-driven language translation services.

AI in Gaming: Exploring AI’s role in video game development and player experiences.