AI glossary

A
AI safety
AI safety is a field of research concerned with how AI can be developed and used safely and ethically. Topics discussed by AI safety researchers include alignment, bias, the effects of automation, hallucination, and the use of generative AI to create deepfakes and misinformation.
AI detector
An AI detector is a tool designed to detect when a piece of text (or sometimes an image or video) has been created by AI tools (such as ChatGPT and DALL-E). These detectors are not 100% reliable, but they can indicate the likelihood that a text was generated by AI. AI detectors work with large language models (LLMs) similar to those behind the tools they try to detect. Essentially, these models look at a text (or image or video) and ask: “Is this something I would have generated?” They look for low levels of two criteria: perplexity and burstiness. AI detectors can be used by universities and other institutions to identify AI-generated content.
AI-writing
AI-writing is a general term for the use of AI (usually LLMs) to generate or modify text. It includes chatbots like ChatGPT.
Algorithm
An algorithm is a finite set of rules that a computer system follows to perform a specific task. It is a type of program that stops at a given point and does not work indefinitely. An algorithm can be, for example, a series of mathematical steps to calculate a certain variable. They can also be designed for more complex tasks, such as deciding which content should be promoted on someone's "for you" page on social media.
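A classic illustration of an algorithm as a finite set of rules that always terminates is Euclid's method for finding the greatest common divisor. A minimal sketch in Python:

```python
def gcd(a: int, b: int) -> int:
    """Euclid's algorithm: a finite sequence of steps that always terminates."""
    while b != 0:
        a, b = b, a % b  # replace the pair until the remainder is zero
    return a

print(gcd(48, 18))  # 6
```

No matter the input, the loop always reaches zero and stops, which is exactly what distinguishes an algorithm from a program that could run indefinitely.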
AlphaGo
AlphaGo is an AI program designed to play the popular board game Go. The game is known for its high level of complexity, which made it challenging for earlier computer systems to play well. In 2017, AlphaGo defeated the world's best Go player, Ke Jie, just as Deep Blue defeated Garry Kasparov in chess in 1997.
Anthropomorphism
Anthropomorphism is the attribution of human characteristics, motivations, intentions, and emotions to non-human entities. People tend to anthropomorphize various things, from animals to the weather. We also increasingly anthropomorphize technology, particularly AI that appears to exhibit human intelligence. For example, people often become emotionally attached to chatbots or believe that a hallucinating chatbot is intentionally lying to them. They may refer to AI systems or robots with gender designations, attributing human traits they do not actually possess.
Artificial general intelligence (AGI)
Artificial general intelligence (AGI) refers to a hypothetical future AI system whose intelligence matches or surpasses human intelligence for all intellectual tasks. Many AI researchers believe that AI will likely reach this point and surpass human intelligence sometime this century due to the acceleration of machine learning. This scenario is often discussed in the context of alignment and AI safety: as AI becomes more intelligent than humans, it is essential that it continues to work towards our goals and does not endanger humans.
Artificial intelligence (AI)
Artificial intelligence (AI) is intelligence exhibited by machines, as opposed to the natural intelligence of humans and animals. An example is the ability to perceive, understand, and reason about the world in the same way humans do. In practice, AI is used for a wide range of applications. AI technology is used in natural language processing (NLP) chatbots, autonomous (self-driving) vehicles, search engines, and in game-playing systems like AlphaGo and Deep Blue.
Automation
Automation involves a process or system operating automatically and independently, without human input. Automation can reduce the amount of human labor in a process or make certain types of human labor completely redundant. Since the industrial revolution, increasingly more types of work have been automated. This has led to a massive increase in efficiency but also to the disruption of the professions involved, which can lead to unemployment. AI tools can be seen as a way to automate the writing process. They may lead to the disruption of certain areas, such as education and administrative work.
Autonomous
The adjective "autonomous" means "operating without external control". In the world of AI, it is often used to refer to systems, robots, software, etc., that can operate independently, without direct human input.
B
Bard
Bard is a chatbot developed by Google and released on March 21, 2023. Similar to ChatGPT and Bing Chat, this tool is based on LLM (Large Language Model) technology. Since its launch, Bard has received mixed reactions, with users claiming it is less powerful than ChatGPT for comparable prompts.
Bias
In the context of AI, bias refers to the assumptions an AI system makes to simplify the learning process and task execution. AI researchers strive to minimize bias because it can lead to poor results or unexpected outcomes. Bias can also refer to the tendency of AI systems to reproduce prejudices such as sexism, racism, or various types of research bias. This is a point of concern raised by AI safety researchers.
Big Data
Big data refers to datasets that are so large or complex that normal data processing software cannot handle them. This is key to the power of technologies like LLMs, whose training data consist of vast amounts of text.
Bing Chat
Bing Chat, also called Bing AI, is a chatbot developed by Microsoft and integrated into their search engine Bing since February 7, 2023. The chatbot was developed in collaboration with OpenAI using their GPT-4 technology. At its launch, Bing Chat caused some controversy due to its tendency to hallucinate and give inappropriate responses. Many in the field of AI safety criticized the poor alignment prior to its release. Others were excited about the possibility of integrating GPT technology into a search engine, as ChatGPT (at that time) could not search the internet.
Burstiness
Burstiness, or irregularity, is a measure of the variation in sentence construction and length. AI texts typically exhibit a low degree of burstiness, while human texts usually have a higher degree. An AI detector often combines burstiness and perplexity to determine if a text is AI-generated. A higher temperature setting in an LLM results in text with higher perplexity and burstiness.
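Burstiness has no single standardized formula; one rough proxy is the spread in sentence lengths. The sketch below (a crude illustration, not how any specific AI detector works) splits text on sentence-ending punctuation and reports the standard deviation of sentence length in words:

```python
import re
import statistics

def burstiness(text: str) -> float:
    # Split on sentence-ending punctuation (a crude heuristic for illustration)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    # Standard deviation of sentence length as a rough proxy for burstiness
    return statistics.pstdev(lengths)

uniform = "The cat sat. The dog ran. The bird flew."
varied = "Stop. The weather that morning was unusually cold for the season."
print(burstiness(uniform) < burstiness(varied))  # True
```

Text with uniform sentence lengths scores near zero, while text mixing short and long sentences scores higher, matching the intuition that human writing is "burstier" than typical AI output.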
C
CAPTCHA
CAPTCHA stands for "Completely Automated Public Turing test to tell Computers and Humans Apart". It is a challenge-response test used by websites to determine whether a user is human, for example by asking the user to type distorted letters or select all images containing a certain object. As the name suggests, CAPTCHAs are a practical application of the idea behind the Turing test, but in reverse: here, a machine tries to determine whether it is dealing with a human.
Chatbot
A chatbot is a software application that simulates human conversation, usually in the form of written chat. They are based on deep learning and NLP (Natural Language Processing) technology and can be used for tasks such as customer support and content writing. LLM-based chatbots have increased in popularity since the release of ChatGPT, Bing Chat, and Bard in recent months. Less advanced chatbots have existed for longer, with one of the first chatbots, ELIZA, released in the 1960s.
ChatGPT
ChatGPT is a chatbot developed by OpenAI and released on November 30, 2022. The tool is based on LLM technology and refined with RLHF (Reinforcement Learning from Human Feedback). ChatGPT has been a huge success and is reportedly the fastest-growing application of all time. The tool is seen as a major advancement in AI technology and has brought the topic of AI-writing to public attention. OpenAI has also received criticism from AI safety researchers, who argue that the tool is not well aligned and carries the risk of spreading misinformation. Some also claim that the company has not acted in line with its values, as OpenAI has not made the technical details of the models public.
Chinese Room
The Chinese Room is a philosophical thought experiment by John Searle. It asks the reader to imagine an AI system that behaves as if it understands Chinese. It passes the Turing test, convincing a Chinese speaker that they are conversing with a human. Searle then questions whether the AI truly understands Chinese or merely simulates the ability to understand it. He proposes a second scenario where he (Searle), not speaking Chinese, is in a closed room, receives input from a Chinese speaker outside the room, and produces coherent responses by following the AI's instructions. The Chinese speaker again thinks they are talking to a human. Searle argues that it would be unreasonable to say he “understands” Chinese just because he can facilitate conversation by following a program. He suggests that since the AI does precisely that (follows a program to produce answers), it also does not “understand” what it “says”. This argument prompts the reader to think about whether and how artificial intelligence differs from human intelligence and where “intelligence” and “understanding” originate.
D
DALL-E
DALL-E is an AI image generator released by OpenAI on January 5, 2021. A second version, DALL-E 2, began rolling out on July 20, 2022. The name is a portmanteau of the names of the Pixar character WALL-E and the artist Salvador Dalí. Both versions are powered by a version of GPT-3 adapted to generate images instead of text. They have the same technological building blocks as ChatGPT, despite their very different outputs. Like other AI image generators (e.g., Midjourney), there has been much discussion about how DALL-E could influence the future of art, with some opponents arguing that these tools plagiarize existing images and can be used to create deepfakes.
Deep Blue
Deep Blue was an AI system designed for chess. In 1997, it defeated world champion Garry Kasparov, becoming the first computer system to surpass human skill in chess. This was seen as a milestone in AI development, demonstrating that the system mastered the complex strategies of chess. Years later, a system named AlphaGo would achieve the same feat in the board game Go.
Deep Learning
Deep learning is a form of machine learning based on neural networks. These networks consist of layers that gradually transform raw data (like pixels) into a more abstract form (like an image of a face), allowing the model to understand them at a higher level.
Deepfake
A deepfake is a piece of media created by deep learning or other AI technology to misrepresent reality, for instance, by showing someone doing or saying something they did not do or say. Deepfakes can be images, videos, or audio. Currently, they can often be identified as fake, but they are becoming more convincing. Deepfakes are a significant concern for AI safety experts, as they can be used to spread disinformation and fake news, generate sexually explicit images of people without their consent, or facilitate fraud. Social media platforms, governments, and other institutions often establish rules against deepfakes, but enforcement is difficult if they cannot be reliably detected. Some AI detectors are designed to detect deepfakes, although they are reportedly not very reliable yet.
E
ELIZA
ELIZA, one of the first chatbots, was developed in the 1960s by MIT computer scientist Joseph Weizenbaum. ELIZA played the role of a psychotherapist, with the user acting as the patient. The programming was quite simple, and ELIZA often simply repeated the user's input in the form of a question. Yet, it appeared surprisingly realistic to users, who often became convinced that ELIZA could understand them. This is an example of anthropomorphism.
Emergent Behavior
Emergent behavior refers to unexpected skills or actions of AI systems, things they were not explicitly trained for but emerged from their training. Complex behaviors arise from the interaction of a large number of simple elements. For instance, LLMs operate at a basic level by guessing what token (word) comes next in a sentence. Despite this relatively simple mechanism, they are capable of giving coherent, intelligent-seeming responses to a wide range of questions. Other behaviors can be more negative, such as LLMs' tendency to hallucinate or give inappropriate responses. Much AI research relies on feeding big data into a model and seeing what happens, which involves much trial and error and unpredictable outcomes.
G
Generative AI
Generative AI refers to AI systems that generate text, images, video, audio, or other media in response to user prompts. This includes chatbots like ChatGPT, image generators like Midjourney, video generators, music generators, etc.
Generative Pre-trained Transformer (GPT)
A generative pre-trained transformer (GPT) is a type of large language model (LLM) used for generative AI applications. GPT models are neural networks based on a deep learning model called a transformer. GPT is currently the most popular type of LLM. The AI company OpenAI has released several GPT models: GPT-1 in 2018, GPT-2 in 2019, GPT-3 in 2020, GPT-3.5 in 2022, and GPT-4 in 2023. GPT-3.5 and GPT-4 are used to power ChatGPT, while DALL-E runs on GPT-3.
H
Hallucination
In the context of AI, hallucination refers to the phenomenon where an AI model confidently responds in a way not justified by its training data. This means the model is making up an answer at that moment. For example, ChatGPT may give a confident but entirely wrong answer to a question it does not know the answer to. Hallucination is an emergent behavior and has been observed in various LLM-powered chatbots. It appears to be a result of these models predicting likely answers and not being able to verify them against reliable sources. As they are encouraged to provide useful answers, they attempt to do so even when they do not have the requested information. Hallucination is a problem discussed by AI safety researchers, who fear that AI systems might inadvertently spread misinformation.
L
Large Language Model (LLM)
A large language model (LLM) is a neural network with a large number (usually billions) of parameters, using vast amounts of text as training data. It is currently a very popular approach to natural language processing (NLP) research. LLMs are trained for general purposes rather than specific tasks. They perform well due to the large amounts of data used in their training, exhibit various emergent behaviors, and can produce highly sophisticated responses. The most popular type of LLM at present is the generative pre-trained transformer (GPT), used for the popular chatbot ChatGPT.
M
Machine Learning (ML)
Machine learning (ML) is a field of study focusing on how machines can "learn" (i.e., perform increasingly advanced tasks). It involves using algorithms that make entirely autonomous decisions and predictions in interaction with training data. A currently popular ML approach is deep learning, involving neural networks with multiple layers.
Machine Translation
Machine translation is the use of software to translate text or speech between languages. Apps like Google Translate and DeepL typically use neural networks. At a basic level, this process may involve replacing a word in one language with its equivalent in another language. However, for better results, automatic translation systems need to consider the sentence structure of the two languages and the context in which each word appears to arrive at the most probable translation.
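The "basic level" mentioned above, replacing each word with its equivalent, can be sketched in a few lines. The tiny dictionary and function name below are purely illustrative; real systems like Google Translate and DeepL use neural networks that account for sentence structure and context:

```python
# A toy word-for-word "translator" using a tiny hypothetical dictionary
# (English to Dutch). Real machine translation systems use neural networks.
DICT = {"the": "de", "cat": "kat", "sleeps": "slaapt"}

def translate(sentence: str) -> str:
    # Replace each word with its dictionary equivalent, keeping unknowns as-is
    return " ".join(DICT.get(w, w) for w in sentence.lower().split())

print(translate("The cat sleeps"))  # de kat slaapt
```

The limitations are immediately visible: word order, inflection, and ambiguity are ignored, which is exactly why context-aware neural approaches produce far better translations.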
Midjourney
Midjourney is a generative AI system that produces images in response to user prompts. It was developed and released by the San Francisco-based company Midjourney, Inc. and is currently in open beta. Like DALL-E and other AI image generators, Midjourney has been the subject of both praise and criticism, with some claiming the app can be used to create deepfakes or plagiarizes the work of human artists, while others highlight the creative possibilities and accessibility of the tool.
N
Natural Language Processing (NLP)
Natural language processing (NLP) is the study of how AI systems can be used to process and generate human language. Currently, the focus is on neural networks and large language models (LLMs), which have great potential to generate fluent text in response to user prompts, as seen with ChatGPT.
Neural Network
A neural network is a computer system modeled on the structures of human and animal brains. They are called artificial neural networks (ANNs) to distinguish them from the biological neural networks that make up brains. A neural network consists of a collection of artificial "neurons" (based on neurons in biological brains) that send and receive signals to other neurons in the network. This structure allows the network to gradually learn to perform tasks more accurately. Neural networks are essential for large language models (LLMs) and other forms of deep learning.
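A single artificial "neuron" of the kind described above can be sketched in a few lines: it computes a weighted sum of its inputs plus a bias, then passes the result through an activation function. The numbers here are arbitrary illustrative values:

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron: a weighted sum of inputs passed through an activation."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 / (1 + math.exp(-total))  # sigmoid activation squashes output to (0, 1)

out = neuron([0.5, 0.3], [0.8, -0.2], 0.1)
print(0 < out < 1)  # True
```

A network chains many such neurons into layers; during training, the weights and biases are adjusted so that the network's outputs gradually become more accurate.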
O
OpenAI
OpenAI is a major AI research and development company. It has released popular tools, including ChatGPT and DALL-E, and developed the generative pre-trained transformer technology behind these tools. The company was founded in 2015 by Ilya Sutskever, Greg Brockman, Sam Altman, and Elon Musk (who is no longer involved). The current CEO is Sam Altman. OpenAI started as a non-profit but transitioned to for-profit in 2019.
P
Parameter
A parameter is a variable within an AI system used to make predictions. The more advanced the LLM, the more parameters the model contains. It is believed that GPT-4 contains several hundred billion parameters.
Paraphrasing Tool
A paraphrasing tool, also known as a paraphraser, is a type of AI writer that automatically rewrites a user's text in other words. You can use the Scribbr paraphrasing tool, powered by QuillBot, for free.
Perplexity
Perplexity is a measure of how unpredictable a text is. A text with high perplexity reads more unnaturally or nonsensically than a text with low perplexity. AI writers typically produce texts with relatively low perplexity because low-perplexity text is more likely to be coherent. Therefore, perplexity, in combination with burstiness, is used as a method for AI detectors to distinguish AI texts from human texts. AI writers can have a temperature setting that allows the user to produce text with higher perplexity.
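The standard way to compute perplexity is as the exponential of the average negative log probability a language model assigned to each token. This sketch assumes we already have those per-token probabilities; an actual AI detector would obtain them from a language model:

```python
import math

def perplexity(token_probs):
    """Perplexity from the probabilities a model assigned to each observed token."""
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)

# Tokens the model found likely (predictable text) yield low perplexity,
# while surprising tokens drive perplexity up.
print(perplexity([0.9, 0.8, 0.95]) < perplexity([0.1, 0.2, 0.05]))  # True
```

A perplexity of 1 would mean every token was predicted with certainty; the more the text "surprises" the model, the higher the score climbs.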
Plagiarism
Plagiarism is presenting someone else's words or ideas as your own. This happens, for example, when you copy text from another writer and present it as your work without citing the source. Currently, there is a heated debate about whether the tools themselves commit plagiarism based on their training data (for example, do AI image generators steal the work of artists?) and whether the uncredited use of AI writers in academic texts can be considered plagiarism. Educational institutions are still developing policies on the use of AI tools and possible citations. You can consult our current overview of AI guidelines by educational institution.
Programming
Programming is giving instructions to a computer in the form of programs that consist of computer code. This code can be written in various programming languages, such as C++ or Python. Computer programs often use algorithms.
Prompt
A prompt is the (textual) input from the user to which a generative AI model (such as ChatGPT) responds. The development of clear prompts that yield useful results and avoid undesirable outcomes is called prompt engineering.
R
Reinforcement Learning from Human Feedback (RLHF)
Reinforcement learning from human feedback (RLHF) is a method for machine learning (ML) used in the development of large language models (LLMs). It uses human feedback to create a "reward model" that then automatically optimizes the system's responses. It was used in the development of ChatGPT to refine the model's responses before it was released.
Robot
A robot is a machine that can perform actions automatically (sometimes autonomously). Robots usually contain computer systems programmed to perform their tasks. Research and development of robots are known as robotics. Robots are often used in an industrial context, such as in factories, to perform repetitive tasks (thus automating tasks that were previously done by humans).
T
Temperature
Temperature is a parameter in LLMs, such as GPT-4. Usually, the option is given to choose a value between 0 and 1, which determines the degree of randomness in the output. At a relatively low temperature, the model is likely to produce predictable, but coherent, text and give very similar answers if you repeat your question. At a higher temperature, the model tends to produce text that is less predictable and may contain more errors. At very high temperatures, an LLM often produces nonsense. For tools like ChatGPT, the temperature setting is usually set by default around 0.7: high enough to give unpredictable answers, but low enough not to generate nonsense.
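A common way temperature is implemented (not confirmed for any specific model here) is by dividing the model's raw output scores (logits) by the temperature before converting them into probabilities with a softmax. A minimal sketch with arbitrary example logits:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Scale logits by 1/temperature before softmax; higher T flattens the distribution."""
    scaled = [l / temperature for l in logits]
    exps = [math.exp(s) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
low = softmax_with_temperature(logits, 0.2)   # sharp: the top token dominates
high = softmax_with_temperature(logits, 2.0)  # flat: sampling becomes more random
print(max(low) > max(high))  # True
```

At low temperature, almost all probability mass lands on the highest-scoring token, so the model gives predictable, repeatable answers; at high temperature, the distribution flattens and sampling produces more varied (and eventually nonsensical) text.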
Token
A token is the basic unit of text processed by a large language model (LLM). Often it is a complete word, but it can also be a part of a word or a punctuation mark. LLMs produce text by repeatedly predicting which token should come next based on patterns they have learned from the training data. For example, given the input "The sky is", a model is very likely to predict a token such as "blue".
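Next-token prediction can be illustrated with a toy bigram model that counts which token follows each token in a tiny made-up corpus. The corpus and function name are purely illustrative; real LLMs use neural networks over learned subword tokens, not whitespace-split words:

```python
from collections import Counter

# Count which token follows each token in a tiny illustrative corpus
corpus = "the sky is blue the sky is clear the grass is green".split()
followers = {}
for current, nxt in zip(corpus, corpus[1:]):
    followers.setdefault(current, Counter())[nxt] += 1

def predict_next(token: str) -> str:
    # "Predict" by returning the most frequently observed next token
    return followers[token].most_common(1)[0][0]

print(predict_next("the"))  # sky ("the" is followed by "sky" twice, "grass" once)
```

An LLM does conceptually the same thing, predicting the most plausible next token, but with billions of parameters capturing far richer patterns than raw co-occurrence counts.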
Training Data
The training data of an AI system are the data (e.g., text, images) used to train the system. In machine learning (ML), the system is usually trained on a large amount of data to recognize patterns and learn to produce desired responses. Conversations with users in tools like ChatGPT can be used as training data for future models.
Turing Test
The Turing Test, also known as the imitation game, was invented by computer scientist Alan Turing in 1950. It is designed to test whether a machine is capable of exhibiting seemingly human intelligence.