Generative Artificial Intelligence: Glossary of AI Related Terms

USask Library Guide

AI Glossary

A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z

A

AI Literacy: The ability to understand, evaluate, and effectively use artificial intelligence tools and technologies. AI literacy encompasses knowledge of AI concepts, algorithms, data privacy, ethics, and potential implications of AI on society. It empowers individuals to assess AI applications critically, make informed decisions, and navigate an increasingly AI-driven world.

Algorithm: An algorithm is a set of rules or instructions designed to solve a specific problem or perform a specific task in a finite number of steps. Algorithms can be implemented as computer programs or mathematical formulas to process data, make decisions, or automate complex tasks. In the context of artificial intelligence, algorithms play a crucial role in training machine learning models, such as neural networks, to learn patterns from data and generate accurate outputs.

Algorithmic Bias: Bias that arises when the underlying algorithms and techniques used to develop AI models introduce or amplify existing biases. This can happen when developers fail to consider diverse perspectives, cultural contexts, or ethical considerations.

Algorithmic Output: The results or products generated by an algorithm, typically in the form of data, predictions, decisions, or creative content. In the context of GenAI, algorithmic output refers to the synthetic data, images, text, audio, or other creative works produced by AI models designed to learn and generate new, unique content based on input data. Properly citing GenAI algorithmic output involves acknowledging the AI system or tool used, its developers, and the input data sources while considering any intellectual property or ethical implications.

Analytic AI Models: Also known as "descriptive" or "diagnostic" artificial intelligence (AI) models, analytic AI models focus on analyzing data to gain insights into past and current events or situations. They help users understand trends, patterns, and relationships within datasets. Once these models uncover insights (such as which products are selling the most or which medical treatments are most effective), users can apply them to make informed decisions and optimize processes.

Artificial Intelligence (AI): A branch of computer science focused on creating intelligent machines that can learn, adapt, and perform tasks typically requiring human intelligence. Applications include machine learning, natural language processing, and robotics.

B

Bias: In the context of artificial intelligence, bias can occur when training data or algorithms reflect or perpetuate existing social inequalities, leading to discriminatory or unfair outputs. Reducing bias in AI systems is essential to promote fairness, inclusivity, and responsible innovation in the field. Mitigating bias is the responsibility of human trainers and involves:

Diversifying training data to represent various demographics and contexts.
Regularly auditing algorithms for potential biases.
Ensuring diverse perspectives in AI development teams.
Implementing fairness metrics and tools to assess and minimize bias.

C

Chatbot: A chatbot is an artificial intelligence (AI)-powered software application designed to simulate human-like conversations with users via text or speech interfaces. Chatbots employ natural language processing (NLP) and machine learning algorithms to understand, interpret, and respond to user queries and requests. They can be integrated into messaging platforms, websites, and mobile apps to provide real-time assistance, answer frequently asked questions, and help users navigate various services or information. Examples of popular chatbots include Siri (Apple), Alexa (Amazon), and Google Assistant.

ChatGPT: ChatGPT is an artificial intelligence (AI) chatbot developed by OpenAI and built on its GPT (Generative Pre-trained Transformer) large language models (LLMs). It uses deep learning algorithms and a large-scale neural network to generate human-like text responses based on user inputs. ChatGPT can engage in natural language conversations with users, answer questions, provide information, and generate creative content such as stories, poems, or articles. It adapts its responses to the context of the conversation, making it a useful tool for various applications such as customer support, content creation, and language translation.

Citation (see also: Documentation): Citation allows authors and other creators to credit sources of ideas, data, and information in a systematic, consistent way. Citation styles vary (see, for example, MLA, CMOS, APA, Vancouver or IEEE styles), and some instructors and publishers will create variations on common styles. Proper citation is crucial to avoiding plagiarism and enables readers to locate and verify referenced material. See the Citing Generative AI section of this guide.

Copyright: Refers to the legal and exclusive rights granted to creators over their original works, including written content, images, audio, and other forms of creative expression. See also: USask Copyright Office and its Canadian Copyright and Indigenous Copyright Law resource.

Creative Commons License: A Creative Commons license is a public copyright license that enables creators to grant permission for others to use their work under specified conditions. These licenses allow creators to retain certain rights while giving others the freedom to copy, distribute, and build upon their work, often with attribution. There are several types of Creative Commons licenses, each with different combinations of permissions and restrictions to suit various sharing and usage preferences.

D

Data Privacy: Data privacy refers to the protection and proper handling of personal information in digital systems. It encompasses practices and policies that ensure sensitive data is collected, processed, stored, and shared securely and ethically, with the consent of individuals. In the context of generative AI, data privacy involves safeguarding the input data used to train models and protecting users' privacy when interacting with AI systems.

Dataset: A collection of organized, related data used for training, validating, or testing machine learning models. Datasets contain relevant information represented as tables, lists, or matrices and often include labelled examples to supervise the learning process. High-quality, diverse, and representative datasets are essential for developing accurate and unbiased AI models (see: Bias).

Datafication: Refers to the process of transforming various aspects of life (such as human activities, behaviours, and attributes) into quantifiable digital data that can be collected, stored, analyzed, and utilized for various purposes. Driven by the widespread use of digital technologies, sensors, and connected devices, datafication typically involves converting information into a digital format that can be processed and analyzed, raising important questions about privacy, ethics, and security. Appropriate legislation plays a crucial role in establishing safeguards and guidelines to ensure responsible data collection, usage, and protection. See this guide's Ethical Considerations: Privacy.

Deep Learning: A subfield of machine learning where neural networks with multiple layers (deep networks) learn from large amounts of data to perform complex tasks such as image classification, speech recognition, and natural language processing. Deep learning models can automatically discover and learn hierarchical representations from raw input data, improving accuracy and efficiency in various AI applications.

Deepfakes: Deepfakes are highly realistic videos or audio manipulated using artificial intelligence to create fake scenes or events, such as a video of a politician making statements they never actually made.

Detectors: AI detectors are tools or systems designed to identify content generated by artificial intelligence, mainly text produced by large language models. These detectors use various techniques, including statistical analysis and machine learning, to distinguish between human-written and AI-generated content. While AI detectors can be helpful in specific contexts, it's important to note that their accuracy is not perfect, and they may produce false positives or negatives.

Documentation: In the context of generative AI (GenAI) as a writing tool, documentation involves keeping track of the GenAI software and version(s) used; saving and organizing conversation labels (e.g., "climate change"), prompts (e.g., "How does climate change affect biodiversity?"), and outputs for easy reference; and recording usage details (including day, month, and year). It may also include annotating outputs with notes and reflections. Investigate whether your GenAI tool has a function for shareable links (e.g., ChatGPT's). See also: Citation.

E

Ethics: In the context of artificial intelligence, ethics involves examining the potential impacts, risks, and benefits of AI technologies, ensuring responsible development and uses. Key ethical considerations in AI include fairness, accountability, transparency, privacy, safety, and avoiding harmful biases.

Ethical implications and considerations: In the context of education, ethical implications relate to the responsible integration of AI technologies and data-driven practices in teaching, learning, and research. Key considerations include maintaining academic integrity, ensuring fairness and transparency, protecting data privacy, and promoting responsible data management. As universities increasingly adopt these digital tools, addressing these ethical implications is essential to ensure an environment of trust and safeguard the well-being of students, faculty, and the academic community.

F

Fabricated content: False information created from scratch, such as fictional news articles or made-up statistics.

Fairness: In the context of artificial intelligence, fairness refers to the equitable treatment of individuals or groups without discrimination or bias. Fairness involves ensuring that AI systems do not systematically disadvantage specific demographics or perpetuate existing inequalities. Achieving fairness in AI is crucial to advancing social justice, inclusivity, and public trust in the technology.

G

Generative Adversarial Network (GAN): A GAN is a type of artificial intelligence framework involving two neural networks, typically referred to as the generator and the discriminator, which are trained simultaneously through a process of competition. The generator network creates new data that resemble the training data. It starts with random noise and transforms it into data, such as images, that mimic the real dataset. The discriminator evaluates the data it receives, distinguishing between real data (from the training set) and fake data (produced by the generator). In other words, it acts like a judge that decides whether a given instance is authentic or fake.

Generative Artificial Intelligence (GenAI): Unlike predictive and analytical models of artificial intelligence (AI), generative AI (GenAI) models focus on creating new data samples that resemble the original input data. These models learn patterns and features from the input data and generate unique, realistic outputs such as text, images, audio, or video. Generative AI models can be used in various ways, including language translation, text summarization, image synthesis, music composition, and the creation of realistic human images.

Genetic resources (GR): The Convention on Biological Diversity defined genetic resources (GRs) as "parts of biological materials that contain genetic information of value . . . [a]nd are capable of reproducing or being reproduced" (World Intellectual Property Organization, 2020, p. 18). Plant materials, medicinal plants, and animal breeds are a few examples of GRs. When it comes to artificial intelligence and genetic resources, care must be taken to preserve data sovereignty, include Indigenous knowledge holders, and respect traditional knowledge (TK) and cultural rights (Kanjara, 2024).

Guideline: A guideline is a recommendation that directs actions within particular contexts. In the context of Generative AI, guidelines provide actionable steps and standards that ensure individual and unit actions align with the university's principles. See USask's Artificial Intelligence Provisional Principles and Guidelines (2024).

H

Hallucination: In the context of generative artificial intelligence (GenAI), "hallucination" refers to the generation of outputs that are not supported by the input data or that deviate significantly from expected behaviour. This can occur due to overfitting, biases in the training data, or other issues that lead to erroneous predictions. In generative models such as generative adversarial networks (GANs) or large language models (LLMs), hallucinations can result in the creation of unrealistic or nonsensical content, such as synthesizing images of nonexistent objects or generating text with incorrect information.

I

Indigenous cultural sovereignty: This has been defined as the right of Indigenous Nations and peoples to "exercise their own norms and values instructing their collective futures . . . [it] is inherent in every sense of that word, and it is up to [them] to define, assert, protect, and insist upon that right" (Coffey & Tsosie, p. 196), and it is "generated from within tribal societies and [which] carries a cultural meaning consistent with those traditions [and] practices" (p. 197).

Indigenous data sovereignty: This can be defined as the "right of Indigenous peoples to control data from and about their communities and lands, articulating both individual and collective rights to data access and to privacy" (Raine et al., 2019, p. 300).

Input Data (see also: outputs): The information fed into an artificial intelligence system for processing or analysis. In generative AI, this includes the large datasets used to train models and the prompts, questions, or instructions provided by users during interactions. The quality and diversity of input data significantly influence the AI's performance and outputs.

Intellectual property (IP): See the USask Copyright Office's definition and examples of intellectual property (IP).

Intellectual property (IP) – Indigenous: Intellectual property (IP) is defined by the World Intellectual Property Organization as the "creations of the mind, such as inventions, designs, literary and artistic works, performances, plant varieties, and names, signs and symbols" (p. 10). Indigenous legal scholar James (Sakej) Youngblood Henderson points out that Intellectual Property (IP) law is Eurocentric in that it "includes ownership and commercial privileges . . . [that] arise from the long, dark chronicle of the Eurocentric fictions of terra nullius (no one's territories), gnaritas nullius (no one's knowledge), and lex nullius (no laws) and the demesne (public domain) being applied to Indigenous peoples" (2021, p. 94). See also: the USask Copyright Office resource on Canadian Copyright and Indigenous Copyright Law.

Intelligent Tutoring System (ITS): ITSs are artificial intelligence-powered educational platforms that provide learners with personalized instruction, feedback, and guidance. They use machine learning algorithms and domain knowledge to adapt content and teaching strategies based on individual student needs. They can offer real-time assessments, identify learning gaps, and provide tailored interventions intended to improve educational outcomes, supplementing traditional teaching methods.

J

Under Construction

K

Under Construction

L

Large Language Model (LLM): A model of artificial intelligence that uses deep learning algorithms to process and generate human-like language. These models are characterized by their size (number of parameters) and their capacity to handle complex language tasks. They are trained on large datasets of text and can perform a wide range of language tasks such as translation, summarization, and text generation. Common examples of Large Language Models include the GPT series (GPT-4, GPT-3, etc.), developed by OpenAI; BERT, developed by Google; Claude, developed by Anthropic; and LLaMA, developed by Meta.

Learning Loss: Refers to missed learning opportunities or a decline in learning and skill development when generative AI technologies are misused or overused. Relying too heavily on AI-generated content can hinder the development of crucial critical thinking and problem-solving skills. GenAI should be used as a complementary aid rather than a replacement, with careful consideration given to its impact on active engagement in the learning process.

M

Machine Learning (ML): A subset of artificial intelligence where computer systems can learn from data, recognize patterns, and make predictions or decisions without being explicitly programmed. Machine learning algorithms improve as they process more data, enabling applications like image recognition, speech translation, and recommendation systems.

Manipulated content: Genuine information that has been altered in a way that distorts its original meaning or context, such as edited photos or selectively edited videos.

Misinformation: According to the Wikipedia article, misinformation is incorrect or misleading information that is often spread unintentionally. It differs from disinformation, which is deliberately deceptive and intentionally spread. In other words, misinformation is something you believe to be true, but in reality, it is not, whereas disinformation is something you know to be untrue, but you share it anyway.

Misleading content: Information that is technically true but presented in a way that misleads the reader, such as headlines that do not accurately reflect the content of an article.

N

Natural Language: Natural language refers to the languages that people speak and write, such as English or French, as opposed to programming languages. Artificial intelligence systems are designed to understand, interpret, and generate natural language by processing and analyzing spoken or written text, enabling AI to perform tasks such as translation, sentiment analysis, and conversational interactions similar to human communication.

Neural Network: A computational model inspired by the structure and functioning of the human brain, consisting of interconnected nodes, or "neurons," organized in layers. Neural networks can learn from data, recognize patterns, and make predictions or decisions by adjusting the strength of connections between neurons, enabling deep learning applications such as image recognition and natural language processing.
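To make "adjusting the strength of connections" concrete, here is a minimal sketch of a single artificial neuron, assuming a toy task (the logical AND of two inputs) and the classic perceptron learning rule. The data, variable names, and number of training passes are illustrative assumptions; real neural networks use many neurons, layers, and more sophisticated training methods.

```python
# Hypothetical training data: the logical AND of two inputs.
examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights = [0, 0]  # connection strengths, one per input
bias = 0          # offset added to the weighted sum


def neuron(inputs):
    """Fire (return 1) when the weighted sum of the inputs is positive."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


# Perceptron rule: after each mistake, nudge the weights toward the right answer.
for _ in range(20):
    for inputs, target in examples:
        error = target - neuron(inputs)
        for i, x in enumerate(inputs):
            weights[i] += error * x
        bias += error

predictions = [neuron(inputs) for inputs, _ in examples]
print(predictions)  # [0, 0, 0, 1]
```

The neuron starts out knowing nothing (all weights are zero) and ends up reproducing the AND pattern purely by correcting its own mistakes on the examples.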

O

Outputs (see also: Input Data): The results or responses produced by an artificial intelligence system after processing input data. In generative AI, outputs can take various forms, such as text, images, code, or other content created by the AI model in response to user prompts or queries. The nature and quality of outputs depend on the AI's training, the input provided, and the specific task or request given to the system.

Overfitting: A phenomenon in machine learning where an AI model learns to fit the training data too closely, capturing noise or irrelevant patterns, leading to poor generalization on new data.
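As a rough illustration, the sketch below compares two toy models on made-up data in which the true pattern is "output = 2 × input" but the training examples contain noise. The dataset and both models (`memorizer` and `simple_line`) are hypothetical and chosen only to make the effect visible: the memorizing model is perfect on the training data yet does worse on new data than the simpler model.

```python
def mean_squared_error(model, points):
    """Average squared difference between predictions and true outputs."""
    return sum((model(x) - y) ** 2 for x, y in points) / len(points)


# Hypothetical noisy training data (true pattern: y = 2x, noise of +3 or -3).
train = [(1, 5), (2, 1), (3, 9), (4, 5), (5, 13), (6, 9)]
# New, noise-free data the models have never seen.
test = [(1.2, 2.4), (2.2, 4.4), (3.2, 6.4), (4.2, 8.4), (5.2, 10.4)]


def memorizer(x):
    """Overfit model: repeat the output of the closest training example."""
    nearest = min(train, key=lambda point: abs(point[0] - x))
    return nearest[1]


# Simple model: one straight line through the origin, fitted by least squares.
weight = sum(x * y for x, y in train) / sum(x * x for x, _ in train)


def simple_line(x):
    return weight * x


for name, model in [("memorizer", memorizer), ("simple line", simple_line)]:
    print(name,
          "train error:", round(mean_squared_error(model, train), 2),
          "test error:", round(mean_squared_error(model, test), 2))
```

The memorizer has learned the noise along with the pattern, which is why its error jumps on the new data.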

P

Patchwork Plagiarism: Patchwork plagiarism, also known as mosaic plagiarism, occurs when writers combine material from multiple sources, moving around phrases and ideas without properly acknowledging the original author(s). This form of plagiarism can involve copying information from various writers and rearranging it without proper citation; using phrases, sentences, or ideas from different sources and weaving them together with one's own writing; or paraphrasing or slightly modifying content while maintaining much of the original structure and wording.

Plagiarism (see also: Citation and Documentation): Refers to the presentation of generative AI (GenAI) outputs as one's original work without proper attribution or acknowledgment. This may include submitting outputs as part of an academic assignment, research paper, or creative work without disclosing their AI-generated origin. Although AI-generated text might not be copied directly from an existing source, the ethical implications and consequences of using it without transparency are the same as for traditional forms of plagiarism.

Predictive AI Models: These artificial intelligence (AI) models are designed to analyze historical and current data to make accurate predictions. By identifying patterns and trends, predictive AI models can forecast events, behaviours, or results in various fields such as finance, marketing, and healthcare. In other words, predictive models forecast what might happen based on past events.

Principle: A principle is a fundamental belief or assumption that underlies and guides behaviour or reasoning. Principles shape decision-making and institutional policy by reflecting core values and setting benchmarks for educational practice, research integrity, and administrative operations, fostering excellence, innovation, and inclusivity. In the context of GenAI, principles are essential for shaping the ethical use and integration of AI technologies in education and research. See USask's Artificial Intelligence Provisional Principles and Guidelines (2024).

Privacy: In the context of artificial intelligence, "privacy" refers to the handling and protection of sensitive data used to train AI models, ensuring it remains secure and confidential.

Prompt Engineering: The systematic process of designing clear, contextually relevant, and actionable prompts or inputs for generative AI (GenAI) models. These prompts serve as cues or instructions that guide the GenAI models' behaviours, influencing the generation of outputs or responses. Formulating clear, effective, and unbiased prompts is crucial to obtaining the best possible outputs.

Public Domain (PD): Public Domain (PD) is a term broadly defined as "elements of IP [intellectual property] that are ineligible for private ownership and the contents of which any member of the public is legally entitled to use." It means something other than "publicly available": information found on the internet, for example, can be "publicly available but not in the public domain from an IP perspective" (World Intellectual Property Organization, 2020, p. 10). When it comes to Indigenous traditional knowledge (TK) and traditional cultural expressions (TCEs), however, legal scholar James (Sakej) Youngblood Henderson notes, "Indigenous peoples, local communities and many countries reject a 'public domain' status . . . , [arguing] that this opens them up to unwanted misappropriation and misuse" (p. 10).

Q

Under Construction

R

Reinforcement Learning: A type of machine learning where an agent, often a software program or algorithm, learns to make decisions by interacting with an environment and receiving feedback or rewards for its actions.
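The loop of acting, receiving a reward, and updating can be sketched in a few lines. This is a minimal illustration only, assuming a made-up environment with two possible actions and fixed payoffs, and a simple "epsilon-greedy" strategy in which the agent usually picks the action it currently rates highest but occasionally tries one at random. All names and numbers here are hypothetical.

```python
import random

rng = random.Random(0)   # fixed seed so the run is repeatable
rewards = [0.2, 1.0]     # hypothetical environment: payoff of each action
estimates = [0.0, 0.0]   # the agent's learned value of each action
counts = [0, 0]          # how many times each action has been tried
epsilon = 0.1            # fraction of steps spent exploring

for _ in range(500):
    if rng.random() < epsilon:
        action = rng.randrange(2)                 # explore: try a random action
    else:
        action = estimates.index(max(estimates))  # exploit: use the best so far
    reward = rewards[action]                      # feedback from the environment
    counts[action] += 1
    # Move the estimate toward the observed reward (running average).
    estimates[action] += (reward - estimates[action]) / counts[action]

best_action = estimates.index(max(estimates))
print("preferred action:", best_action, "estimates:", estimates)
```

The agent is never told which action is better; it discovers this from the rewards it receives while interacting with the environment.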

S

Self-Determination: Indigenous groups can define "self-determination" in several different ways. The United Nations Declaration on the Rights of Indigenous Peoples (UNDRIP), Article 3, states that "Indigenous peoples have the right of self-determination. By virtue of that right, they freely determine their political status and freely pursue their economic, social, and cultural development" (UNDRIP, 2007). UNDRIP Article 4 states that "Indigenous peoples, in exercising their right to self-determination, have the right to autonomy or self-government in matters relating to their internal and local affairs, as well as ways and means for financing their autonomous functions."

Societal Bias: Refers to real-world, broad societal and cultural prejudices that can be reflected and perpetuated in artificial intelligence (AI) systems. For instance, societal biases surrounding gender roles or racial stereotypes can be unwittingly incorporated into AI applications, leading to discriminatory outputs.

Supervised Learning: Supervised learning is a machine learning technique where the model is trained on labelled data, with input-output pairs, to learn the mapping between inputs and outputs.
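A minimal sketch of this idea, assuming a tiny made-up dataset of hours studied (input) paired with quiz scores (output), is shown below. The function names and data are hypothetical, and fitting a straight line by least squares is just one simple way to learn an input-output mapping.

```python
def fit_line(pairs):
    """Learn a slope and intercept from labelled (input, output) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    variance = sum((x - mean_x) ** 2 for x, _ in pairs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical labelled data: hours studied (input) -> quiz score (output).
labelled_pairs = [(1, 52), (2, 54), (3, 56), (4, 58)]
slope, intercept = fit_line(labelled_pairs)


def predict(hours):
    """Apply the learned mapping to an input the model has not seen."""
    return slope * hours + intercept


print(predict(5))  # 60.0
```

The labels (the quiz scores) are what make this "supervised": the model is shown the correct output for every training input and learns a rule that reproduces them.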

T

Traditional cultural expressions (TCE): TCEs are "intangible and tangible cultural manifestations of [Traditional Knowledge] . . . passed down generations, such as oral storytelling, folk dances, symbols, artwork, hunting, and medicine" (Guilbault et al., 2022, p. 7). TCEs "facilitate the social and cultural identity formation of Indigenous people[s]" (Guilbault et al., 2022, p. 7). Furthermore, TCEs are not static; rather, they are "constantly evolving, developing and being recreated" (World Intellectual Property Organization, 2020, p. 17).

Traditional knowledge (TK): Widely understood as "the ever-evolving body of intellectual knowledge cultivated by [Indigenous Peoples]," Traditional Knowledge (TK) often involves the cultural skills, know-how, and other innovations developed over generations (World Intellectual Property Organization, p. 10). Note that traditional cultural expressions (TCEs) and genetic resources (GRs) are related to TK (Guilbault et al., p. 7).

Training Data: The dataset used to develop, teach, and refine an artificial intelligence (AI) model or algorithm. Training data is a crucial component of machine learning, as it enables AI systems to learn from examples and recognize patterns, ultimately improving their performance and accuracy. The quality, diversity, and quantity of training data significantly impact the effectiveness of AI models. In the context of Generative AI (GenAI), training data typically consists of vast collections of images, text, audio, or other forms of media that the AI uses to generate new, synthetic content. Proper citation and usage of training data are essential to respect intellectual property rights and maintain ethical practices in AI development.

Transparency: Using Generative AI transparently means that you make clear what parts of your work are human-generated and what parts are AI-assisted or AI-generated. It also refers to being open and clear about the data sources, algorithms, and processes involved in training and operating AI models. Transparency is crucial for understanding the capabilities, limitations, and potential biases of GenAI systems. It helps users make informed decisions about the reliability and trustworthiness of the AI-generated content and enables them to identify and address any ethical or legal concerns related to the technology's application. See also in this guide: Citing GenAI.

U

Underfitting: A phenomenon in machine learning where a model is too simple to capture the underlying structure of the data, resulting in poor performance on both training data and new data.

V

Validation: The process of evaluating the performance of a machine learning model on unseen data to ensure that the model generalizes well to new, independent datasets and to prevent overfitting by assessing its accuracy, precision, recall, and other relevant metrics.
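The metrics named above can be computed by hand on a small held-out set. The sketch below assumes a hypothetical, already-trained classifier that outputs a score between 0 and 1 and labels an example positive when the score is at least 0.5; the scores, labels, and threshold are all made up for illustration.

```python
# Hypothetical held-out examples: (model score, true label).
# None of these were used to train the model.
validation_set = [(0.9, 1), (0.8, 1), (0.7, 0), (0.6, 1),
                  (0.4, 1), (0.35, 1), (0.2, 0), (0.1, 0)]


def classify(score, threshold=0.5):
    """Turn a score into a yes/no prediction."""
    return 1 if score >= threshold else 0


tp = fp = fn = tn = 0
for score, label in validation_set:
    predicted = classify(score)
    if predicted == 1 and label == 1:
        tp += 1  # true positive
    elif predicted == 1 and label == 0:
        fp += 1  # false positive
    elif predicted == 0 and label == 1:
        fn += 1  # false negative
    else:
        tn += 1  # true negative

accuracy = (tp + tn) / len(validation_set)  # share of all predictions that are right
precision = tp / (tp + fp)                  # share of predicted positives that are right
recall = tp / (tp + fn)                     # share of actual positives that were found

print(accuracy, precision, recall)
```

Here the three metrics differ (0.625, 0.75, and 0.6), which is why validation usually reports more than one of them.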

W

Under Construction

X

Under Construction

Y

Under Construction

Z

Under Construction
