ActiveFence Trust & Safety Glossary | Key Terms and Definitions

A

Accuracy

The degree to which automated or manual moderation tools make correct decisions. Sometimes measured as the inverse of the false-positive rate.

Adversarial Behavior

General term for unwanted on-platform behavior.

At Scale

Refers to the volume of violative content a platform faces, and to a moderation system's capacity and flexibility to handle that volume properly.

B

Ban evasion

The act of bypassing platform moderation actions or circumventing a platform ban, often involving the creation of at least one additional account.

By creating new accounts, threat actors can return to a platform whose policies they have violated, in an effort to continue to do harm.

Banning

Permanently removing or blocking a user from the platform.

Read more: Policy Enforcement

C

CP

A term used in predator communities to denote “child pornography”.

Child Safety

In Trust & Safety, refers to online risks to children, including exposure to harmful content, abuse, harassment, CSAM, and exploitation.

Child Sexual Abuse Material

Widely referred to as CSAM.

Images, text, or videos depicting the sexual abuse of minors (under 18 years old). For many platforms, this includes individuals who appear to be minors.

Clear Web

Websites that are publicly accessible to all audiences through standard browsers and search engines. Estimated to host only 5% of online content.

Combating the Financing of Terrorism

A set of laws and regulations that require financial institutions and financial technology firms to assist law enforcement in blocking terrorist entities from accessing funds.

Comment Moderation

The process of detecting, flagging, or removing harmful comments, or of denying abusive users the ability to post them.

Read more: Content Detection Tools

Community Moderation

Moderation activities that are conducted by community members, and not professional moderation teams.

Content Moderation

The internal process of screening user-generated content posted to online platforms, in order to determine whether or not it violates policy, and take appropriate action against violative content.

Content Policy Manager

Team member involved in the dynamic process of creating and maintaining the community guidelines of an online platform.

Cyberbullying

The use of online platforms to bully individuals. The term generally refers to the abuse of children.

D

Dark Web

The part of the online world that is only accessible via dedicated dark web browsers, which use “onion routing” technology to allow anonymous browsing. While the dark web isn’t necessarily illegal or illicit, the anonymity it provides allows illegal and illicit activities to take place.

Decision Tree

A tree-shaped model (or flow chart) of questions or decisions, where each outcome determines the next question or decision to be made, in order to come to an eventual outcome.

In Trust & Safety, a decision tree is often created to streamline and standardize the moderation decision process. Policy teams will create a decision tree for moderators to use when making a decision about a specific item o...
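
As a rough illustration of the idea, the sketch below models a moderation decision tree as nested questions. The questions, outcomes, and the `Question` and `decide` names are hypothetical and are not taken from any real policy.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Question:
    """One node in the tree: a yes/no question and where each answer leads."""
    text: str
    if_yes: Union["Question", str]  # next question, or a final outcome
    if_no: Union["Question", str]


# Hypothetical two-question policy tree, for illustration only.
TREE = Question(
    "Does the item contain a threat of violence?",
    if_yes=Question(
        "Is the threat directed at a specific person?",
        if_yes="remove and escalate",
        if_no="remove",
    ),
    if_no="allow",
)


def decide(node: Union[Question, str], answers: dict[str, bool]) -> str:
    """Walk the tree using the moderator's answers until an outcome is reached."""
    while isinstance(node, Question):
        node = node.if_yes if answers[node.text] else node.if_no
    return node
```

A moderator's answers are passed as a mapping from question text to yes/no, so every moderator who gives the same answers reaches the same outcome.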

Disinformation

Intentionally misleading information that is shared and broadly distributed (disseminated) with the purpose of misleading or deceiving an audience. Often used as propaganda, disinformation has been widely used to sow public mistrust, influence elections, and legitimize wars. This is generally an organized, orchestra...

Distributed Moderation

A type of content moderation where no individual person makes the moderation decision; rather, community members (individual users) vote to determine whether an item should be allowed on the platform or forum.
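
One way such a vote could be tallied is sketched below. The minimum vote count and the simple-majority threshold are assumptions chosen for illustration; platforms that use distributed moderation set their own rules.

```python
def community_decision(
    votes: list[bool],
    min_votes: int = 5,       # hypothetical quorum
    threshold: float = 0.5,   # hypothetical share of "allow" votes required
) -> str:
    """Return 'allow', 'remove', or 'pending' from community votes.

    Each vote is True (allow the item) or False (do not allow it).
    """
    if len(votes) < min_votes:
        return "pending"  # not enough votes to decide yet
    allow_share = sum(votes) / len(votes)
    return "allow" if allow_share > threshold else "remove"
```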

Duty of Care

In the UK’s draft Online Safety Bill, online platforms have a Duty of Care to assess risks to their users, put policies and procedures in place to minimize those risks, and take action to keep users safe.

E

Enforcement

The broad range of actions taken by Trust & Safety teams when content violates policy.

Read more: Policy Enforcement

Error Rate

The rate at which items or moderation events are incorrectly identified. This is the complement of accuracy: error rate equals 1 minus accuracy.
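
The relationship between the two metrics can be sketched as follows, assuming each moderation decision is recorded together with a ground-truth label. The function names and the sample data are hypothetical.

```python
def accuracy(decisions: list[tuple[bool, bool]]) -> float:
    """Share of decisions that match the ground-truth label.

    Each pair is (flagged_by_moderation, actually_violative).
    """
    if not decisions:
        raise ValueError("need at least one decision")
    correct = sum(1 for flagged, violative in decisions if flagged == violative)
    return correct / len(decisions)


def error_rate(decisions: list[tuple[bool, bool]]) -> float:
    """Share of decisions that were wrong: 1 minus accuracy."""
    return 1.0 - accuracy(decisions)


# Hypothetical review sample: 8 correct decisions and 2 incorrect ones
# (one false positive and one false negative).
sample = (
    [(True, True)] * 3
    + [(False, False)] * 5
    + [(True, False), (False, True)]
)
```

For this sample, accuracy is 8 out of 10 and the error rate is the remaining 2 out of 10.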

F

Fraud

Defined as criminal deception intended to result in financial or personal gain.

In Trust & Safety, this may also refer to deceiving a user into providing their personally identifiable information (PII) or unknowingly providing access to their devices or accounts.

G

Grooming

The act of preparing or manipulating a minor into sexual victimization. Generally involves prolonged online communication, which may begin as non-sexual, and gradually escalates into sexually suggestive communication, before eventually leading to sexually offensive activities which may include physical contact.

H

Harmful Content

Any text, image, audio, video, or other content posted online that is considered violative, malicious, deceptive, illegal, offensive, or slanderous.

Hash Database

In Trust & Safety, organizations such as NCMEC aggregate databases of image hashes that are related to various offenses (in this case, child safety). Platforms can then compare image hashes from their content to hashes of known malicious content (such as CSAM). This way, moderators do not have to view and analyze a potentially harmful piece of content and can compare its hash to that of recognized images.

Read more: Content Detection Tools

Hashing

Technology that creates a fixed-length string of letters and numbers that is, for practical purposes, unique to a piece of data (often an image). The hash is non-reversible, meaning an image can’t be recreated from its hash.
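
A minimal sketch of hashing and hash matching follows. It uses SHA-256 from Python's standard library as a stand-in, which is an assumption: the glossary does not name an algorithm, and a cryptographic hash like this one only matches byte-identical copies. The sample bytes and the `known_hashes` set are hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a 64-character hex string."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical stand-in for a shared database of hashes of known content.
known_hashes = {sha256_hex(b"example-known-violative-bytes")}


def is_known(data: bytes) -> bool:
    """Check content against the known-hash set without inspecting it."""
    return sha256_hex(data) in known_hashes


# The digest length is fixed regardless of the input size.
print(len(sha256_hex(b"")), len(sha256_hex(b"x" * 10_000)))
```

Because changing a single byte produces a different SHA-256 digest, exact-match hashing of this kind will not catch resized or re-encoded copies of an image.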

Read more: Content Detection Tools

Hate Speech

Any expression or online content that incites or justifies hatred, discriminates against, or promotes violence against an individual or group. Given distinct cultural and linguistic nuances, the detection of hate speech is often a complex task requiring regional and linguistic expertise.

Human Intelligence

The collection of intelligence by means of interpersonal contact.

In Trust & Safety, specialized teams use human intelligence to infiltrate threat actor communities and proactively identify their means and methods.

I

Impersonation

Apps or websites that are intentionally created to resemble existing apps or services or appear to be a part of the user interface in order to gain access to personal data, passwords, bank accounts, etc.

Impersonation Of Individuals

The creation of fake accounts, often using a target’s name or photo, in order to cause harm to that individual.

Intelligence

In Trust & Safety, intelligence is used to proactively alert platforms about impending risks, and to inform better moderation decisions.

Intelligence Desk

Internal or vendor teams that are responsible for on- and off-platform intelligence collecting in support of the Trust & Safety team’s efforts. Utilizing a broad range of tactics, including OSINT, WEBINT, and HUMINT, the team detects new threats and trends, identifies tactics, techniques, and procedures (TTPs), conducts investigations into suspicious account behaviors, “red teams” platform policies, and more.

K

Keyword Moderation

A form of content moderation that flags instances of specific, potentially violative keywords in text, audio, images, or videos posted on the platform. This type of moderation is limited in that it often lacks a contextual understanding of the keyword’s use, and the keyword list requires constant updating with new violative keywords.
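
The context limitation can be seen in a small text-only sketch. The keyword list and the `flag_keywords` function are hypothetical; a real list would be maintained and updated by policy teams.

```python
import re

# Hypothetical keyword list, for illustration only.
FLAGGED_KEYWORDS = {"scam", "counterfeit"}


def flag_keywords(text: str) -> set[str]:
    """Return the flagged keywords that appear as whole words in the text."""
    words = set(re.findall(r"[a-z0-9']+", text.lower()))
    return words & FLAGGED_KEYWORDS


# A likely violative post is flagged...
print(flag_keywords("Buy counterfeit watches here"))
# ...but so is a benign one, because the match ignores context.
print(flag_keywords("How do I report a scam to support?"))
```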

Know Your Customer

A component of a financial institution’s anti-money laundering policy, Know Your Customer (KYC) is a requirement for financial institutions and certain financial technology firms to verify the identity of a client to prevent illegal access to funds (for example, the funding of terrorist activities).

L

Loli

Child predator slang for an underage female with a childlike appearance, or a woman who is of age but looks or dresses like a minor.

M

Malicious Content

Content that is created or shared with malicious intent. Includes but is not limited to child sexual abuse material, nudity, profanity, sexual content, bullying, terrorist content, violence, and disinformation.

Misinformation

The unintentional creation or sharing of inaccurate or misleading information. This differs from disinformation in that misinformation is unintentional, while disinformation is the intentional distribution of misleading information.

Muting

Platforms can allow users to mute other users so that the muted users’ activity does not appear in their feeds.

Read more: Policy Enforcement

N

Non Consensual Intimate Imagery

Sexually explicit images that were either acquired unknowingly or unlawfully, or were taken consensually but shared or posted online without consent.

Non-consensual intimate imagery (NCII) also includes the sharing of intimate imagery beyond the scope of its intended use (i.e., leaking intimate images shared on one platform across other platforms).

O

Online Safety Bill (UK)

The UK’s Online Safety Bill is upcoming legislation that will require online platforms to take proactive action to keep users safe. The Bill outlines illegal content, and some legal but harmful content, that platforms will have to act against. The Bill’s current draft is in Parliament and is expected to pass by the end of 2022.

Online Sexual Harassment

Sexual harassment or misconduct that is conducted online. This is disproportionately aimed at women and/or members of the LGBTQIA+ community.

Open Source Intelligence

Also known as web intelligence (WEBINT). Intelligence that is collected via publicly available information online.

In Trust & Safety, open-source intelligence is used to gain a contextual understanding of harmful activities, enabling proactive content moderation.

P

Policy

The content policy of a user-generated content website defines what can and can’t be posted to that specific website. Also known as Content Policy or Community Guidelines.

Policy Analyst

Member of the policy team, responsible for analyzing and examining the effectiveness of a company’s content policies.

Policy Enforcement

The team responsible for enforcing the platform’s content policies and taking action against violative content.

Policy Specialist/Policy Manager

Individuals in the Trust & Safety team responsible for establishing the platform’s community guidelines and defining what is and isn’t allowed to be shared on the platform.

Involves collaboration with internal teams as well as external agencies such as law enforcement, regulators, and industry partners.

Proactive Moderation

A form of content moderation that aims to detect malicious or violative content before it is seen or reported by others. Utilizes various techniques, including automated detection and intelligence collecting to identify the violative content before it has a chance to harm user safety or platform integrity.

Proactive moderation was previously used only to detect illegal content such as CSAM and terrorist content. Pending legislation may require platforms to proactively moderate harmful content, not just illegal content.

See Harmful Content Detection

Proactive Removal Rate

A metric that indicates the rate at which action was taken on content or accounts prior to being posted or reported by other users.

Calculated as the number of proactively moderated items divided by the total number of moderated items.
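
The calculation above can be sketched as a small function. The function name and the example counts are hypothetical.

```python
def proactive_removal_rate(proactive_items: int, total_moderated_items: int) -> float:
    """Proactively moderated items divided by all moderated items."""
    if total_moderated_items <= 0:
        raise ValueError("total_moderated_items must be positive")
    if not 0 <= proactive_items <= total_moderated_items:
        raise ValueError("proactive_items must be between 0 and the total")
    return proactive_items / total_moderated_items


# Hypothetical month: 600 items moderated, 450 of them before any user report.
print(proactive_removal_rate(450, 600))  # 0.75
```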

R

Reactive Moderation

A moderation process that relies on a platform’s community or other individuals to identify and flag content that may be in breach of a platform’s policies. Due to its reliance on the community, violative content is often seen by multiple users before action is taken.

Recall

The measure of how much of a platform’s malicious content is picked up by its moderation systems.

Calculated as the number of correctly identified malicious items divided by the total number of malicious items on the platform. For example, if a platform has ten malicious pieces of content and an AI system identifies seven of them, that system has a 70% recall rate. For most automated detection mechanisms, recall and precision trade off against each other: raising one tends to lower the other.
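
The worked example above can be reproduced in a few lines. Precision is not defined in this glossary; the sketch assumes its standard definition (correctly flagged items divided by all flagged items), and the single false positive used below is a hypothetical figure.

```python
def recall(true_positives: int, false_negatives: int) -> float:
    """Correctly identified malicious items divided by all malicious items."""
    total_malicious = true_positives + false_negatives
    if total_malicious == 0:
        raise ValueError("no malicious items to measure against")
    return true_positives / total_malicious


def precision(true_positives: int, false_positives: int) -> float:
    """Correctly flagged items divided by all flagged items (standard definition)."""
    total_flagged = true_positives + false_positives
    if total_flagged == 0:
        raise ValueError("nothing was flagged")
    return true_positives / total_flagged


# Ten malicious items on the platform; seven identified, three missed.
print(recall(true_positives=7, false_negatives=3))  # 0.7

# Hypothetical: the same system also flagged one benign item by mistake.
print(precision(true_positives=7, false_positives=1))  # 0.875
```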

Reverse Engineering

A method or process where attempts are made to replicate a system, process, machine, device, or software. Used in cybersecurity and red teams to find malicious code in software, websites, and apps.

S

Suspension

An enforcement mechanism that involves the temporary, time-limited banning of an account.

Read more: Policy Enforcement

T

Tactics, Techniques, and Procedures (TTPs)

Generally provided by intelligence providers or internal intelligence desks, TTPs are a description of the techniques used by bad actors to conduct harm. In Trust & Safety, an understanding of bad actor TTPs can provide teams with the insights needed to proactively detect and stop on-platform harm.

Learn more: Trust & Safety Intelligence

Thinspo

Prevalent in eating disorder communities, the term refers to images or other content that encourages extreme dieting behaviors. Also known as “thinspiration”, “bonespo”, or “fitspo”.

Learn more: Eating Disorder Communities

True Positive

Content that is correctly flagged as violative.

Trust & Safety

Teams that are focused on the development, management, and enforcement of a platform’s policies to ensure that only acceptable content is posted online and that users are safe.

Trusted Flagger

An individual or vendor who is considered by the platform to be an expert in their field. Content flagged by this entity is therefore given special notice by moderation teams.

Read more: Content Detection Tools
