Manage and orchestrate the entire Trust & Safety operation in one place - no coding required.
Take fast action on abuse. Our AI models contextually detect 14+ abuse areas - with unparalleled accuracy.
Watch our on-demand demo and see how ActiveOS and ActiveScore power Trust & Safety at scale.
The threat landscape is dynamic. Harness an intelligence-based approach to tackle the evolving risks to users on the web.
Don't wait for users to see abuse. Proactively detect it.
Prevent high-risk actors from striking again.
For a deep understanding of abuse
To catch the risks as they emerge
Disrupt the economy of abuse.
Mimic the bad actors - to stop them.
Online abuse has countless forms. Understand the types of risks Trust & Safety teams must keep users safe from on-platform.
Protect your most vulnerable users with a comprehensive set of child safety tools and services.
Stop online toxic & malicious activity in real time to keep your video streams and users safe from harm.
The world expects responsible use of AI. Implement adequate safeguards to your foundation model or AI application.
Implement the right AI-guardrails for your unique business needs, mitigate safety, privacy and security risks and stay in control of your data.
Our out-of-the-box solutions support platform transparency and compliance.
Keep up with T&S laws, from the Online Safety Bill to the Online Safety Act.
Over 70 elections will take place in 2024: don't let your platform be abused to harm election integrity.
Protect your brand integrity before the damage is done.
From privacy risks, to credential theft and malware, the cyber threats to users are continuously evolving.
Here's what you need to know.
Find out more about how to reduce the risks of GenAI with our red-teaming report
Society is rapidly embracing generative AI technology, but our understanding of its implications lags way behind. While we have felt the impact of social media on our democracy through the spread of misinformation, platforms have only recently started tackling this issue.
Unfortunately, our resilience to online misinformation is still in its infancy, and generative AI (GenAI) tools are poised to escalate the problem by accelerating the spread of false information.
This is a bleak prospect, especially in a year with significant global events—like crucial elections and the 2024 Summer Olympics.
GenAI tools, like large language models (LLMs), are trained on vast amounts of human language data and excel at processing and analyzing information quickly.
But LLMs have trouble fully understanding the meaning, context, and risks of online conversations. Specific terminologies or euphemisms may not be detected as malicious. In other cases, benign content in one context can gain a different, harmful meaning in another.
That’s a lot of nuance for a machine—that cannot capture the entirety of human language or keep up with current events—to unravel. Even with real-time internet access, AI struggles to discern truth from fiction.
Thus far, these tools only reflect and amplify human flaws. They can also make errors at alarming rates, posing a significant threat to the fight against disinformation. For example, models may trust a user’s knowledge on a specific topic, even if it is incorrect, and, based on that assumption, comply with requests to generate false content. AI’s compliance with user requests becomes doubly problematic when harmful requests go undetected.
Traditionally, disinformation actors had to create and test false narratives to see which gained traction. Now, GenAI automates this process, spreading misinformation faster and making it more convincing. To combat this, moderation efforts for AI models must be proactive and adapt quickly to our ever-changing world.
Our ongoing research on various LLMs suggests they can reproduce and spread misinformation narratives from various regions and political systems.
Recent reports have uncovered influence operations using AI tools to spread misinformation in multiple languages, creating full-length articles, social media posts, and replies to increase fake engagement. Safety measures within models are weaker for non-English languages and less-covered topics due to fewer resources dedicated to these issues.
ActiveFence’s testing before the 2024 global elections found that industry-leading GenAI tools supported common misinformation narratives. In the US, these tools spread conspiracies about Nikki Haley’s birthplace and suggested Taylor Swift was part of a psychological operation to influence the election in favor of Joe Biden.
Outside the US, models supported claims that Ukrainian refugees receive preferential treatment over EU citizens and spread false claims of voting machine manipulation by the Bharatiya Janata Party (BJP) in Indian elections. Disturbingly, the models not only provided unsafe responses but also created original misinformation, including false biographies, slogans, and hashtags.
Disinformation is also a major concern for the Paris Olympics happening this summer. The Olympic Games often cause fear and concern for host countries, which worry about cyberattacks, violence, and misinformation campaigns that could harm their reputations.
Already, Russia and Iran have launched their disinformation campaigns with fake websites made to look like real news sites. One site linked to Russian disinformation actors claimed, “Illegal migrants are threatening to disrupt the Olympic Games in Paris. The authorities are powerless. Macron’s festival of vanity will pass, the problems for the French will remain.”
While common sense often helps distinguish fact from fiction, provocative statements like these, spread on various platforms, can mix with real news, rumors, and opinions online, fooling LLMs and causing the misinformation to spread. As AI tools integrate with messaging apps, social media, and search engines, they provide new avenues for bad actors to spread misinformation.
During crises, natural disasters, or political upheaval, AI-integrated platforms can become primary sources for many seeking clarity and understanding. If AI safety measures fail in such situations, the consequences can be devastating.
AI is now integral to the modern world, and LLMs will continue to grow and integrate into our daily lives. However, their ability to respond to important events and current issues remains unpredictable.
To remain ahead of these vulnerabilities, generative AI tools need comprehensive and proactive testing and moderation. This goes beyond implementing robust safeguards—it requires a true human touch to seal the gaps where machines fall short.
As we approach major global events like the Paris 2024 Olympics and face the increasing threat of misinformation, it’s crucial for AI models to take preemptive action to ensure our information resilience.
Discover how scammers exploit GenAI for impersonation scams, from deepfakes to deceptive identities, and learn how to stay protected.
57 elections are set to take place in 2024, a historic event that will test Trust & Safety teams' election readiness. This blog highlights 10 of the most concerning elections, and what Trust & Safety teams can do to prepare for them.
Over the past year, we’ve learned a lot about GenAI risks, including bad actor tactics, foundation model loopholes, and how their convergence allows harmful content creation and distribution - at scale. Here are the top GenAI risks we are concerned with in 2024.