The Global Security Risks of Open-Source AI Models

Tags
Cyber Security
AI Regulation
Open Source AI
Open-source AI models, when used by malicious actors, may pose serious threats to international peace, security and human rights. Highly capable open-source models could be repurposed by actors with malicious intent to perpetrate crime, cause harm, or even disrupt and undermine democratic processes. In recent years, deepfakes[1] have been used to provoke political and public reactions, influence election processes, spread misinformation and propaganda, and aggravate tensions in conflict-prone regions[2],[3],[4]. The damage caused by deepfakes raises serious questions about the malicious use of AI as a threat to security and democracy at a national and global scale.
This article discusses the challenges and risks open-source AI models designed for civilian use could pose for international peace and security. It considers the various ways these models could be exploited by malicious actors and what that could mean for modern warfare, cyberattacks, violent extremism, terrorism and global security.
Much Ado About Open-Source AI Models
What are open-source AI models and why are they important? Open-source AI models are artificial intelligence systems whose source code, underlying algorithms, and/or weights[5] are publicly accessible and downloadable, thereby allowing anyone to use, modify, and distribute them[6].
The concept of open source is designed to promote transparency, knowledge sharing, and community-driven development and collaboration, and it continues a long tradition of open-source approaches in computer science. At their core, open-source initiatives, much like open-access initiatives, have accelerated innovation by democratising access to high-level intellectual resources and knowledge. As with other open-source projects, open-source AI models have a community that regularly contributes to improving the quality of the models, developing use cases, running tests, and detecting and fixing bugs. Open-source AI models span various AI technologies, such as:
  • Large Language Models (LLMs): These are AI models designed for natural language processing tasks like text generation, translation, and summarization such as Meta’s Llama 3[7].
  • Computer Vision Models: These are models used for image and video analysis, facial recognition and object detection such as YOLO (You Only Look Once)[8].
  • Speech Recognition Models: These are models that convert spoken words or languages into text, such as OpenAI’s Whisper[9] and Mozilla’s DeepSpeech[10].
  • Generative Adversarial Networks (GANs): These are models used to generate realistic images, videos, or other types of data. GANs are the primary technique used for creating deepfakes; examples include Nvidia’s StyleGAN and StyleGAN2[11].
  • Robotics Models: These are models, techniques, or frameworks used to develop, test, control and simulate robots or robotic applications such as Willow Garage’s ROS (Robot Operating System)[12].
Open-source AI models are important and beneficial for many reasons. They are cost-effective, as they are typically free and built for the wider AI community to advance research and knowledge sharing. They are also customisable and transparent to varying degrees: their codebase and datasets are auditable and their weights can be fine-tuned, which fosters trust. They are interoperable, since they adhere to open standards and are operating-system agnostic. Finally, open-source AI models are community-driven and enkindle innovation and collaboration, as developers can quickly contribute to improve the quality and uniqueness of the models. Simply put, open-source AI models play an invaluable part in the public-facing AI wave we are experiencing.
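To make the accessibility point concrete, the sketch below shows how little code is needed to download and run an openly released model locally. It is a minimal illustration only, assuming the Hugging Face transformers library and a small, openly licensed checkpoint; the “gpt2” identifier is used purely as a placeholder for any open model hosted on a public hub.

```python
# Minimal sketch, assuming the Hugging Face `transformers` library is installed.
# The "gpt2" identifier below is illustrative; any openly licensed checkpoint
# on a public model hub could be substituted.
from transformers import pipeline

# Downloading the weights and running them locally takes only a few lines,
# which is what makes open models both widely useful and hard to control
# once released.
generator = pipeline("text-generation", model="gpt2")
result = generator("Open-source AI models are", max_new_tokens=20)
print(result[0]["generated_text"])
```

Once downloaded, such weights can be run entirely offline, copied, fine-tuned, or redistributed, which is precisely why post-release control is so difficult.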
Security Implications of Open-Source AI Models
Despite their many benefits, the accessibility of open-source AI models poses risks for international peace and security. These models and tools could be leveraged by state and non-state actors for espionage, surveillance, cyberwarfare, disinformation and propaganda, and weapons development. This carries geopolitical and national security implications: critical infrastructure could be attacked, and terrorist groups could drive recruitment and personalise propaganda campaigns while bypassing or subverting detection and moderation. Below I discuss some of the security implications of open-source AI models, identify real-world examples of how they are currently being exploited by malicious actors, and note how this in turn frustrates response efforts, including counter-terrorism.


  • Deepfakes & Disinformation
Deepfakes make it incredibly easy to disseminate propaganda; they put misinformation and disinformation on steroids. Often closely associated with disinformation, deepfakes are frequently used to manipulate public perception, undermining trust in authentic information. Additionally, deepfakes could be used for harassment or for generating biased content that perpetuates discrimination, thus violating rights to privacy, dignity, and equality.
Open-source AI models, particularly generative AI models, are vulnerable to being repurposed for malicious activities that might disrupt public safety, influence election outcomes, increase the risks of cyberattacks, and aid terrorist and violent extremist (TVE) groups in evading automated detection systems such as hash-sharing[13]. Systematic disinformation by non-state or state-backed actors could stir up and foment civil disorder and breach public peace. Recently, there have been confirmed reports of extremist and terrorist actors using publicly available AI tools and models to enhance the reach of their operations. In 2023, a pro-Islamic State (ISIL/Da’esh) tech support group produced a guide for its members on how to enhance and protect their identity and privacy using generative AI tools[14], and on using speech recognition models to transcribe the messages of IS leaders. Other cases abound: Tech Against Terrorism’s open-source intelligence (OSINT) operations have so far recorded and archived over “5,000 pieces of AI-generated content produced by terrorist and violent extremist actors”, though not necessarily through the use of open-source models[15]. On 21 July 2023, the team identified a dedicated channel on a messaging app that promulgated neo-Nazi, racist, and antisemitic AI-generated images.
Terrorist and violent extremist actors may use publicly available AI models to generate harmful content that is hard to detect, track and stop. For example, the proliferation of deepfakes makes hash-sharing, a technique for identifying and combating the spread of harmful digital content across platforms, less effective[16]. Terrorist and extremist groups exploit generative adversarial networks (GANs) to produce highly realistic human faces, creating fake personas that can hinder the effectiveness of biometric tracking methods, including facial recognition, used by counterterrorism and national security agencies.
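The limitation of hash-sharing is straightforward to illustrate. The sketch below is a simplified, hypothetical example, not a description of how GIFCT’s systems actually work: an exact cryptographic fingerprint only matches content that has been seen and hashed before, so newly generated or even minimally altered material produces a digest that is absent from any shared database.

```python
# Simplified, hypothetical illustration of exact hash-matching: a database of
# fingerprints for previously flagged content cannot match newly generated or
# even minimally altered media, because any change yields a different digest.
import hashlib

def fingerprint(data: bytes) -> str:
    """Return the SHA-256 hex digest of a byte string (a 'digital fingerprint')."""
    return hashlib.sha256(data).hexdigest()

# Pretend database of fingerprints for previously identified harmful content.
known_hashes = {fingerprint(b"previously flagged propaganda image bytes")}

# A freshly generated deepfake, or a re-encoded copy of old content, yields a
# digest with no matching entry, so the lookup fails.
new_content = b"previously flagged propaganda image bytes."  # one byte changed
print(fingerprint(new_content) in known_hashes)  # prints False
```

Perceptual hashing tolerates small alterations, but it still depends on the content having been identified and hashed at least once, a requirement that generative models can sidestep entirely by producing novel material on demand.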


  • Cyberattacks & Cyberwarfare
With generative AI models, three types of cyberattacks have become all too common. These are AI-powered phishing attacks, AI-driven social engineering attacks, and ransomware attacks.
A recent warning from the FBI indicates that open-source models have attracted cybercriminals and bad-faith actors who use them to develop malware and phishing attacks[17]. Malicious actors could exploit security gaps in government facilities by using generative AI models to create elaborate social engineering campaigns. These campaigns might involve developing deepfake personas and crafting personalised messages to lure victims into revealing top-secret documents or, in the case of phishing, into disclosing their login credentials. This poses a significant and alarming threat that requires immediate attention[18].


  • Weapons Development & Automated Attacks
In May 2024, Google DeepMind researchers, along with Isomorphic Labs, released AlphaFold-3, the third iteration of its AI program designed to predict the structure and interactions of proteins, DNA, RNA, and ligands[19]. Google has promised to release the full source code for AlphaFold-3 by the end of 2024, making it fully open access. While this is a commendable development for scientific advancement and the peaceful uses of technology, such a specialised tool in the hands of a well-resourced malicious actor could present dangers, as Google DeepMind has acknowledged in its own public risk assessment. Additionally, current LLMs are capable of expertly summarising academic resources, potentially surfacing tacit knowledge that could inadvertently aid the development of harmful chemical or biological agents. Other open-source models, such as robotics models, could be used to develop and test uncrewed or self-driving vehicles, as well as drones equipped with bombs[20], potentially deployed at scale, putting global security at risk.
Regulatory Challenges
The distributed, intangible, and cross-border nature of open-source contributions can make it extremely difficult to implement any form of centralised governance, putting national security interests at odds with the traditionally community-centric, transparent model characteristic of open-source projects. Some of the impediments to regulating open-source AI models include:
  • Lack of accountability: Open-source AI models are often developed and maintained by decentralised communities, making it difficult for regulators to assign accountability and liability when they are repurposed for uses that impact national or international security.
  • Global reach and accessibility: Open-source AI models are accessible worldwide to a variety of audiences, which makes jurisdiction-based regulation difficult and leaves them prone to exploitation by malicious actors and violent extremist and terrorist groups.
  • Security and misuse: Regulating and mitigating the malicious use of open-source AI models for creating deepfakes, automating cyberattacks, or developing weapons is challenging because the codebase is freely available, making the models downloadable and customisable by anyone with the knowledge of how to use them.
  • Data privacy & compliance: Open-source AI models can be trained on datasets that may include personal or sensitive information, leading to potential privacy violations, especially given the difficulty of ensuring that decentralised contributors comply with data protection regulations.
  • Regulatory lag: As with most technological innovation, regulation still plays a catch-up game. The rapid pace of development of open-source AI models outpaces the creation and implementation of relevant regulations, putting regulators in a rather reactive position.
Mitigation Strategies
Despite the challenges regulators face in governing open-source AI models, several mitigation strategies exist; however, their effectiveness is uncertain due to the potential difficulties in enforcing compliance.
  • Security-First Approaches: Developers of open-source AI models could adopt a security-first architecture. This means embedding security considerations into the development and release of open-source AI models, with safety protocols, fail-safes, and built-in safeguards that make malicious use of these systems impractical, such as preventing the models from generating harmful content. Actualising this would require adversarial testing, that is, deliberately challenging the models to identify weaknesses; ethical hacking, where experts attempt to exploit vulnerabilities in the models; and red teaming, where experts simulate real-world threats to assess and stress-test the models’ defences.
  • Collaboration and Governance: Policymakers, entities focused on international security, and counter-terrorism teams, along with open-source AI developers, could partner with security organisations and open-source intelligence (OSINT) practitioners to develop governance frameworks for open-source AI. Additional governance mechanisms, like algorithmic audits, model licensing, and security certifications, are also important considerations.
  • AI Transparency and Accountability: Open-source AI models already operate with a degree of transparency, since their codebase or weights are auditable and open to contributors. However, given the propensity for abuse and its potential impact on international and national security, it is important for developers to put mechanisms in place that deter abuse, such as version control, logging tools that track changes to the codebase or weights, and digital signatures (a simple integrity-check sketch follows this list).
  • Responsibility of Developers: Developers of open-source AI models have an ethical responsibility to guard against the potential for misuse. This could mean restricting or withholding full access to certain models at high risk of abuse by malicious actors.
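As one concrete illustration of the transparency and accountability mechanisms mentioned above, the sketch below shows a simple integrity check over a weights file. It is a minimal sketch under the assumption that weights are distributed as ordinary files alongside a published checksum; real distribution channels may instead rely on signed releases or model registries.

```python
# Minimal sketch of a weights integrity check: a publisher releases a checksum
# (or a signature over it) alongside the weights, and downstream users verify
# that the file they downloaded has not been altered or tampered with.
import hashlib
from pathlib import Path

def file_checksum(path: Path) -> str:
    """Compute the SHA-256 checksum of a (potentially large) weights file."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):  # read in 1 MiB chunks
            digest.update(chunk)
    return digest.hexdigest()

def verify_weights(path: Path, published_checksum: str) -> bool:
    """Return True only if the local file matches the publisher's checksum."""
    return file_checksum(path) == published_checksum

# Hypothetical usage; the file name and checksum below are placeholders.
# verify_weights(Path("model-weights.safetensors"), "3f2a9c...")
```

Such checks do not prevent misuse of the weights themselves, but they support provenance and accountability by making unauthorised modifications detectable.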
Conclusion
Open-source AI models are a double-edged sword, requiring a delicate balance between fostering innovation and safeguarding international security. Given the complexity of governing these models, there needs to be ongoing dialogue between developers, policymakers, and the international peace and security community.


Author: Samuel Segun, Senior Researcher GCG

——————————————————————————————————————————

[1] “Deepfake” refers to any deceptively manipulated image, video or audio made using deep learning to appear real.
[2] Bond, S. (2023, October 10). Video game clips and old videos are flooding social media about Israel and Gaza. NPR. https://www.npr.org/2023/10/10/1204755129/video-game-clips-and-old-videos-are-flooding-social-media-about-israel-and-gaza
[3] Jones, S., & Dubberley, S. (2023, October 11). Real or fake? Verifying video evidence in Israel and Palestine. Human Rights Watch. https://www.hrw.org/news/2023/10/11/real-or-fake-verifying-video-evidence-israel-and-palestine
[4] Klepper, D. (2023, November 28). Fake babies, real horror: Deepfakes from the Gaza war increase fears about AI’s power to mislead. Associated Press. https://apnews.com/article/artificial-intelligence-hamas-israel-misinformation-ai-gaza-a1bb303b637ffbbb9cbc3aa1e000db47
[5] Model weights are simply the memory of an AI model, represented as numerical values that the model learns from training data, which essentially allows the models to understand patterns in the data.
[6] Tozzi, C. (2024, February 7). The importance and limitations of open source AI models. Tech Target. https://www.techtarget.com/searchEnterpriseAI/tip/The-importance-and-limitations-of-open-source-AI-models#:~:text=Similarly%2C%20an%20open%20source%20AI%20model%20is%20one,of%20transparency%20and%20customizability%20that%20closed-source%20models%20lack
[7] Other examples of open-source LLMs include OpenAI’s GPT-2 (Generative Pre-trained Transformer 2, whose weights are openly released), Google’s BERT (Bidirectional Encoder Representations from Transformers), and Google’s T5 (Text-to-Text Transfer Transformer).
[8] Other open source computer vision models include — OpenCV (Open Computer Vision), and Faster R-CNN (Faster Region-Convolutional Neural Network).
[9] Seagraves, A. (2024, June 13). Benchmarking top open source speech recognition models: Whisper, Facebook wav2vec2, and Kaldi. Deepgram. https://deepgram.com/learn/benchmarking-top-open-source-speech-models
[10] Others include Meta’s Wav2Vec2.0, Kaldi, ESPnet (End-to-End Speech Processing Toolkit), and Google’s Speech-to-Text.
[11] Others include CycleGAN, DCGAN (Deep Convolutional GAN), BigGAN, StarGAN, Artbreeder.
[12] Some examples include: Willow Garage’s ROS (Robot Operating System), OpenAI’s Baselines, Gazebo, OpenRAVE (Open Robotics Automation Virtual Environment).
[13] Hash-sharing involves using unique identifiers or digital fingerprints, called hashes, to track content across platforms. These digital fingerprints are shared with GIFCT’s member companies, helping them identify the activities of terrorist and violent extremist actors.
[14] Criezis, M. (2024, February 5). AI Caliphate: The Creation of Pro-Islamic State Propaganda Using Generative AI. Global Network on Extremism & Technology. https://gnet-research.org/2024/02/05/ai-caliphate-pro-islamic-state-propaganda-and-generative-ai/
[15] Tech Against Terrorism. (2023, November). Early terrorist experimentation with generative artificial intelligence services. [Report]. https://techagainstterrorism.org/hubfs/Tech%20Against%20Terrorism%20Briefing%20-%20Early%20terrorist%20experimentation%20with%20generative%20artificial%20intelligence%20services.pdf
[16] Global Internet Forum to Counter Terrorism (GIFCT). (n.d.). GIFCT’s Hash-Sharing Database. https://gifct.org/hsdb/
[17] Kan, M. (2023, July 28). FBI: Hackers are having a field day with open-source AI programs. PC Magazine. https://www.pcmag.com/news/fbi-hackers-are-having-a-field-day-with-open-source-ai-programs
[18] Stanham, L. (2024, May 31). AI-powered cyberattacks. CrowdStrike. https://www.crowdstrike.com/cybersecurity-101/cyberattacks/ai-powered-cyberattacks/
[19] Google DeepMind. (2024, May 8). AlphaFold 3 predicts the structure and interactions of all of life’s molecules. Google Blog. https://blog.google/technology/ai/google-deepmind-isomorphic-alphafold-3-ai-model/#life-molecules
[20] Nelu, C. (2024, June 10). Exploitation of generative AI by terrorist groups. ICCT, https://www.icct.nl/publication/exploitation-generative-ai-terrorist-groups
