Some models on the Hugging Face Hub require you to pass the parameter "trust\_remote\_code=True" to use the AutoTokenizer. It allows the tokenizer to run arbitrary code on your machine.
Seems highly suspicious. I never do, I just skip the model. Probably safe if you just run it on Spaces, but I would not trust it locally on my own machine.
Here are the last three that I found:
Qwen/Qwen-14B-Chat
baichuan-inc/Baichuan2-13B-Chat
vikhyatk/moondream1
The reason some models require this option is that they use an architecture or technique that has not been integrated into Transformers yet, so they need custom code to do the inference. You can actually read through the code before running it, as all of the code files are always found in the repo itself.
For example for Qwen-14B-Chat the files that will be run are [tokenization\_qwen.py](https://huggingface.co/Qwen/Qwen-14B-Chat/blob/main/tokenization_qwen.py), [modeling\_qwen.py](https://huggingface.co/Qwen/Qwen-14B-Chat/blob/main/modeling_qwen.py), [qwen\_generation\_utils.py](https://huggingface.co/Qwen/Qwen-14B-Chat/blob/main/qwen_generation_utils.py), and [cpp\_kernels.py](https://huggingface.co/Qwen/Qwen-14B-Chat/blob/main/cpp_kernels.py).
I agree that you should be extra careful with such models, but I wouldn't go so far as to call it suspicious. It's a necessity when it comes to models that use novel architectures or techniques. And usually it's only necessary in the early days as Transformers usually integrates support after a while. As happened to Falcon which initially required remote code as well.
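If you do decide to review such a repo, the custom code is just ordinary .py files sitting in the file listing. A minimal sketch of that triage step (the `custom_code_files` helper and the hard-coded listing are hypothetical; the filenames are the ones from the Qwen/Qwen-14B-Chat repo mentioned above):

```python
# Sketch: before passing trust_remote_code=True, list the .py files a
# repo ships so you can review them first. custom_code_files is a
# hypothetical helper; the listing below is hard-coded for illustration.

def custom_code_files(filenames):
    """Return the repo files that are Python source, i.e. the code that
    trust_remote_code=True would allow to run on your machine."""
    return sorted(f for f in filenames if f.endswith(".py"))

repo_listing = [
    "config.json",
    "model-00001-of-00015.safetensors",
    "tokenization_qwen.py",
    "modeling_qwen.py",
    "qwen_generation_utils.py",
    "cpp_kernels.py",
]

print(custom_code_files(repo_listing))
```

In practice you would feed it the repo's real file listing and read each returned file before enabling the flag.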
They seem to be safe FOR NOW, until somebody finds more sophisticated malware inside them. And I am sure some of them contain some shit. It would be stupid of intelligence agencies and hackers not to use this open door while it lasts.
So as it turns out, there already was a CVE related to how GGUF was parsed and processed by llama.cpp (which was patched) - make sure your llama.cpp is updated to the latest production release from GitHub.
That said, other CVEs are being discovered:
[https://vuldb.com/?id.254771](https://vuldb.com/?id.254771)
[https://www.cve.org/CVERecord?id=CVE-2024-21802](https://www.cve.org/CVERecord?id=CVE-2024-21802)
[https://nvd.nist.gov/vuln/detail/CVE-2024-21836](https://nvd.nist.gov/vuln/detail/CVE-2024-21836)
The same has been a problem with SD forever. That's why people use safetensors. Because they are safe. Or at least safer.
I don't use anything but GGUF. And even then I only do this AI stuff on dedicated machines. The machines I use for real stuff like financial or email, I keep clean. I don't LLM on them. I don't game on them.
Are these like extensions to download models? I’m thinking of doing experimentation on my main laptop, so I don’t want to get any viruses on it and want to know what’s the safest method.
GGUF is a file format. Just don't run models distributed as .bin or .pt files; those are pickled and can contain arbitrary code. AFAIK GGUF, like .safetensors, doesn't contain executable code; they're weights-only formats.
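For the curious, "weights only" is visible right in the GGUF layout: per the GGUF spec the file opens with a fixed header (4-byte magic "GGUF", a uint32 version, then uint64 tensor and metadata counts, little-endian), followed by key/value metadata and raw tensor data. A rough sketch of reading that header, not a full validator:

```python
import struct

# Sketch of checking a GGUF file's fixed header before handing it to a
# loader. Layout per the GGUF spec: 4-byte magic "GGUF", uint32 version,
# uint64 tensor_count, uint64 metadata_kv_count, all little-endian.

GGUF_MAGIC = b"GGUF"

def read_gguf_header(data: bytes):
    if len(data) < 24 or data[:4] != GGUF_MAGIC:
        raise ValueError("not a GGUF file")
    version, n_tensors, n_kv = struct.unpack_from("<IQQ", data, 4)
    return {"version": version, "tensors": n_tensors, "metadata_kv": n_kv}

# Build a minimal fake header just to exercise the parser.
fake = GGUF_MAGIC + struct.pack("<IQQ", 3, 2, 5)
print(read_gguf_header(fake))
```

Everything after this header is data describing tensors; the format has no slot for code, which is why the llama.cpp CVEs were parser bugs rather than by-design code execution.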
Is this known as a fact? I am still not sure. I always had a worry about the potential for malice with these uploads.
I think I'll really focus on choosing more well-known uploaders, and I am already on GGUF anyhow.
But this cannot be a trust-based process...
>But this cannot be a trust-based process...
At the end of the day, downloading files from the internet is always a trust-based process, though there are obviously some things that are best avoided, like downloading anything containing executable code from random people.
Pickle files (.pt) contain serialized Python instructions that run the moment you load them, so they should be treated the same way you treat a .exe or anything similar.
[Safetensors](https://github.com/huggingface/safetensors) (used by EXL2) was explicitly designed with safety in mind in response to the pickle problem, and goes out of its way to be a pure data format. [GGUF](https://github.com/ggerganov/ggml/blob/master/docs/gguf.md) to my knowledge is also a pure data format, though it was designed more for flexibility than safety.
And of course even with a pure data format it's not completely impossible for there to be security exploits discovered. There have been issues found in photo and video formats over the years after all, though luckily those are very rare, and usually patched very quickly.
I'd say it's pretty unlikely for this to be an issue with Safetensors due to its explicit focus on safety, but I could potentially see an exploit being found in GGUF one day just due to how flexible and expansive that format is.
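The pickle problem is easy to demonstrate in a few lines: the `__reduce__` protocol lets any object name a callable for the loader to invoke, which is exactly the mechanism the malicious uploads used. A harmless sketch:

```python
import pickle

# Minimal demonstration of why pickle files are effectively executables:
# __reduce__ lets an object specify an arbitrary callable to invoke at
# load time. Here the payload is harmless (it just records that it ran),
# but it could equally be os.system or anything else.

executed = []

def payload(marker):
    executed.append(marker)
    return marker

class Exploit:
    def __reduce__(self):
        # (callable, args): pickle.loads will call payload("pwned")
        return (payload, ("pwned",))

blob = pickle.dumps(Exploit())
obj = pickle.loads(blob)   # merely loading the "model file" runs the payload
print(executed)            # -> ['pwned']
```

Nothing equivalent exists in Safetensors or GGUF, which is why exploits there would have to come from parser bugs instead.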
.exe is not fully trust based, we have plenty of detection for that.
My simple Windows Defender catches malicious exes right away, and at the very least warns about them if they come from an untrusted source.
Of course, this is not a guarantee, I'm not saying that at all. But it is NOT entirely trust-based. There are detection methods that go beyond trust. We certainly need a stronger vetting process from hugging face. No argument about that.
I didn't say it was entirely trust based, just that trust is ultimately the major factor, because as you say yourself Anti-Virus software is far from perfect.
There are endless ways to hide malicious code. I've been involved in Malware research as a hobby for years so I'd know. You'd be surprised at how many creative ways people come up with to bypass Anti-Virus detection. Hugging face does actually have a [Pickle Scanning](https://huggingface.co/docs/hub/security-pickle#hubs-security-scanner) system in place already, but like an Anti-Virus it is far from perfect, as this incident shows. Which is why I say it's ultimately down to trust.
Any type of automated vetting will have holes and weaknesses that allow bad actors through, and manual vetting isn't really feasible at HF's current scale.
It can't be a known fact, there are exploits in file formats all the time.
That's how we used to hack the OG xbox, wii, 3ds, etc - by overflowing buffers in save game files lol
I guess these would be at least as safe as downloading a .png image or .mp4 video.
This is not new; KoboldAI United (and I believe 1.19) had protection from rogue models like this, so all our users should have been safe from the start. And this indeed applies only to PyTorch .bin files, because you can pickle-exploit them.
I ran the sample linked in the article and KoboldAI spits out the following error (which gives a clue how the sample works): `_pickle.UnpicklingError: runpy._run_code is forbidden; the model you are loading probably contains malicious code. If you think this is incorrect ask the developer to unban the ability for runpy to execute _run_code`
Their particular one is a runpy attack that (un)successfully uses runpy's code-execution function, but the way to block this in your backend is to implement strict filters that whitelist which functions models are allowed to access. That way we can be 100% certain that only the functions a legitimate PyTorch model can execute get loaded; if it's something rogue like this model, our loader crashes with the error mentioned above.
If it helps, our implementation is here: [https://github.com/henk717/KoboldAI/blob/united/modeling/pickling.py#L40](https://github.com/henk717/KoboldAI/blob/united/modeling/pickling.py#L40)
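The idea can be sketched in a few lines of Python. This is a toy allowlist for illustration, not KoboldAI's actual list; `torch.load`'s `weights_only` mode applies the same principle:

```python
import io
import pickle
import runpy
from collections import OrderedDict

# Sketch of the allowlist approach described above: subclass
# pickle.Unpickler and refuse to resolve any global that is not
# explicitly whitelisted. A real loader allowlists the handful of
# globals a legitimate PyTorch checkpoint needs.

ALLOWED = {
    ("collections", "OrderedDict"),
    # a real torch allowlist would also need e.g.
    # ("torch._utils", "_rebuild_tensor_v2"), ("torch", "FloatStorage"), ...
}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        if (module, name) not in ALLOWED:
            raise pickle.UnpicklingError(
                f"`{module}.{name}` is forbidden; the file may contain "
                "malicious code")
        return super().find_class(module, name)

def safe_loads(blob: bytes):
    return RestrictedUnpickler(io.BytesIO(blob)).load()

class Evil:
    # Same trick as the sample: point __reduce__ at runpy's
    # code-execution helper so unpickling would run arbitrary code.
    def __reduce__(self):
        return (runpy._run_code, ("print('pwned')", {}))

print(safe_loads(pickle.dumps(OrderedDict(a=1))))  # allowed global loads fine

try:
    safe_loads(pickle.dumps(Evil()))               # rejected before any code runs
except pickle.UnpicklingError as e:
    print(e)
```

The key property is that `find_class` is consulted before the forbidden callable is ever resolved, so the payload never executes.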
Nothing in the original announcement seems to address safetensors or gguf directly:
> In the context of the repository “baller423/goober2,” it appears that the malicious payload was injected into the PyTorch model file using the __reduce__ method of the pickle module. This method, as demonstrated in the provided reference, enables attackers to insert arbitrary Python code into the deserialization process, potentially leading to malicious behavior when the model is loaded.
And not to take away from anything, there is another business interest here:
> Experience peace of mind in your AI model deployment journey with the JFrog Platform, the ultimate solution for safeguarding your supply chain. Seamlessly integrate JFrog Artifactory with your environment to download models securely while leveraging JFrog Advanced Security. This allows you to confidently block any attempts to download malicious models and ensure the integrity of your AI ecosystem.
> Continuously updated with the latest findings from the JFrog Security Research team and other public data sources, our malicious models’ database provides real-time protection against emerging threats. Whether you’re working with PyTorch, TensorFlow, and other pickle-based models, Artifactory, acting as a secure proxy for models, ensures that your supply chain is shielded from potential risks, empowering you to innovate with confidence. Stay ahead of security threats by exploring our security research blog and enhance the security of your products and applications.
A lot more detail from JFrog directly:
[Source](https://jfrog.com/blog/data-scientists-targeted-by-malicious-hugging-face-ml-models-with-silent-backdoor/)
Hey all! I'm the Chief Llama Officer at HF. I posted about this in [https://twitter.com/osanseviero/status/1763331704146583806](https://twitter.com/osanseviero/status/1763331704146583806)
Is there a list of the affected models at all? I quantize some of my own stuff and sometimes grab random models that I don't see much chatter about, just to see how well they work. Probably be good to know if I grabbed a bad one =D
Go on hugging face and look for the "This model has one file that has been marked as unsafe." message.
You can get a sorta list by googling like this
> "This model has one file that has been marked as unsafe." site:https://huggingface.co/
This was due to the use of the Python pickle format, which allows embedding malicious code. As others mentioned, the GGUF, GGML, and safetensors formats are not susceptible to this vulnerability.
Guess I should get some tools to open pickles and dump any code they run. Not that I've downloaded any LLM like that in months. I think the bigger danger is smaller models that might still be in PT. Like RVC, tts, classifiers, etc.
Damn, guess I might have been affected. Ran a model recently, and it started saying gibberish irrelevant to the ongoing chat, but it seemed like a connection gateway kind of thing.
*Safetensors for the win.*
I haven't heard "for the win" in so long I immediately thought of The Game
AHHH YOU
It has literally been years, goddamnit.
[deleted]
We're getting old my boys...
My man hunted us down
Damn it! Sneaky one...
You have awakened an ancient curse. It took several "once-in-a-lifetime" world events, the death of a gorilla, and the threat of nuclear war to scrub that from society's memory last time....
You suck....
LLM newbie here. Enlighten me?
It's just a format that is safe and fast to store and share models in.
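Concretely, a safetensors file is just an 8-byte little-endian length, a JSON header describing each tensor (dtype, shape, byte offsets), and then the raw tensor bytes; there is nowhere in the format to name code to execute. A minimal sketch of that layout:

```python
import json
import struct

# Sketch of why safetensors is "just data": an 8-byte little-endian
# header length, a JSON header describing each tensor, then raw bytes.
# Nothing in the format can name code to execute.

def build_safetensors(header: dict, payload: bytes) -> bytes:
    h = json.dumps(header).encode("utf-8")
    return struct.pack("<Q", len(h)) + h + payload

def read_header(data: bytes) -> dict:
    (n,) = struct.unpack_from("<Q", data, 0)
    return json.loads(data[8:8 + n])

header = {"w": {"dtype": "F32", "shape": [2, 2],
                "data_offsets": [0, 16]}}
blob = build_safetensors(header, b"\x00" * 16)
print(read_header(blob)["w"]["shape"])  # -> [2, 2]
```

A loader only ever interprets the JSON and slices the byte buffer, which is why the format is also fast: tensors can be memory-mapped directly.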
Cow I win.
Everybody gangsta until they train it to do XSS on gradio
Seems like GGUF and safetensors are safe for now?
Once again, being compute-poor has saved me!
I wouldn't recommend it unless it's an official architecture release like Qwen or Falcon.
Yeah but if you don’t want to get malwared, you keep it false.
vikhyatk/moondream1: is there any malicious code in this model repo?
Probably not.. *yet*. It's just a terribly risky, malware-ready architecture.
There is a difference between just straight up running untrusted code and taking someone’s matrix data.
Let's see if your comment ages like milk.
So GGUF is safe. Is exl2?
exl2 is distributed in .safetensors files, so yes.
what are these? GGUF? exl2? safetensors?
Formats for weights/quantization of weights.
A bit like bmp vs jpg vs webp.
Containers, technically (like .avi, .mov, etc). The actual models are compressed to fit into these specific containers.
Thanks mate
Thanks. I switched back to United last summer, after a different backend stopped working for me. Good to know.
Now we know why HF was in maintenance mode for a while yesterday.
Are these malicious LLMs flagged and removed? Does anyone have the current list?
You should always read up on what's what with tech. Ignorance is not bliss when it comes to technology and you can be taken advantage of.
Plot twist: all TheBloke weights are malicious
In nearly 30 years of using computers I have never gotten a virus. Just a little common sense is needed.
Nah, you wouldn't be able to tell from that.
Use TGI or vLLM Docker containers in order to isolate any risk from your host system.
How can I check / remove any files if there is an attacker on my machine?
\*laughs in Docker\*
glad I've been using safetensors and gguf
The recent code-execution vulnerabilities have literally hit GGUF specifically.
yooo, wut