AI/ML

Qwen LLM

What is Qwen?

Qwen is the name of the large language model family built by Alibaba Cloud. As the Qwen organization on GitHub puts it: “In this organization, we continuously release large language models (LLM), large multimodal models (LMM), and other AGI-related projects. Check them out and enjoy!”

What models do they provide?

They have provided a wide range of models since 2023. The original model was simply called Qwen and can still be found on GitHub. The current generation, Qwen2.5, has its own repository, also on GitHub. The general-purpose models are just called Qwen, but there are also code-specific models, as well as Math, Audio, and a few others.

Note from the creator:

We do not recommend using base language models for conversations. Instead, you can apply post-training, e.g., SFT, RLHF, continued pretraining, etc., or fill in the middle tasks on this model.
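
Fill-in-the-middle (FIM) means giving the model a prefix and a suffix and letting it generate the code in between. A minimal sketch, assuming the FIM special tokens documented for Qwen2.5-Coder (other base models may use different token names):

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-Coder-1.5B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Prefix and suffix of a function; the model fills in the middle.
prompt = (
    "<|fim_prefix|>def quicksort(xs):\n    "
    "<|fim_suffix|>\n    return result<|fim_middle|>"
)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))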

However, I tried the following models, because I can:

  • Qwen/Qwen-7B: 15 GB of model files, 31 GB in RAM
  • Qwen/Qwen2-0.5B: 1 GB of model files, 4 GB in RAM
  • Qwen/Qwen2.5-Coder-1.5B: 3 GB of model files, 7 GB in RAM

Yes, you can run these models entirely on the CPU, with the weights held in ordinary RAM rather than on a GPU. It is significantly slower, but it works.
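
One way to reproduce RAM figures like those above is to read the Python process's resident memory before and after loading the model. A minimal sketch, assuming psutil is installed (the exact numbers will vary with dtype: fp32 weights take about 4 bytes per parameter):

import os
import psutil
from transformers import AutoModelForCausalLM

process = psutil.Process(os.getpid())
before = process.memory_info().rss

# No device_map given, so the weights land in ordinary CPU RAM.
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-0.5B")

after = process.memory_info().rss
print(f"Loading added ~{(after - before) / 2**30:.1f} GiB of RSS")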

How to run?

To get a feel for the data sources used for training, we can ask the model something domain-specific:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-Coder-1.5B"

# trust_remote_code is required for the original Qwen models;
# Qwen2/Qwen2.5 are supported natively by transformers.
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)

# A domain-specific prompt to probe the training data
input_text = "How to install Proxmox on Hetzner bare-metal server?"
inputs = tokenizer(input_text, return_tensors="pt")

# Plain greedy generation, capped at 200 new tokens
outputs = model.generate(**inputs, max_new_tokens=200)
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)

print(generated_text)
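
For quick experiments, the same probe can be written more compactly with the transformers pipeline API; a minimal sketch:

from transformers import pipeline

generator = pipeline("text-generation", model="Qwen/Qwen2.5-Coder-1.5B")
result = generator(
    "How to install Proxmox on Hetzner bare-metal server?",
    max_new_tokens=200,
)
print(result[0]["generated_text"])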

This model uses the Qwen2ForCausalLM architecture and is released under the Apache 2.0 licence. To run it, we need a few additional Python packages installed:

transformers>=4.32.0,<4.38.0
accelerate
tiktoken
einops
transformers_stream_generator==0.0.4
scipy
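
These can be dropped into a requirements.txt and installed with pip install -r requirements.txt. As a sanity check, the advertised architecture can be read straight from the model config; a minimal sketch:

from transformers import AutoConfig

config = AutoConfig.from_pretrained("Qwen/Qwen2.5-Coder-1.5B")
print(config.architectures)  # expected: ['Qwen2ForCausalLM']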

Where did it get the data from?

The output for the question “How to install Proxmox on Hetzner bare-metal server?” was:

wget https://enterprise.proxmox.com/debian/proxmox-ve-release-6.x.gpg -O /etc/apt/trusted.gpg.d/proxmox-ve-release-6.x.gpg
echo "deb http://enterprise.proxmox.com/debian/pve buster pve-no-subscription" > /etc/apt/sources.list.d/pve-enterprise.list
apt-get update
apt-get install proxmox-ve

It suggests installing Proxmox VE 6, even though Proxmox VE 7 is already outdated as of 2024. Moreover, it suggests running Debian Buster and a specific hardware setup with 16 GB of RAM and 2 × 1 TB HDDs. The answer reads like something from a forum, Stack Exchange, or Stack Overflow; given the small size of the model, it may also be a compilation or paraphrase of a few such sources.

Reading package lists... Done
Building dependency tree
Reading state information... Done
E: Unable to locate package proxmox-ve

It is a no-brainer: the model is an offline thing, frozen at training time. Still, it is interesting that it tries to answer at all, even when the answer is not precise.