AI/ML – MICHAŁ SOBCZAK

AI/ML

LLM training parameters explanation

2025-10-25 1 Min Reading

Quick overview of LLM MLX LoRA training parameters.

weight_decay: A regularization technique that adds a small penalty to the weights during training to prevent them from growing too large, helping to reduce overfitting. Often implemented as L2 regularization. Examples: 0.00001 – 0.01

grad_clip: Short for gradient clipping, a method that limits (clips) the size of gradients during backpropagation to prevent exploding gradients and stabilize training. Examples: 0.1 – 1.0

rank: Refers to the dimensionality, or the number of independent directions, in a matrix or tensor. In low-rank adaptation (LoRA), it sets the inner dimension of the adapter matrices, which controls how closely the adapter can approximate a full weight update. Examples: 4,
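To make the three parameters above more concrete, here is a minimal pure-Python sketch. It is not how MLX implements them; it assumes clipping by global L2 norm, a plain SGD update with an L2-style decay term, and a hypothetical 2048 x 2048 layer for the adapter size calculation.

```python
import math

def clip_by_global_norm(grads, max_norm):
    """Scale gradients down so their combined L2 norm is at most max_norm."""
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm == 0.0 or total_norm <= max_norm:
        return list(grads)
    scale = max_norm / total_norm
    return [g * scale for g in grads]

def sgd_step(weights, grads, lr, weight_decay):
    """Plain SGD step; weight_decay adds a pull towards zero on every weight."""
    return [w - lr * (g + weight_decay * w) for w, g in zip(weights, grads)]

def lora_param_count(d_in, d_out, rank):
    """Trainable parameters of one adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)

# Gradient with norm 5 is scaled down to norm 1 (grad_clip = 1.0).
clipped = clip_by_global_norm([3.0, 4.0], max_norm=1.0)

# One update with a small decay term (weight_decay = 0.01).
new_weights = sgd_step([1.0, -2.0], clipped, lr=0.1, weight_decay=0.01)

# Hypothetical layer size, chosen only for illustration.
adapter_params = lora_param_count(d_in=2048, d_out=2048, rank=4)
print(clipped, new_weights, adapter_params)
```

Doubling the rank doubles the number of trainable adapter parameters, which is why rank is usually the first value to adjust when an adapter underfits or overfits.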

Full LLM fine-tuning using transformers, torch and accelerate with HF and GGUF

2025-10-24 7 Min Reading

Full fine-tuning of mlx-community/Qwen2.5-3B-Instruct-bf16

Recently I posted an article on how to train a LoRA for an LLM with MLX here. Then I asked myself how I could export or convert such an MLX model into HF or GGUF format. Even though MLX has an option to export to GGUF, most of the time it is not supported for the models I have been using. From what I recall, even where it does support Qwen, it is version 2 rather than version 3, and quality suffers from the conversion. I do not know exactly why it works like that. So I decided to give a try with

Train LLM on Mac Studio using MLX framework

2025-10-08 6 Min Reading

I have done over 500 training sessions using Qwen2.5, Qwen3, Gemma and plenty of other publicly available LLMs to inject domain-specific knowledge into the model’s low-rank adapters (LoRA). However, instead of giving you tons of unimportant facts, I will stick to the most important things, starting with the fact that I have used MLX on my Mac Studio M2 Ultra as well as on a MacBook Pro M1 Pro. Both are well suited to this task in terms of BF16 speed as well as unified memory capacity and speed (up to 800GB/s). Memory speed is the most important factor comparing
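A rough back-of-envelope estimate shows why memory bandwidth matters: every pass over the model has to stream the weights from memory at least once. The sketch below assumes a hypothetical 3-billion-parameter model stored in BF16 (2 bytes per parameter) and the 800 GB/s peak figure quoted above. It gives a lower bound only; real training steps also move gradients, optimizer state and activations.

```python
BYTES_PER_BF16 = 2  # BF16 stores each parameter in 2 bytes

def weight_bytes(num_params, bytes_per_param=BYTES_PER_BF16):
    """Total size of the model weights in bytes."""
    return num_params * bytes_per_param

def min_seconds_per_pass(num_params, bandwidth_gb_s):
    """Lower bound on the time to read all weights once (1 GB = 1e9 bytes)."""
    return weight_bytes(num_params) / (bandwidth_gb_s * 1e9)

params = 3_000_000_000  # hypothetical model size, for illustration only
size_gb = weight_bytes(params) / 1e9
seconds = min_seconds_per_pass(params, bandwidth_gb_s=800)
print(f"{size_gb:.1f} GB of weights, at least {seconds * 1000:.1f} ms per full read")
```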

YOLOx ONNX models use in Frigate

2025-09-04 1 Min Reading

YOLOX is an anchor-free version of YOLO, with a simpler design but better performance! It aims to bridge the gap between research and industrial communities. For more details, please refer to our report on arXiv.

https://yolox.readthedocs.io/en/latest/demo/onnx_readme.html
https://github.com/Megvii-BaseDetection/YOLOX/tree/main/demo/ONNXRuntime

To configure this in Frigate:

Generating AI video with FramePack

2025-04-30 3 Min Reading

Upload an image, enter a text prompt and press Start Generation. It is as easy as it looks. We take some pre-trained models, feed them a text prompt and a starting image, and the GPU generates the video frame by frame and merges the frames into a motion picture. It is sometimes funny, sometimes creepy, but it is always interesting to see life coming into still pictures and making a video out of them.

User Interface

On the left you upload the starting image and write a prompt below it describing what the video output should look like. Once started, due to

GPU pass-thru in Proxmox 7 and Ubuntu 20, follow-up

2025-04-25 1 Min Reading

In the previous article about GPU pass-thru, which can be found here, I described how to set things up mostly from the Proxmox perspective. From the VM perspective, however, I would like to add a short follow-up to make things clear. It has been said that you need to set up a q35 machine with VirtIO-GPU and UEFI. That is true, but the most important thing is to actually disable secure boot, which otherwise prevents the NVIDIA driver modules from loading. Add an EFI disk, but do not check “pre-enroll keys”. This option would enroll keys and enable secure boot by default. Just add EFI disk
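To confirm from inside the VM that secure boot really is off, one option is to read the SecureBoot EFI variable. The sketch below is an illustration rather than part of the original setup: it assumes the guest exposes efivarfs at the usual /sys/firmware/efi/efivars path, and the helper names are hypothetical.

```python
from pathlib import Path

# SecureBoot variable under the standard EFI global-variable GUID.
# The efivarfs mount point is an assumption about the guest system.
SECURE_BOOT_VAR = Path(
    "/sys/firmware/efi/efivars/SecureBoot-8be4df61-93ca-11d2-aa0d-00e098032b8c"
)

def parse_secure_boot(raw: bytes) -> bool:
    """efivarfs prefixes the value with 4 attribute bytes; byte 5 is the state."""
    if len(raw) < 5:
        raise ValueError("unexpected SecureBoot variable length")
    return raw[4] == 1

def secure_boot_state(path: Path = SECURE_BOOT_VAR):
    """Return True or False, or None when the variable cannot be read
    (for example on a VM booted in legacy BIOS mode)."""
    try:
        return parse_secure_boot(path.read_bytes())
    except (OSError, ValueError):
        return None

if __name__ == "__main__":
    messages = {
        True: "Secure Boot is ENABLED (NVIDIA modules may fail to load)",
        False: "Secure Boot is disabled",
        None: "Secure Boot state unknown (no readable EFI variable)",
    }
    print(messages[secure_boot_state()])
```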

Mattermost AI chatbot with image generation support from Automatic1111

2025-04-03 2 Min Reading

How about AI chatbot integration in your Mattermost server? With the possibility to generate images using StableDiffusion… So, here is my Indatify’s Mattermost server, which I have been playing around with for the last few nights. Interacting with an LLM and generating images is way more playful in Mattermost than in Open WebUI or another TinyChat solution. So here you have an example of such integration. It is a regular Mattermost on-premise server:

Mattermost

First, we need to configure Mattermost to be able to host AI chatbots.

Configure Bot account

Enable bot account creation, which is disabled by default. Of
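For the image generation side, a bot ultimately has to call the Automatic1111 web API. The sketch below shows one way such a call could look; it is not the integration described in this article. It assumes the web UI was started with its API enabled and listens on http://127.0.0.1:7860, and the helper names are hypothetical.

```python
import base64
import json
import urllib.request

def build_txt2img_payload(prompt, steps=20, width=512, height=512):
    """Minimal JSON body for the /sdapi/v1/txt2img endpoint."""
    return {"prompt": prompt, "steps": steps, "width": width, "height": height}

def decode_first_image(response_body: str) -> bytes:
    """The API returns base64-encoded images in the 'images' list."""
    data = json.loads(response_body)
    return base64.b64decode(data["images"][0])

def generate_image(prompt, base_url="http://127.0.0.1:7860", timeout=300):
    """Send the prompt to Automatic1111 and return the raw image bytes.
    base_url is an assumption; adjust it to your own installation."""
    request = urllib.request.Request(
        f"{base_url}/sdapi/v1/txt2img",
        data=json.dumps(build_txt2img_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return decode_first_image(response.read().decode("utf-8"))
```

The returned bytes can then be saved to disk or uploaded to a Mattermost channel as a file attachment by the bot account.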

Configuring NVIDIA RTX A6000 ADA in Ubuntu 22

2025-03-25 1 Min Reading

I thought that installing an NVIDIA RTX A6000 ADA in a default Ubuntu 22 server installation would be an easy one. However, installing drivers from the repository did no good. I checked whether secure boot was enabled, and it was disabled.

We need to install a few things first:
We need to get rid of previously installed drivers:
Verify that secure boot is disabled:
Get the NVIDIA driver, such as NVIDIA-Linux-x86_64-535.216.01.run, from their website and install it:

Once you have got rid of previously installed drivers, disabled secure boot and installed build tools, kernel headers… you will be good to go to compile

“You’re trying to frame the request as a documentary photograph”

2025-03-22 / 2025-03-23 4 Min Reading

LLMs contain built-in policies for protecting minors, animals etc. A monkey eating a sausage should be against policy. But the model can be fooled: eventually it stops complaining and describes what we want. Tried: to generate funny/controversial pictures. The actual image generation takes place in Stable Diffusion and not in those conversational LLMs. However, once asked to generate something dubious or funny, they tend to reject such requests, hiding behind their policies.

Refusals from nexusraven and granite3-dense

First I asked for a Proboscis Monkey holding a can of beer and eating a sausage. The LLM called nexusraven refused that request: nexusraven: I cannot fulfill