Google Coral TPU and TensorRT (Frigate + NVIDIA GPU/TensorRT)

These are the two major options for running object detection models. The Google Coral TPU is a physical accelerator, commonly available as a USB stick. TensorRT is NVIDIA's inference runtime for its GPUs. Both can run detection models.

Coral TPU:

And TensorRT:

Compute Capability requirements

CC 5.0 is required to run DeepStack and TensorRT, but CC 7.0 is needed to run Ollama with moondream:1.8b. Even a GPU with CC 5.0, the stated minimum for TensorRT, may not be enough due to minor implementation differences, so it is better to run on a GPU with a higher CC. Moreover, a CC 5.0 card is an older GPU, which means performance degrades even with as few as 2 or 3 camera feeds to analyze.

Running popular TensorRT detection models requires little VRAM, about 300 – 500 MB, but it needs plenty of GPU cores running at high clocks. In other words, you can fit those models on older GPUs, but they will not perform well.

The other side of the story is running Ollama for GenAI, which requires CC 7.0 or higher. Ollama with moondream:1.8b, the smallest available vision model, still needs a little more than 3 GB of VRAM.
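
If you are unsure what your card offers, the check below is a minimal sketch using the pynvml bindings (the nvidia-ml-py package); it assumes an NVIDIA driver is loaded and reports the Compute Capability and free VRAM of the first GPU:

import pynvml

# Minimal sketch: report Compute Capability and free VRAM of GPU 0 via NVML.
pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)
name = pynvml.nvmlDeviceGetName(handle)
major, minor = pynvml.nvmlDeviceGetCudaComputeCapability(handle)
mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
print(f"{name}: CC {major}.{minor}, free VRAM {mem.free / 1024**2:.0f} MB")
pynvml.nvmlShutdown()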

TensorRT on GeForce 940MX

You can run Frigate's TensorRT object detector on an NVIDIA GeForce 940MX with CC 5.0, but it gets hot the moment you launch it. It ran on driver 550 with CUDA 12.4 as follows, with only one RTSP camera feed:

So this is not a viable option, as it could quickly burn out this laptop GPU. Configuration for TensorRT:

detectors:
  tensorrt:
    type: tensorrt
    device: 0

model:
  path: /config/model_cache/tensorrt/yolov7-320.trt
  input_tensor: nchw
  input_pixel_format: rgb
  width: 320
  height: 320

To start the Docker container you need to pass the YOLO_MODELS environment variable:

docker run -d \
  --name frigate \
  --restart=unless-stopped \
  --stop-timeout 30 \
  --mount type=tmpfs,target=/tmp/cache,tmpfs-size=1000000000 \
  --shm-size=1024m \
  --device /dev/bus/usb:/dev/bus/usb \
  --device /dev/dri/renderD128 \
  -v ./frigate-media:/media/frigate \
  -v ./frigate-config:/config \
  -v /etc/localtime:/etc/localtime:ro \
  -e FRIGATE_RTSP_PASSWORD='password' \
  -e YOLO_MODELS=yolov7x-640 \
  -p 8971:8971 \
  -p 8554:8554 \
  -p 8555:8555/tcp \
  -p 8555:8555/udp \
  --gpus all \
  ghcr.io/blakeblackshear/frigate:stable-tensorrt
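
Once the container is up, you can check that the TensorRT detector was actually loaded by querying Frigate's stats endpoint. This is a rough sketch only; it assumes you have also published Frigate's internal API port 5000 (for example with -p 5000:5000), which the command above does not do:

import requests

# Sketch: print each detector's reported inference speed from Frigate's stats API.
stats = requests.get("http://localhost:5000/api/stats", timeout=5).json()
for name, det in stats.get("detectors", {}).items():
    print(name, "inference_speed:", det.get("inference_speed"), "ms")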

Please note that the Docker image is different if you want to use the GPU with TensorRT (the stable-tensorrt tag) than without it. It is also not possible to run a hardware-accelerated FFmpeg decoder on the 940MX, so disable it by passing an empty array:

cameras:
  myname:
    enabled: true
    ffmpeg:
      inputs:
        - path: rtsp://user:pass@addr:port/main
          roles:
            - detect
            - record
      hwaccel_args: []

However, if you would like to try the hardware decoder with a different GPU or CPU, play with these values:

preset-vaapi
preset-nvidia

TensorRT on “modern” GPU

It is best to run TensorRT on a modern GPU with the highest possible CC feature set. It will run detection fast and will not get hot as quickly. Moreover, it will have hardware support for video decoding, and you could even run GenAI on the same machine.

So the minimum for object detection with GenAI descriptions is 4 GB of VRAM. In my case it is an NVIDIA RTX 3050 Ti Mobile, which runs at 25% utilization at most with 4 – 5 camera feeds.
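
For reference, the GenAI description part boils down to a single HTTP call against Ollama. The snippet below is a sketch only; it assumes Ollama is listening on its default port 11434, the moondream:1.8b model is already pulled, and snapshot.jpg is a hypothetical camera snapshot:

import base64
import requests

# Sketch: ask Ollama's moondream:1.8b to describe a camera snapshot.
with open("snapshot.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "moondream:1.8b",
        "prompt": "Describe what is happening in this image.",
        "images": [image_b64],
        "stream": False,
    },
    timeout=120,
).json()
print(resp["response"])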

Google Coral TPU USB module

To run the Coral detector:

detectors:
  coral:
    type: edgetpu
    device: usb

But first you need to install and configure it:

sudo apt install python3-pip python3-dev python3-venv libusb-1.0-0
echo "deb https://packages.cloud.google.com/apt coral-edgetpu-stable main" | sudo tee /etc/apt/sources.list.d/coral-edgetpu.list
curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
sudo apt update
sudo apt install libedgetpu1-std

You can also run the TPU in high-power mode:

sudo apt install libedgetpu1-max

And finally configure USB access permissions:

echo 'SUBSYSTEM=="usb", ATTR{idVendor}=="1a6e", GROUP="plugdev", MODE="0666"' | sudo tee /etc/udev/rules.d/99-edgetpu-accelerator.rules
sudo udevadm control --reload-rules && sudo udevadm trigger

Remember to connect the Coral via USB 3.0, as running it over USB 2.0 drops performance by a factor of 2 or even 3. Second thing: to run the Coral, first plug it in and wait until it is recognized by the system:

lsusb

At first you will not see Google but 1a6e Global Unichip Corp. After the TPU is initialized you will see 18d1 Google Inc.:
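
If you prefer to check this programmatically, here is a small sketch using pyusb (pip install pyusb) that looks for both vendor IDs; it assumes libusb is available on the system:

import usb.core

# Sketch: detect whether the Coral enumerates as uninitialized (Global Unichip)
# or initialized (Google Inc.).
if usb.core.find(idVendor=0x18d1) is not None:
    print("Coral TPU initialized (Google Inc.)")
elif usb.core.find(idVendor=0x1a6e) is not None:
    print("Coral TPU present but not initialized yet (Global Unichip Corp.)")
else:
    print("No Coral TPU found on USB")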

You can pass the Coral TPU through as a Proxmox USB device, but after each Proxmox restart you need to take care of the TPU initialization again:

AI video surveillance with DeepStack, Python and ZoneMinder

For those using ZoneMinder and trying to figure out how to detect objects, there is the deepquestai/deepstack AI model with a built-in HTTP server. You can grab video frames using the ZoneMinder API or the UI API:

https://ADDR/zm/cgi-bin/nph-zms?scale=100&mode=single&maxfps=30&monitor=X&user=XXX&pass=XXX

You need to specify the address, monitor ID, user and password. You can also request a single frame (mode=single) or a motion JPEG stream (mode=jpeg). ZoneMinder internally uses the /usr/lib/zoneminder/cgi-bin/nph-zms binary to grab frames from a configured IP ONVIF RTSP camera. It is probably not the quickest option, but it is a convenient one. Using OpenCV in Python you could also access the RTSP stream and grab frames manually, as sketched below. However, for the sake of simplicity I stay with ZoneMinder's nph-zms.
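
For completeness, grabbing a frame straight from the RTSP stream with OpenCV would look roughly like the sketch below; the stream URL is a placeholder:

import cv2

# Sketch: grab a single frame directly from the RTSP stream, bypassing ZoneMinder.
cap = cv2.VideoCapture("rtsp://user:pass@addr:port/main")
ok, frame = cap.read()
cap.release()
if ok:
    cv2.imwrite("frame.jpg", frame)
else:
    print("Unable to read frame from RTSP stream")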

So let's say I have the following video frame from a camera:

It is a simple view of a street with a concrete fence and some wooden boards across it. Now let's say I would like to detect passing objects. First we need to install the drivers and runtime and start the server.

NVIDIA drivers

In my testing setup I have an RTX 3050 Ti with 4 GB of VRAM running Ubuntu 22 LTS desktop. By default, CUDA 12.8+ drivers are not available there; you can get driver versions up to 550, and starting from 525 you get CUDA 12.x. This video card has the Ampere architecture with Compute Capability 8.6, which corresponds to CUDA 11.5 – 11.7.1. However, you can install driver 570.86.16, which comes with the CUDA 12.8 SDK.

sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
sudo apt install nvidia-driver-570

To check if it is already loaded:

lsmod | grep nvidia

Docker GPU support

A native, default Docker installation does not support direct GPU usage. According to DeepStack, you should run the following commands to configure the Docker NVIDIA runtime. However, ChatGPT suggests installing nvidia-container-toolkit instead. You can find a proper explanation of the differences here. At first glance it seems that those packages are related.

curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | \
sudo apt-key add -
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update
sudo apt-get install -y nvidia-docker2
sudo pkill -SIGHUP dockerd

As we installed a new driver and reconfigured Docker, it is a good moment to reboot. After rebooting the machine, check whether nvidia-smi reports the proper driver and CUDA SDK versions:

sudo docker run --gpus '"device=0"' nvidia/cuda:12.8.0-cudnn-devel-ubuntu22.04 nvidia-smi

This should print the output of nvidia-smi run from a Docker container based on the nvidia/cuda image. Please note that this image tag may differ a little bit, as it changes over time. You can adjust the --gpus flag in case you have more than one supported NVIDIA video card in your system.

deepquestai DeepStack

In order to run the DeepStack model and API server on the GPU, just use the gpu image tag and set the --gpus all flag. There is also the environment variable VISION-DETECTION set to True. You can probably configure other things such as face detection, but for now I will stick with only this one:

sudo docker run --rm --gpus all -e VISION-DETECTION=True -v localstorage:/datastore -p 80:5000 deepquestai/deepstack:gpu

Now you have a running DeepStack Docker container with GPU support. Let's look at the program code now.

My Vision AI source code

import requests
from PIL import Image
import urllib3
from io import BytesIO
import time
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

prefix = "data-xxx/xxx"
zmuser = "readonly"
zmpass = "readonly"
zmaddr = "x.x.x.x"
zmmoid = x
deepstackaddr = "localhost:80"

#
# DOWNLOAD AND SAVE ZONEMINDER VIDEO FRAME
#
timestamp = int(time.time() * 1000)  
url = "https://{}/zm/cgi-bin/nph-zms?scale=100&mode=single&maxfps=30&monitor={}&user={}&pass={}".format(zmaddr, zmmoid, zmuser, zmpass)
output_path = "{}_{}.jpg".format(prefix, timestamp)
response = requests.get(url, verify=False)
if response.status_code == 200:
    with open(output_path, "wb") as file:
        file.write(response.content) 
    print(f"Downloaded: {output_path}")
else:
    # stop here, the rest of the script needs the downloaded frame
    raise SystemExit("Unable to download video frame")

#
# AI ANALYSE VIDEO FRAME
#
image_data = open(output_path,"rb").read()
image = Image.open(output_path).convert("RGB")
response = requests.post("http://{}/v1/vision/detection".format(deepstackaddr),files={"image":image_data},data={"min_confidence":0.65}).json()

#
# PRINT RECOGNIZED AND PREDICTED OBJECTS
#
for obj in response["predictions"]:
    print(obj["label"])
print(response)

#
# CROP OBJECTS AND SAVE TO FILES
#
for i, obj in enumerate(response["predictions"]):
    label = obj["label"]
    y_max = int(obj["y_max"])
    y_min = int(obj["y_min"])
    x_max = int(obj["x_max"])
    x_min = int(obj["x_min"])
    cropped = image.crop((x_min, y_min, x_max, y_max))
    cropped.save("{}_{}_{}_{}_found.jpg".format(prefix, timestamp, i, label))

With this code we grab a ZoneMinder video frame, save it locally, pass it to the DeepStack API server for vision detection and finally get the predicted detections as text output as well as cropped images showing only the detected objects. For instance, the whole frame was as follows:

And the program automatically detected and cropped the following region:

There are several dozen types/classes of objects that this AI model can detect. It is already pretrained and, I think, closed in terms of further learning and correcting detections. I will investigate that matter further, of course. Maybe registration plate OCR?

Hetzner Cloud with pfSense gateway

Instead of using dedicated Hetzner servers, you can use their cloud offering for less demanding scenarios. The idea is to have one server for pfSense and another one for regular private use, hidden behind the gateway. It is quite a bit cheaper than renting a dedicated server, as a CX22 costs around 3 – 4 Euro per month compared to at least 50 Euro for a dedicated server; you can have up to 10 virtual cloud servers for the price of a single dedicated one. Choose cloud for "simple" (or rather straightforward) solutions, and dedicated servers for more complicated setups with a much higher demand for CPU and RAM.

I am assuming you run Ubuntu 22 or 24 locally.

Initial setup

First install brew. In case of missing Python packages, install them as system packages with the python- prefix.

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

Add brew to your shell configuration:

echo 'eval "$(/home/linuxbrew/.linuxbrew/bin/brew shellenv)"' >> /home/USER/.bashrc
eval "$(/home/linuxbrew/.linuxbrew/bin/brew shellenv)"

Then install hcloud:

brew install hcloud

hcloud is used to manage your Hetzner configuration from the command line. In order to use it, create an API token and create a context.

Create cloud servers and networks

Create servers:

hcloud server create --name router --type cx22 --image ubuntu-24.04
hcloud server create --name private-client --type cx22 --image ubuntu-24.04

Create private network:

hcloud network create --name nat-network --ip-range 10.0.0.0/16
hcloud network add-subnet nat-network --network-zone eu-central --type server --ip-range 10.0.0.0/24

And attach servers to this network:

hcloud server attach-to-network router --network nat-network --ip 10.0.0.2
hcloud server attach-to-network private-client --network nat-network

Now an important thing: in order to route private network traffic, you need to add a special 0.0.0.0/0 route pointing at the gateway. It seems this can only be done using hcloud, as the web UI does not allow it:

hcloud network add-route nat-network --destination 0.0.0.0/0 --gateway 10.0.0.2

pfSense installation

This is the second thing that is a little complicated if you want to do it via the web UI, so it is highly recommended to do it via hcloud. Download the pfSense package, decompress it and pass the ISO image as a parameter:

hcloud server attach-iso router pfSense-CE-2.7.2-RELEASE-amd64.iso
hcloud server reset router

Now do the regular pfSense installation and detach the ISO:

hcloud server detach-iso router
hcloud server reset router

After rebooting pfSense:

  • configure WAN as vtnet0
  • configure LAN as vtnet1
  • uncheck Block bogon networks
  • set LAN as DHCP
  • add a static route in System – Routing – Static routes for 10.0.0.0/16 via 10.0.0.1 on LAN
  • disable hardware checksum offload in System – Advanced – Networking
  • set NAT outbound to hybrid mode
  • add a WAN outbound NAT rule from source 10.0.0.0/16 to any destination, translated to the WAN address
  • change the LAN firewall catch-all rule source from LAN subnets to any

Routing traffic

Once you have your pfSense router set up, you need to point traffic from the private-client server to 10.0.0.1, which is Hetzner's default gateway in every private network. We defined earlier that we want to pass all traffic via router (10.0.0.2), but that routing happens in the Hetzner network, which passes traffic from private-client to router and then on to the final destination. So in such a configuration we do not point traffic directly at the pfSense gateway.

It is important to uncheck Block bogon networks, as our router will be passing private traffic on its WAN interface. Setting up WAN outbound NAT is also critical to make this routing work. Finally, pfSense adds a default LAN catch-all rule, which needs to be modified to apply to all sources and not only the LAN subnets, which allows routing traffic via the Hetzner gateway.

The aforementioned aspects go beyond a regular pfSense configuration as used on dedicated servers, because here we have a managed private network that runs by different rules, and you need to follow those rules if you want pfSense to act as the gateway.

Afterword

Of course, be sure to set up your own desired network addressing and naming conventions. If something is unclear, refer to the official Hetzner documentation, which can be found here. After configuring private-client you can disable public networking: switch the server off, unlink the public IP address and power it on again. Now, to access private-client, either use the Hetzner Cloud console or set up OpenVPN in your fresh pfSense installation.

Design and 3D-print wall switch in OpenSCAD

So we decided to try Zigbee wall switches, and many of the available brands work with Fibaro. But not all of them, and not in all configurations. Even if a switch or module works, it may only work with the Tuya gateway and application and not with the Fibaro gateway. So there are some blanks in this concept. We were left with a double frame holding one electronic switch and one blank for the shutters, as there is no compatible switch or module at this time. So I thought: maybe I'll just 3D-print it…

OpenSCAD code:

// flat base plate of the blank cover
cube(size=[4.9, 4.9, 0.2]);

// four cylindrical pegs on top of the plate ($fn=100 for smooth walls)
translate([1.9, 1.9, 0.2])
   cylinder(h=0.65, r=0.24, $fn=100);
translate([3.0, 1.9, 0.2])
   cylinder(h=0.65, r=0.24, $fn=100);

translate([1.9, 3.0, 0.2])
   cylinder(h=0.65, r=0.24, $fn=100);
translate([3.0, 3.0, 0.2])
   cylinder(h=0.65, r=0.24, $fn=100);

A few test designs:

Let’s print them:

And here we go!

Now it looks just fine.

NextCloud 26 update failed

I use NextCloud as an alternative to Google Photos. My smartphone sends pictures over OpenVPN to my NextCloud instance. However, for over a year the Android application has been showing a notification that the server version is outdated, and from time to time the automatic upload does not work. So I decided it was time to do an upgrade. It did not go well:

I tried to run the upgrade using the UI. It got stuck in various places, and finally I got this:

But there is a much simpler option: install a fresh copy of NextCloud. I prefer using the VM ISO (https://www.turnkeylinux.org/nextcloud) instead of LXC containers; I feel that an LXC container sometimes brings too many constraints and limitations, such as quirky backups, kernel features, etc. Once you have a clean install, copy the data from the original location to /var/www/nextcloud-data/ and run the following command:

sudo -u www-data php occ files:scan --all

Be sure to install sudo first (apt install sudo). That's all. I only use NextCloud for automatic photo and video uploads, so my case is quite simple. If you use various other integrations, calendars, mail, etc., then your migration scenario is much more complicated.

Please note that I tried fixing all the upgrade problems with permissions, missing files, etc., but after an hour it got permanently stuck, so I decided to go the easy way.

Elasticsearch & Kibana: version 8.x installation

At Indatify we use PostgreSQL most of the time when dealing with data. We tried Cassandra, but for now it is too constraining in such a dynamic data environment, so we put it on the shelf. However, for textual data we chose Elasticsearch, because we know it and it provides full-text search out of the box. Later we will come back to Cassandra, but with a more specific use case, as it requires a precise data model predefined by query patterns rather than by structure.

So, to install Elasticsearch (if running without sudo, run the commands as root):

wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg --dearmor -o /usr/share/keyrings/elasticsearch-keyring.gpg
sudo apt-get install apt-transport-https
echo "deb [signed-by=/usr/share/keyrings/elasticsearch-keyring.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-8.x.list
sudo apt-get update && sudo apt-get install elasticsearch

Now, for a local development environment (only!) we will disable xpack security:

sudo vi /etc/elasticsearch/elasticsearch.yml

Configure the following:

network.host: 0.0.0.0
xpack.security.enabled: false

And then:

sudo systemctl enable elasticsearch && sudo systemctl start elasticsearch

To verify that it works:

curl -XGET "localhost:9200"
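
Beyond the bare curl check, you can run a quick full-text search smoke test with the official Python client (pip install elasticsearch). This is just a sketch; the index name and document are made up for the example and assume the security-disabled local setup from above:

from elasticsearch import Elasticsearch

# Sketch: index one document and run a full-text match query against it.
es = Elasticsearch("http://localhost:9200")
es.index(index="smoke-test", id="1", document={"text": "quick brown fox"}, refresh=True)

hits = es.search(index="smoke-test", query={"match": {"text": "fox"}})
print(hits["hits"]["total"]["value"])  # expect 1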

Now, to install Kibana:

sudo apt install kibana

And then, configure it:

sudo vi /etc/kibana/kibana.yml

With the following:

server.port: 5601
server.host: "0.0.0.0"
elasticsearch.hosts: ["http://localhost:9200"]

Finally, start Kibana:

sudo systemctl enable kibana && sudo systemctl start kibana

We are good to go. Remember to allocate at least 8 GB of memory; 4 GB is too little. The more memory you have, the more of it Elasticsearch will use and the faster it will run, instead of loading data directly from the drives.