Month: September 2024

Technology

Casssandra: the introduction

Distributed, partitioned, multi-master, increment horizontal scale-out NoSQL data manegement system for global mission-critical use cases handling petabyte sized datasets Dynamo, Amazon, Facebook, Apache… “Reliability at massive scale is one of the biggest challenges we face at Amazon“ source: “Dynamo: amazon’s highly available key-value store” In 2004 there have been performance issues in Amazon e-commerce handling due to high traffic. By 2007 the concept of Dynamo has been materialized as Amazon S3. Then in 2008 Facebook with co-authors of Amazon Dynamo developed its own distributed NoSQL system, Cassandra. In 2009 Cassandra became Apache’s project. However, Amazon’s DynamoDB came in 2012, it

Technology

PostgreSQL manual partitioning

Have you ever wondered how many tables can we create and use in PostgreSQL database server? Shall we call them partitions or shards? Why not to use built-in “automatic” partitioning? Partitions or shards? Lets first define the difference between partitions and shards. Partitions are placed on the same server, but shards can be spread across various machines. We can use inheritance or more recent “automatic” partitioning. However both of these solutions lead to tight join with PostgreSQL RDBMS, which in some situations we would like to avoid. Imagine a perspective of migrating our schemas to different RDBMS like Microsoft SQL

uncategorized

About me

Hello. I’m Michael, I’m IT professional and enthusiast. I graduated computer programming as well as economics at Szkoła Główna Handlowa in Warsaw. I’m author of 5 books concerning software design, development, quality assurance and performance. Since 2005 I have been working with many different companies providing them with various aspects of software development process. I’m specifically interested in corporate architecture but also in bootstraping new startup ideas. My motto is “getting things done” and I like to learn new things. Would you like to build something special? Contact me.

AI/ML

Block AI web-scrapers from stealing your website content

Did you know that you may block AI-related web-scrapers from downloading your whole websites and actually stealing your content. This way LLM models will need to have different data source for learning process! Why you may ask? First of all, AI companies make money on their LLM, so using your content without paying you is just stealing. It applies for texts, images and sounds. It is intellectual property which has certain value. Long time ago I placed on my website a license “Attribution-NonCommercial-NoDerivatives” and guest what… it does not matter. I did not receive any attribution. Dozens of various bot

Technology

How to build computer inside computer?

Even wondered how computer is built? And no, I’m not talking about unscrewing your laptop… but exactly how the things happen inside the CPU. If so, then check out TINA from Texas Instruments and open my custom-made all-in-one computer. I spend few weeks preparing this schematic. It contains clock, program counter, memory address register, RAM, ALU, A&B registers, instruction register, microcode decoder, instruction register, address register and program counter. Well that’s a lot ot stuff you need to build 8-bit data and 4-bit address computer, even in simulator. Sample program in my assembly + binary representation, which needs to be

AI/ML

BLOOM LLM: how to use?

Asking BLOOM-560M “what is love?” it replies with “The woman who had my first kiss in my life had no idea that I was a man”. wtf?! Intro I’ve been into parallel computing since 2021, playing with OpenCL (you can read about it here), looking for maximizing devices capabilities. I’ve got pretty decent in-depth knowledge about how computational process works on GPUs and I’m curious how the most recent AI/ML/LLM technology works. And here you have my little introduction to LLM topic from practical point-of-view. Course of Action What is BLOOM? It is a BigScience Large Open-science Open-access Multilingual language

Technology

Big-tech cloud vs competitors: price-wise

Imagine you would like run 2 x vCPU and 4 GB RAM virtual machine. Which service provider do you choose? Azure, AWS or Hetzner? With AWS you pay 65 USD (c5.large instance… and by the way why this is called large at all?). You pick Microsoft Azure you pay 36 Euro (B2s). If you would pick DigitealOcean then you pay 24 USD (noname “droplet”). Choosing Scaleway you pay 19 Euro (PLAY2-NANO compute instance in Warsaw DC). However, with Hetzner Cloud you pay as little as 4.51 Euro (CX22 virtual server). How is that even possible? So it goes like this

Technology

Who’s got the biggest load average?

Ever wondered what can be the highest load average on the unix-like system? Do we even know what this parameter tells about? It shows the average number of either actively running or waiting processes. It should be close to the number of logical processors present on the system, otherwise, in case it is greater than this, some things will need to wait in order to be executed. So I was testing 1000 LXC containers on the 2 x 6 core Xeon system (totalling as 24 logical processors) and leave it for a while. Once I got back I saw that

Security

Make Opera browser great again!

Get rid of ads and stop sending your data for free to Opera Little history My favourite browser was Opera for so many years. Between 2000 and 2005 it was adware showing, well… ads. In 2005 ads have been remove as the financing came from Google, Opera’s default search engine. In 2013 Opera dropped its own rendering engine in favor to Chromium. In 2023 Opera gets some AI features. What is all about? I still like Opera. It has this great multi workspace feature, battery saving mode and in general it is much more capable of running plenty of tabs