That's the engineering spirit
L̶u̵m̶i̵n̷o̴u̶s̶m̶e̵n̵B̶l̵o̵g̵
(ノ◕ヮ◕)ノ*:・゚✧ ✧゚・: *ヽ(◕ヮ◕ヽ) helping robots conquer the earth and trying not to increase entropy using Python, Data Engineering and Machine Learning http://luminousmen.com License: CC BY-NC-ND 4.0
Charts
📊 Average post reach
📉 ERR % by day
📋 Posts by day
📎 Content types
Top posts (20 of 20)
It was a long year and you still hold on to my writing? Thank you - genuinely. Now, since you've made it this far, I want to give you a gift. You know, I'm a simple man - my favorite holiday is New Year, and if you check the calendar you can guess I'm a bit happier right now. I've been writing for a long time without giving much back to you, fellow reader - I assume a data engineer, maybe a future colleague. What I write is usually deeply technical stuff, occasional rants, sometimes practical ti...
Now we have a solution
This escalated quickly 😬 https://rentahuman.ai
Derby is officially dead. And while it won't make headlines like Hadoop or Spark, this is the quiet end of an era for Java-based DBMSs. A few things to reflect on:
- Born in 1997 (originally JBMS), one of the first DBMSs written in Java
- Adopted by IBM, then handed off to Apache and renamed Derby
- As of Oct 2025: no active maintainers, now in "read-only mode"

Here's a lesser-known fact: Derby was the default embedded metastore for Apache Hive. If you've ever spun up Hive locally, or messed wit...
I cached it. Spark evicted it. We are no longer friends. Anyway, here's what Spark caching actually does under the hood: https://luminousmen.substack.com/p/spark-caching-explained-what-really
BigQuery doesn't charge for results. It charges for the work you didn't realize it was doing: https://luminousmen.substack.com/p/why-your-5-second-bigquery-query
The real reason your Spark cluster is burning money: https://luminousmen.com/post/dive-into-spark-memory