Posts by Tag

Architecture

Programming language doesn’t matter

1 minute read

A few days ago I participated in a quick presentation of a significant e-commerce platform. The custom platform is implemented mostly in PHP and designed as scalab...

Release It! – book review

7 minute read

Recently I read the excellent book Release It! by Michael Nygard. The book is 7 years old, and I don’t know how I could have missed it until now.

The Twelve-Factor App – part 1

4 minute read

While studying “Micro Services” I found a comprehensive (but short) document about the Twelve-Factor App methodology for building software-as-a-service ap...

How to send email from JEE application

3 minute read

Sending email notifications from an enterprise application is a very common scenario. I know several methods to solve this puzzle; below you can find a short summa...

DDD Architecture Summary

5 minute read

In this blog post you can find my general rules for implementing a system using Domain Driven Design. Don’t use them blindly, but they are a good starting point for ...

Pure JEE or Spring Framework

5 minute read

During my career as a J2EE and JEE software developer I have tried to use pure JEE two or three times. And I decided not to repeat this exercise any mo...

Software Engineering

Spark and Spark Streaming unit testing

11 minute read

When you develop a distributed system, it’s crucial to make it easy to test. Execute tests in a controlled environment, ideally from your IDE. Long develop-t...

Release It! – book review

7 minute read

Recently I read the excellent book Release It! by Michael Nygard. The book is 7 years old, and I don’t know how I could have missed it until now.

How to document your professional experiences

1 minute read

Have you considered what’s important for a prospective employer? What’s the most valuable information source about your professional experience? How to docume...

Code coverage for managers and developers

2 minute read

From time to time, people ask me what code coverage by tests should be. Does 60% mean that the project is healthy? Or maybe the goal should be 70% or 80%?

How to convince your manager to adopt Git

4 minute read

Distributed Version Control Systems (DVCSs) like Git or Mercurial have changed software delivery processes significantly. I would not want to go back to t...

Apache Spark

GCP Dataproc and Apache Spark tuning

8 minute read

Dataproc is a fully managed and highly scalable Google Cloud Platform service for running Apache Spark. However, “managed” doesn’t relieve you from the prope...

Apache BigData Europe 2016

10 minute read

Last week I attended Apache Big Data Europe held in Sevilla, Spain. The event concentrates on big data projects under the Apache Foundation umbrella. Below...

Spark and Kafka integration patterns – part 2

21 minute read

In the world beyond batch, streaming data processing is the future of big data. Regardless of the streaming framework used for data processing, tight integrati...

Spark and Kafka integration patterns – part 1

less than 1 minute read

I published a post on the allegro.tech blog about how to integrate Spark Streaming and Kafka. In the blog post you will find how to avoid java.io.NotSerializableExc...

Spark and Spark Streaming unit testing

11 minute read

When you develop a distributed system, it’s crucial to make it easy to test. Execute tests in a controlled environment, ideally from your IDE. Long develop-t...

Scala

Stream processing – part 2

23 minute read

This is the second part of the stream processing blog post series. In the first part I presented aggregations in fixed, non-overlapping windows. Now you wi...

Stream processing – part 1

17 minute read

This is the very first part of the stream processing blog post series. From the series you will learn how to develop and test stateful streaming data pipelin...

Kafka Streams DSL vs processor API

29 minute read

Kafka Streams is a Java library for building real-time, highly scalable, fault-tolerant, distributed applications. The library is fully integrated with Kaf...

Spark and Kafka integration patterns – part 2

21 minute read

In the world beyond batch, streaming data processing is the future of big data. Regardless of the streaming framework used for data processing, tight integrati...

Spark and Kafka integration patterns – part 1

less than 1 minute read

I published a post on the allegro.tech blog about how to integrate Spark Streaming and Kafka. In the blog post you will find how to avoid java.io.NotSerializableExc...

Spark and Spark Streaming unit testing

11 minute read

When you develop a distributed system, it’s crucial to make it easy to test. Execute tests in a controlled environment, ideally from your IDE. Long develop-t...

Stream Processing

Apache Beam SQL

36 minute read

If you are a BigData engineer who develops batch data pipelines, you might often hear that stream processing is the future. It unlocks the full potential of...

Stream processing – part 2

23 minute read

This is the second part of the stream processing blog post series. In the first part I presented aggregations in fixed, non-overlapping windows. Now you wi...

Stream processing – part 1

17 minute read

This is the very first part of the stream processing blog post series. From the series you will learn how to develop and test stateful streaming data pipelin...

Kafka Streams DSL vs processor API

29 minute read

Kafka Streams is a Java library for building real-time, highly scalable, fault-tolerant, distributed applications. The library is fully integrated with Kaf...

Spark and Kafka integration patterns – part 2

21 minute read

In the world beyond batch, streaming data processing is the future of big data. Regardless of the streaming framework used for data processing, tight integrati...

Spark and Kafka integration patterns – part 1

less than 1 minute read

I published a post on the allegro.tech blog about how to integrate Spark Streaming and Kafka. In the blog post you will find how to avoid java.io.NotSerializableExc...

GCP

Apache Beam SQL

36 minute read

If you are a BigData engineer who develops batch data pipelines, you might often hear that stream processing is the future. It unlocks the full potential of...

Apache Beam Summit 2022

15 minute read

Last week I virtually attended Apache Beam Summit 2022 held in Austin, Texas. The event concentrates on Apache Beam and runners like Dataflow, Flink...

GCP Dataproc and Apache Spark tuning

8 minute read

Dataproc is a fully managed and highly scalable Google Cloud Platform service for running Apache Spark. However, “managed” doesn’t relieve you from the prope...

GCP Cloud Composer and Apache Airflow tuning

15 minute read

I would love to only develop streaming pipelines but in reality some of them are still batch oriented. Today you will learn how to properly configure Google ...

Performance

GCP Dataproc and Apache Spark tuning

8 minute read

Dataproc is a fully managed and highly scalable Google Cloud Platform service for running Apache Spark. However, “managed” doesn’t relieve you from the prope...

GCP Cloud Composer and Apache Airflow tuning

15 minute read

I would love to only develop streaming pipelines but in reality some of them are still batch oriented. Today you will learn how to properly configure Google ...

Kafka Streams DSL vs processor API

29 minute read

Kafka Streams is a Java library for building real-time, highly scalable, fault-tolerant, distributed applications. The library is fully integrated with Kaf...

Artifactory Performance Tuning

4 minute read

A few years ago I participated in Kirk Pepperdine’s Java performance tuning training. One of the greatest technical trainings I have ever attended! And also g...

Apache Kafka

Kafka Streams DSL vs processor API

29 minute read

Kafka Streams is a Java library for building real-time, highly scalable, fault-tolerant, distributed applications. The library is fully integrated with Kaf...

Apache BigData Europe 2016

10 minute read

Last week I attended Apache Big Data Europe held in Sevilla, Spain. The event concentrates on big data projects under the Apache Foundation umbrella. Below...

Spark and Kafka integration patterns – part 2

21 minute read

In the world beyond batch, streaming data processing is the future of big data. Regardless of the streaming framework used for data processing, tight integrati...

Spark and Kafka integration patterns – part 1

less than 1 minute read

I published a post on the allegro.tech blog about how to integrate Spark Streaming and Kafka. In the blog post you will find how to avoid java.io.NotSerializableExc...

Spring

How to send email from JEE application

3 minute read

Sending email notifications from an enterprise application is a very common scenario. I know several methods to solve this puzzle; below you can find a short summa...

DDD Architecture Summary

5 minute read

In this blog post you can find my general rules for implementing a system using Domain Driven Design. Don’t use them blindly, but they are a good starting point for ...

Pure JEE or Spring Framework

5 minute read

During my career as a J2EE and JEE software developer I have tried to use pure JEE two or three times. And I decided not to repeat this exercise any mo...

Apache Beam

Apache Beam SQL

36 minute read

If you are a BigData engineer who develops batch data pipelines, you might often hear that stream processing is the future. It unlocks the full potential of...

Apache Beam Summit 2022

15 minute read

Last week I virtually attended Apache Beam Summit 2022 held in Austin, Texas. The event concentrates on Apache Beam and runners like Dataflow, Flink...

Stream processing – part 2

23 minute read

This is the second part of the stream processing blog post series. In the first part I presented aggregations in fixed, non-overlapping windows. Now you wi...

Stream processing – part 1

17 minute read

This is the very first part of the stream processing blog post series. From the series you will learn how to develop and test stateful streaming data pipelin...

Java

Development Environment Setup

3 minute read

This document is a manual on how to configure a flexible development environment for Java, JavaScript, Ruby and Python - my primary set of tools. Even if the runt...

Pure JEE or Spring Framework

5 minute read

During my career as a J2EE and JEE software developer I have tried to use pure JEE two or three times. And I decided not to repeat this exercise any mo...

Artifactory Performance Tuning

4 minute read

A few years ago I participated in Kirk Pepperdine’s Java performance tuning training. One of the greatest technical trainings I have ever attended! And also g...

Linux

Development Environment Setup

3 minute read

This document is a manual on how to configure a flexible development environment for Java, JavaScript, Ruby and Python - my primary set of tools. Even if the runt...

Virtual Box VDI maintenance

1 minute read

Virtual Disk Image (VDI) is a Virtual Box container format for guest hard disks. I found that VDI files on the host system grow over time. If your VDI fi...

Artifactory Performance Tuning

4 minute read

A few years ago I participated in Kirk Pepperdine’s Java performance tuning training. One of the greatest technical trainings I have ever attended! And also g...

Dataproc

GCP Dataproc and Apache Spark tuning

8 minute read

Dataproc is a fully managed and highly scalable Google Cloud Platform service for running Apache Spark. However, “managed” doesn’t relieve you from the prope...

Git

GitFlow step by step

4 minute read

Git Flow is a mainstream process for branch-per-feature development. Git Flow is the best method I’ve found for managing a project developed by small to medi...

How to convince your manager to adopt Git

4 minute read

Distributed Version Control Systems (DVCSs) like Git or Mercurial have changed software delivery processes significantly. I would not want to go back to t...

Node.js

Development Environment Setup

3 minute read

This document is a manual on how to configure a flexible development environment for Java, JavaScript, Ruby and Python - my primary set of tools. Even if the runt...

DDD

DDD Architecture Summary

5 minute read

In this blog post you can find my general rules for implementing a system using Domain Driven Design. Don’t use them blindly, but they are a good starting point for ...

Apache Hadoop

Apache BigData Europe 2016

10 minute read

Last week I attended Apache Big Data Europe held in Sevilla, Spain. The event concentrates on big data projects under the Apache Foundation umbrella. Below...

Dataflow

Apache Beam SQL

36 minute read

If you are a BigData engineer who develops batch data pipelines, you might often hear that stream processing is the future. It unlocks the full potential of...

Python

Development Environment Setup

3 minute read

This document is a manual on how to configure a flexible development environment for Java, JavaScript, Ruby and Python - my primary set of tools. Even if the runt...

Ruby

Development Environment Setup

3 minute read

This document is a manual on how to configure a flexible development environment for Java, JavaScript, Ruby and Python - my primary set of tools. Even if the runt...

JBehave

PHP

Programming language doesn’t matter

1 minute read

A few days ago I participated in a quick presentation of a significant e-commerce platform. The custom platform is implemented mostly in PHP and designed as scalab...

Cloud Composer

GCP Cloud Composer and Apache Airflow tuning

15 minute read

I would love to only develop streaming pipelines but in reality some of them are still batch oriented. Today you will learn how to properly configure Google ...

Apache Airflow

GCP Cloud Composer and Apache Airflow tuning

15 minute read

I would love to only develop streaming pipelines but in reality some of them are still batch oriented. Today you will learn how to properly configure Google ...

BigQuery

Terraform

SQL

Apache Beam SQL

36 minute read

If you are a BigData engineer who develops batch data pipelines, you might often hear that stream processing is the future. It unlocks the full potential of...
