Lincoln Macedo
Data Engineer · Analytics Engineer

Building production-grade data pipelines and modern analytics infrastructure.

5+ years designing ingestion pipelines, dbt transformation layers, and serving APIs across BigQuery, Snowflake, Databricks, and AWS. I care about clean architecture, reliability, and meaningful data.

Python · SQL · BigQuery · Databricks · GCP · AWS · dbt · Airflow
01 — About

Lincoln Macedo

Data Engineer and Analytics Engineer based in Brazil. I design and build production-grade data systems — from raw ingestion to transformation layers and serving APIs — across GCP, Snowflake, Databricks, and AWS.

I write Python and SQL professionally, model data with dbt, orchestrate with Airflow, and stream events with Kafka. I care about reliability, clean architecture, and documentation that outlasts the code.

Currently building an open-source data engineering portfolio across multiple platforms, documenting modern data stack patterns end-to-end.

Location
Brazil → Europe
Available for
Remote & International
Focus
Data Engineering
Experience
5+ years
02 — Stack

Technologies

Tools and platforms I work with professionally and in open-source projects.

Languages
Python · SQL
Platforms
BigQuery · Databricks · Snowflake · GCP · AWS
Tools
dbt · Airflow · Kafka · Terraform · Docker · PySpark
03 — Work

Portfolio Projects

A multi-platform data engineering portfolio — ingestion, transformation, serving, and streaming patterns across GCP, Snowflake, Databricks, and AWS.

ingestion

E-commerce Ingestion

In Progress

Ingestion pipeline loading the Olist Brazilian E-commerce dataset — 9 CSV files, ~100K orders — into BigQuery raw tables. Built in Python with retry logic and schema validation.

Python · BigQuery · GCP · Docker
ingestion-bigquery-ecommerce
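
The repo is still in progress, but the load pattern is stable enough to sketch. Below is a minimal, hypothetical version of the retry-and-validate step using the google-cloud-bigquery client; the project id, file name, and trimmed three-column schema are illustrative assumptions, not taken from the repository.

```python
# Hypothetical sketch: CSV -> BigQuery raw table with explicit schema and retries.
import time

from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # assumed project id

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,
    # Explicit schema instead of autodetect: a drifted or reordered column
    # fails the load instead of silently landing with the wrong type.
    schema=[
        bigquery.SchemaField("order_id", "STRING", mode="REQUIRED"),
        bigquery.SchemaField("customer_id", "STRING", mode="REQUIRED"),
        bigquery.SchemaField("order_purchase_timestamp", "TIMESTAMP"),
    ],
    write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
)

def load_with_retry(path: str, table_id: str, attempts: int = 3):
    """Load one CSV into a raw table, retrying transient failures."""
    for attempt in range(1, attempts + 1):
        try:
            with open(path, "rb") as f:
                job = client.load_table_from_file(f, table_id, job_config=job_config)
            return job.result()  # raises on bad rows or schema mismatch
        except Exception:
            if attempt == attempts:
                raise
            time.sleep(2 ** attempt)  # exponential backoff before retrying

load_with_retry("olist_orders_dataset.csv", "my-project.raw.orders")
```

The explicit schema is what makes the validation real: drift shows up as a failed job, not as bad data downstream.
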
transform

E-commerce Analytics

In Progress

dbt transformation layer on BigQuery. Staging models with type casting and null handling, intermediate joins, and fact/dimension marts for e-commerce analytics.

dbt · SQL · BigQuery · GCP
transform-bigquery-ecommerce
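
The models themselves are dbt SQL; as a rough illustration of the staging-to-mart flow, here is a hypothetical dbt Python model (dbt-bigquery supports these, executed on Dataproc) expressing the same kind of fact-table build. Model and column names are guesses based on the public Olist schema, not the repo's actual models.

```python
# models/marts/fct_orders.py -- hypothetical dbt Python model; the repo's marts
# are presumably plain SQL, this only sketches the staging -> mart join.
def model(dbt, session):
    dbt.config(materialized="table")

    orders = dbt.ref("stg_orders")      # staged: types cast, nulls handled
    payments = dbt.ref("stg_payments")  # staged payment lines

    # Fact grain: one row per order, with the summed payment value.
    return (
        orders.join(payments, on="order_id", how="left")
              .groupBy("order_id", "customer_id", "order_purchase_timestamp")
              .agg({"payment_value": "sum"})
              .withColumnRenamed("sum(payment_value)", "total_payment_value")
    )
```
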
ingestion

Retail Data Warehouse

In Progress

Snowpark Python pipeline loading the Online Retail II dataset (~1M invoice rows) into Snowflake's RAW schema. Python executes natively inside Snowflake, so there is no external compute cluster to provision or manage.

Python · Snowpark · Snowflake
ingestion-snowflake-retail
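
A hedged sketch of the Snowpark load, assuming the CSV has already been PUT onto a stage named @raw_stage; the connection parameters, stage, and table names are placeholders, not the repo's actual configuration.

```python
from snowflake.snowpark import Session
from snowflake.snowpark.types import (
    DecimalType, IntegerType, StringType, StructField, StructType, TimestampType,
)

# Placeholder connection; in practice these come from env vars or a secrets store.
session = Session.builder.configs({
    "account": "<account>", "user": "<user>", "password": "<password>",
    "warehouse": "COMPUTE_WH", "database": "RETAIL", "schema": "RAW",
}).create()

# Explicit schema mirroring the Online Retail II columns.
invoice_schema = StructType([
    StructField("INVOICE", StringType()),
    StructField("STOCK_CODE", StringType()),
    StructField("DESCRIPTION", StringType()),
    StructField("QUANTITY", IntegerType()),
    StructField("INVOICE_DATE", TimestampType()),
    StructField("PRICE", DecimalType(10, 2)),
    StructField("CUSTOMER_ID", StringType()),
    StructField("COUNTRY", StringType()),
])

df = (session.read.schema(invoice_schema)
      .option("SKIP_HEADER", 1)
      .csv("@raw_stage/online_retail_ii.csv"))
df.write.mode("overwrite").save_as_table("INVOICES")  # lands as RETAIL.RAW.INVOICES
```
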
transform

Logistics Data Platform

Planned

PySpark medallion architecture (bronze → silver → gold) on Databricks Community Edition. Processing NYC Yellow Taxi trip data — ~3M rows/month in Parquet format.

PySpark · Databricks · Spark SQL · Delta Lake
transform-databricks-logistics
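
The project is still planned, but the medallion hops are predictable enough to sketch. This assumes a Databricks notebook; paths, table names, and the quality filters are illustrative, and the gold layer (aggregated reporting tables) is omitted.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("logistics-medallion").getOrCreate()

# Bronze: land the monthly Parquet file as-is, adding only lineage metadata.
bronze = (spark.read.parquet("/FileStore/raw/yellow_tripdata_2024-01.parquet")
               .withColumn("_ingested_at", F.current_timestamp()))
bronze.write.format("delta").mode("append").saveAsTable("bronze_yellow_trips")

# Silver: deduplicate and drop physically impossible trips.
silver = (spark.table("bronze_yellow_trips")
               .dropDuplicates()
               .filter((F.col("trip_distance") > 0) & (F.col("fare_amount") >= 0)))
silver.write.format("delta").mode("overwrite").saveAsTable("silver_yellow_trips")
```
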
streaming

Real-time Order Streaming

Planned

PySpark Structured Streaming consuming Kafka topics with simulated order events. Continuous pipeline writing to BigQuery and Delta Lake, running locally via Docker Compose.

PySpark · Kafka · BigQuery · Docker
streaming-kafka-ecommerce
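
The shape of the planned consumer, as a sketch: a local broker on localhost:9092, an "orders" topic, and a four-field JSON event schema are all assumptions. Only the Delta Lake leg is shown; the BigQuery leg would be a second sink.

```python
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import DoubleType, StringType, StructField, StructType, TimestampType

spark = SparkSession.builder.appName("order-streaming").getOrCreate()

# Assumed shape of the simulated order events.
event_schema = StructType([
    StructField("order_id", StringType()),
    StructField("customer_id", StringType()),
    StructField("amount", DoubleType()),
    StructField("created_at", TimestampType()),
])

# Read the Kafka topic and unpack the JSON payload into columns.
orders = (spark.readStream.format("kafka")
          .option("kafka.bootstrap.servers", "localhost:9092")
          .option("subscribe", "orders")
          .option("startingOffsets", "earliest")
          .load()
          .select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
          .select("e.*"))

# Append continuously to a local Delta table; the checkpoint tracks offsets
# so the pipeline resumes where it left off after a restart.
query = (orders.writeStream.format("delta")
         .option("checkpointLocation", "/tmp/checkpoints/orders")
         .outputMode("append")
         .start("/tmp/delta/orders"))
query.awaitTermination()
```
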
serving

Analytics REST API

Planned

FastAPI application exposing BigQuery mart tables via REST endpoints. Deployed on Cloud Run with an async BigQuery client, pagination, and auto-generated OpenAPI documentation.

Python · FastAPI · BigQuery · Cloud Run
serving-bigquery-ecommerce
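
A minimal sketch of one endpoint; the route, mart table, and query are hypothetical. The synchronous BigQuery client is used here for brevity, while the planned repo targets an async pattern.

```python
from fastapi import FastAPI, Query
from google.cloud import bigquery

app = FastAPI(title="Analytics API")
client = bigquery.Client()  # sync client for brevity; the plan calls for async

@app.get("/orders/daily")
def daily_orders(limit: int = Query(100, le=1000), offset: int = 0):
    # Parameterized query against a hypothetical mart table.
    sql = """
        SELECT order_date, order_count, total_revenue
        FROM `my-project.marts.fct_daily_orders`
        ORDER BY order_date DESC
        LIMIT @limit OFFSET @offset
    """
    job = client.query(sql, job_config=bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("limit", "INT64", limit),
            bigquery.ScalarQueryParameter("offset", "INT64", offset),
        ],
    ))
    return [dict(row.items()) for row in job.result()]
```

FastAPI derives the OpenAPI schema from the route signatures, so the documented pagination parameters come for free at /docs.
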

Each repository follows a strict naming convention: {layer}-{platform}-{domain}. Ingestion, transform, and serving deploy independently — different CI/CD, different cadence, different ownership.

04 — Contact

Get in touch

Open to remote data engineering roles and international opportunities — especially in Europe. Feel free to reach out.