Lincoln Macedo
Data Engineer · Analytics Engineer

Building production-grade data pipelines and modern analytics infrastructure.

5+ years designing ingestion pipelines, dbt transformation layers, and serving APIs across BigQuery, Snowflake, Databricks, and AWS. I care about clean architecture, reliability, and meaningful data.

Python · SQL · BigQuery · Databricks · GCP · AWS · dbt · Airflow
01 — About

Lincoln Macedo

Data Engineer and Analytics Engineer based in Brazil. I design and build production-grade data systems — from raw ingestion to transformation layers and serving APIs — across GCP, Snowflake, Databricks, and AWS.

I write Python and SQL professionally, model data with dbt, orchestrate with Airflow, and stream events with Kafka. I care about reliability, clean architecture, and documentation that outlasts the code.

Currently building an open-source data engineering portfolio across multiple platforms, documenting modern data stack patterns end-to-end.

Location
Brazil → Europe
Available for
Remote & International
Focus
Data Engineering
Experience
5+ years
02 — Stack

Technologies

Tools and platforms I work with professionally and in open-source projects.

Languages
Python · SQL
Platforms
BigQuery · Databricks · Snowflake · GCP · AWS
Tools
dbt · Airflow · Kafka · Terraform · Docker · PySpark
03 — Work

Portfolio Projects

A multi-platform data engineering portfolio — ingestion, transformation, serving, and streaming patterns across GCP, Snowflake, Databricks, and AWS.

ingestion

E-commerce Ingestion

In Progress

Ingestion pipeline loading the Olist Brazilian E-commerce dataset — 9 CSV files, ~100K orders — into BigQuery raw tables. Built in Python with retry logic and schema validation.

Python · BigQuery · GCP · Docker
ingestion-bigquery-ecommerce
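
The repo is still in progress, but the load pattern is stable enough to sketch. Below is a minimal, hypothetical version of the retry-and-validate step using the google-cloud-bigquery client; the project id, file name, and trimmed three-column schema are illustrative assumptions, not taken from the repository.

```python
# Hypothetical sketch: CSV -> BigQuery raw table with explicit schema and retries.
import time

from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # assumed project id

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,
    # Explicit schema instead of autodetect: a drifted or reordered column
    # fails the load instead of silently landing with the wrong type.
    schema=[
        bigquery.SchemaField("order_id", "STRING", mode="REQUIRED"),
        bigquery.SchemaField("customer_id", "STRING", mode="REQUIRED"),
        bigquery.SchemaField("order_purchase_timestamp", "TIMESTAMP"),
    ],
    write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
)

def load_with_retry(path: str, table_id: str, attempts: int = 3):
    """Load one CSV into a raw table, retrying transient failures."""
    for attempt in range(1, attempts + 1):
        try:
            with open(path, "rb") as f:
                job = client.load_table_from_file(f, table_id, job_config=job_config)
            return job.result()  # raises on bad rows or schema mismatch
        except Exception:
            if attempt == attempts:
                raise
            time.sleep(2 ** attempt)  # exponential backoff before retrying

load_with_retry("olist_orders_dataset.csv", "my-project.raw.orders")
```

The explicit schema is what makes the validation real: drift shows up as a failed job, not as bad data downstream.
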
transform

E-commerce Analytics

In Progress

dbt transformation layer on BigQuery. Staging models with type casting and null handling, intermediate joins, and fact/dimension marts for e-commerce analytics.

dbt · SQL · BigQuery · GCP
transform-bigquery-ecommerce
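
The models themselves are dbt SQL; as a rough illustration of the staging-to-mart flow, here is a hypothetical dbt Python model (dbt-bigquery supports these, executed on Dataproc) expressing the same kind of fact-table build. Model and column names are guesses based on the public Olist schema, not the repo's actual models.

```python
# models/marts/fct_orders.py -- hypothetical dbt Python model; the repo's marts
# are presumably plain SQL, this only sketches the staging -> mart join.
def model(dbt, session):
    dbt.config(materialized="table")

    orders = dbt.ref("stg_orders")      # staged: types cast, nulls handled
    payments = dbt.ref("stg_payments")  # staged payment lines

    # Fact grain: one row per order, with the summed payment value.
    return (
        orders.join(payments, on="order_id", how="left")
              .groupBy("order_id", "customer_id", "order_purchase_timestamp")
              .agg({"payment_value": "sum"})
              .withColumnRenamed("sum(payment_value)", "total_payment_value")
    )
```
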
ingestion

Retail Data Warehouse

In Progress

Snowpark Python pipeline loading the Online Retail II dataset (~1M invoice rows) into Snowflake's RAW schema. Python executes natively inside Snowflake, so there is no external compute cluster to provision or manage.

Python · Snowpark · Snowflake
ingestion-snowflake-retail
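
A hedged sketch of the Snowpark load, assuming the CSV has already been PUT onto a stage named @raw_stage; the connection parameters, stage, and table names are placeholders, not the repo's actual configuration.

```python
from snowflake.snowpark import Session
from snowflake.snowpark.types import (
    DecimalType, IntegerType, StringType, StructField, StructType, TimestampType,
)

# Placeholder connection; in practice these come from env vars or a secrets store.
session = Session.builder.configs({
    "account": "<account>", "user": "<user>", "password": "<password>",
    "warehouse": "COMPUTE_WH", "database": "RETAIL", "schema": "RAW",
}).create()

# Explicit schema mirroring the Online Retail II columns.
invoice_schema = StructType([
    StructField("INVOICE", StringType()),
    StructField("STOCK_CODE", StringType()),
    StructField("DESCRIPTION", StringType()),
    StructField("QUANTITY", IntegerType()),
    StructField("INVOICE_DATE", TimestampType()),
    StructField("PRICE", DecimalType(10, 2)),
    StructField("CUSTOMER_ID", StringType()),
    StructField("COUNTRY", StringType()),
])

df = (session.read.schema(invoice_schema)
      .option("SKIP_HEADER", 1)
      .csv("@raw_stage/online_retail_ii.csv"))
df.write.mode("overwrite").save_as_table("INVOICES")  # lands as RETAIL.RAW.INVOICES
```
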
transform

Logistics Data Platform

Planned

PySpark medallion architecture (bronze → silver → gold) on Databricks Community Edition. Processing NYC Yellow Taxi trip data — ~3M rows/month in Parquet format.

PySpark · Databricks · Spark SQL · Delta Lake
transform-databricks-logistics
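
The project is still planned, but the medallion hops are predictable enough to sketch. This assumes a Databricks notebook; paths, table names, and the quality filters are illustrative, and the gold layer (aggregated reporting tables) is omitted.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("logistics-medallion").getOrCreate()

# Bronze: land the monthly Parquet file as-is, adding only lineage metadata.
bronze = (spark.read.parquet("/FileStore/raw/yellow_tripdata_2024-01.parquet")
               .withColumn("_ingested_at", F.current_timestamp()))
bronze.write.format("delta").mode("append").saveAsTable("bronze_yellow_trips")

# Silver: deduplicate and drop physically impossible trips.
silver = (spark.table("bronze_yellow_trips")
               .dropDuplicates()
               .filter((F.col("trip_distance") > 0) & (F.col("fare_amount") >= 0)))
silver.write.format("delta").mode("overwrite").saveAsTable("silver_yellow_trips")
```
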
streaming

Real-time Order Streaming

Planned

PySpark Structured Streaming consuming Kafka topics with simulated order events. Continuous pipeline writing to BigQuery and Delta Lake, running locally via Docker Compose.

PySpark · Kafka · BigQuery · Docker
streaming-kafka-ecommerce
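
The shape of the planned consumer, as a sketch: a local broker on localhost:9092, an "orders" topic, and a four-field JSON event schema are all assumptions. Only the Delta Lake leg is shown; the BigQuery leg would be a second sink.

```python
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import DoubleType, StringType, StructField, StructType, TimestampType

spark = SparkSession.builder.appName("order-streaming").getOrCreate()

# Assumed shape of the simulated order events.
event_schema = StructType([
    StructField("order_id", StringType()),
    StructField("customer_id", StringType()),
    StructField("amount", DoubleType()),
    StructField("created_at", TimestampType()),
])

# Read the Kafka topic and unpack the JSON payload into columns.
orders = (spark.readStream.format("kafka")
          .option("kafka.bootstrap.servers", "localhost:9092")
          .option("subscribe", "orders")
          .option("startingOffsets", "earliest")
          .load()
          .select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
          .select("e.*"))

# Append continuously to a local Delta table; the checkpoint tracks offsets
# so the pipeline resumes where it left off after a restart.
query = (orders.writeStream.format("delta")
         .option("checkpointLocation", "/tmp/checkpoints/orders")
         .outputMode("append")
         .start("/tmp/delta/orders"))
query.awaitTermination()
```
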
serving

Analytics REST API

Planned

FastAPI application exposing BigQuery mart tables via REST endpoints. Deployed on Cloud Run with an async BigQuery client, pagination, and auto-generated OpenAPI documentation.

Python · FastAPI · BigQuery · Cloud Run
serving-bigquery-ecommerce
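
A minimal sketch of one endpoint; the route, mart table, and query are hypothetical. The synchronous BigQuery client is used here for brevity, while the planned repo targets an async pattern.

```python
from fastapi import FastAPI, Query
from google.cloud import bigquery

app = FastAPI(title="Analytics API")
client = bigquery.Client()  # sync client for brevity; the plan calls for async

@app.get("/orders/daily")
def daily_orders(limit: int = Query(100, le=1000), offset: int = 0):
    # Parameterized query against a hypothetical mart table.
    sql = """
        SELECT order_date, order_count, total_revenue
        FROM `my-project.marts.fct_daily_orders`
        ORDER BY order_date DESC
        LIMIT @limit OFFSET @offset
    """
    job = client.query(sql, job_config=bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("limit", "INT64", limit),
            bigquery.ScalarQueryParameter("offset", "INT64", offset),
        ],
    ))
    return [dict(row.items()) for row in job.result()]
```

FastAPI derives the OpenAPI schema from the route signatures, so the documented pagination parameters come for free at /docs.
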

Each repository follows a strict naming convention: {layer}-{platform}-{domain}. Ingestion, transform, and serving deploy independently — different CI/CD, different cadence, different ownership.

04 — Contact

Get in touch

Open to remote data engineering roles and international opportunities — especially in Europe. Feel free to reach out.