Can You Build an ETL System Without Code?
Testing AWS Glue Studio + Amazon AppFlow for Real Data Pipelines
Introduction: Is No-Code Data Engineering Really Possible?
What if you could build an entire data pipeline — extracting data from a source system, transforming it, and loading it into cloud storage — without writing a single line of code?
For data engineers, that sounds like science fiction. We’re used to living in PySpark scripts, SQL notebooks, and shell commands. But with the rise of no-code data engineering, this idea is quickly becoming reality — especially with tools like Amazon AppFlow and AWS Glue Studio.
These AWS-native services claim to let you move and transform data visually, through drag-and-drop interfaces and point-and-click options.
In this blog post, we’ll test whether it’s truly possible to build a working ETL pipeline — entirely without code. We’ll use Salesforce as our data source, Amazon AppFlow to extract and load the data into S3, and AWS Glue Studio to transform and prepare it for analytics.
🎯 Real-World Use Case: Customer Data from Salesforce
Let’s say you work for a fast-growing e-commerce business.
Your sales and marketing teams use Salesforce CRM to manage all customer records — including names, emails, signup dates, and region information. They want this data to be available in S3, cleaned, transformed, and partitioned by region — so analysts can run queries using Athena and build dashboards in QuickSight.
But here’s the catch: the dataset includes PII (Personally Identifiable Information), and you need to mask sensitive fields like email and phone number. Plus, the marketing team doesn’t want to include inactive users in the final output.
Normally, this would require days of development work. But today, we’ll build this entire ETL flow without touching any code — just using AppFlow + Glue Studio.
Step 1: Extracting Data from Salesforce with Amazon AppFlow
Our first step is to bring customer data from Salesforce into AWS. Instead of writing a connector or using a third-party ETL tool, we use Amazon AppFlow — a no-code service that lets you securely transfer data between SaaS applications (like Salesforce) and AWS services (like S3, Redshift, EventBridge).
Connecting to Salesforce
Inside the AppFlow console:
1. Create a new flow.
2. Select Salesforce as the source system.
3. Authenticate using a secure OAuth-based login.
4. Choose the object we want to pull: in this case, the "Contact" or "Lead" object, which holds customer information.
Here’s a sample of the kind of data Salesforce stores:
| full_name | email | phone | status | signup_date | region |
|---|---|---|---|---|---|
| Priya Das | priya@email.com | 8882221111 | active | 2022-08-09 | India |
| Mark Brown | mark@email.com | 9876543210 | inactive | 2023-06-18 | EMEA |
| Sarah Paul | sarah@email.com | 1234567890 | active | 2024-12-01 | APAC |
No-Code Transformations in AppFlow
Without writing any code, we apply these built-in transformations:
- Field mapping: Rename `full_name` to `customer_name`.
- Filtering: Exclude users where `status = inactive`.
- Data masking: Automatically mask the `email` and `phone` fields for privacy.
Then, we choose Amazon S3 as the destination and pick a bucket like `s3://nocode-etl-demo/raw-zone`. The data can be delivered in CSV, JSON, or Parquet format.
Finally, we schedule the flow to run daily, ensuring fresh data lands in S3 every 24 hours — completely without writing code or setting up cron jobs.
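If you later want to capture this setup as code (for review or redeployment), the same flow can be expressed through the AppFlow API. Here is a minimal boto3 sketch, with loud caveats: the connection profile name follows this post's example, the field names mirror our sample table rather than Salesforce's real Contact schema, and the task list is simplified, so treat it as an outline and verify the details against the AppFlow API reference:

```python
from datetime import datetime, timedelta, timezone

import boto3

appflow = boto3.client("appflow")

appflow.create_flow(
    flowName="salesforce-contacts-daily",
    # Run on a daily schedule instead of a cron job
    triggerConfig={
        "triggerType": "Scheduled",
        "triggerProperties": {
            "Scheduled": {
                "scheduleExpression": "rate(1days)",
                "scheduleStartTime": datetime.now(timezone.utc) + timedelta(minutes=10),
            }
        },
    },
    # Source: the Salesforce connection created via OAuth in the console
    sourceFlowConfig={
        "connectorType": "Salesforce",
        "connectorProfileName": "my-salesforce-connection",  # hypothetical name
        "sourceConnectorProperties": {"Salesforce": {"object": "Contact"}},
    },
    # Destination: Parquet files in the raw zone of our demo bucket
    destinationFlowConfigList=[{
        "connectorType": "S3",
        "destinationConnectorProperties": {
            "S3": {
                "bucketName": "nocode-etl-demo",
                "bucketPrefix": "raw-zone",
                "s3OutputFormatConfig": {"fileType": "PARQUET"},
            }
        },
    }],
    tasks=[
        # Projection: the fields to pull (names follow the sample table above)
        {
            "sourceFields": ["full_name", "email", "phone", "status",
                             "signup_date", "region"],
            "taskType": "Filter",
            "connectorOperator": {"Salesforce": "PROJECTION"},
        },
        # Field mapping: rename full_name to customer_name
        {
            "sourceFields": ["full_name"],
            "destinationField": "customer_name",
            "taskType": "Map",
            "connectorOperator": {"Salesforce": "NO_OP"},
        },
        # Filtering: only keep records where status = active
        {
            "sourceFields": ["status"],
            "taskType": "Filter",
            "connectorOperator": {"Salesforce": "EQUAL_TO"},
            "taskProperties": {"DATA_TYPE": "string", "VALUE": "active"},
        },
        # Masking: replace the email value with a fixed-length mask
        {
            "sourceFields": ["email"],
            "destinationField": "email",
            "taskType": "Mask",
            "connectorOperator": {"Salesforce": "NO_OP"},
            "taskProperties": {"MASK_VALUE": "*", "MASK_LENGTH": "5"},
        },
    ],
)
```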
✅ Result: We now have clean, filtered Salesforce data in S3.
Step 2: Transforming the Data with AWS Glue Studio
With raw customer data now available in S3, the next step is transformation — preparing it for analytics. For this, we use AWS Glue Studio, a visual interface that allows you to create ETL jobs without coding.
Inside Glue Studio, we:
1. Create a new visual job using the "source and target" template.
2. Choose the S3 bucket where AppFlow saved our data.
3. Let Glue automatically infer the schema (customer_name, email, region, and so on).
Visual ETL Transformations (No Code!)
Now comes the transformation logic:
- DropFields Node: Remove sensitive fields like `ssn` or `phone_number` entirely.
- ApplyMapping Node: Rename and reformat fields as needed. For example, ensure `signup_date` is ISO-formatted.
- Partitioning Node: Organize data by `region` for faster querying in Athena.
No Python, no PySpark — just click-and-configure.
Finally, we define the target S3 path as the "processed zone" and select Parquet as the output format, a columnar layout optimized for analytical query performance.
We hit “Run,” and Glue handles the rest — spinning up Spark jobs, transforming the data, and storing the clean result in a new S3 folder.
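Under the hood, Glue Studio generates a PySpark script from the visual graph. For readers who want to see what the clicks translate to, here is a rough hand-written equivalent (a sketch only: the S3 paths and mappings follow our demo setup, and the `awsglue` imports resolve only inside a Glue job runtime):

```python
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.transforms import ApplyMapping, DropFields

glue_context = GlueContext(SparkContext.getOrCreate())

# Source node: read the raw AppFlow output from S3
raw = glue_context.create_dynamic_frame.from_options(
    connection_type="s3",
    connection_options={"paths": ["s3://nocode-etl-demo/raw-zone/"]},
    format="parquet",
)

# DropFields node: remove sensitive columns entirely
dropped = DropFields.apply(frame=raw, paths=["ssn", "phone_number"])

# ApplyMapping node: rename/re-type fields (signup_date string -> date)
mapped = ApplyMapping.apply(
    frame=dropped,
    mappings=[
        ("customer_name", "string", "customer_name", "string"),
        ("status", "string", "status", "string"),
        ("signup_date", "string", "signup_date", "date"),
        ("region", "string", "region", "string"),
    ],
)

# Target node: write Parquet to the processed zone, partitioned by region
glue_context.write_dynamic_frame.from_options(
    frame=mapped,
    connection_type="s3",
    connection_options={
        "path": "s3://nocode-etl-demo/processed-zone/",
        "partitionKeys": ["region"],
    },
    format="parquet",
)
```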
📊 Architecture Overview
Here’s what we’ve built — visually and completely without code:

Observations: Is This Really Production-Ready?
Surprisingly — yes, at least for a large class of use cases.
If your project involves simple extraction, filtering, renaming, and delivering to S3 or Redshift — this no-code setup is fast, secure, and repeatable.
You can schedule it, monitor it using CloudWatch, and even integrate notifications if anything fails.
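For the failure-notification piece, one lightweight pattern is an EventBridge rule that matches Glue job state changes. A minimal boto3 sketch, assuming an existing SNS topic (the rule name and topic ARN here are placeholders):

```python
import json

import boto3

events = boto3.client("events")

# Match any Glue job that transitions to the FAILED state
events.put_rule(
    Name="glue-job-failure-alert",
    EventPattern=json.dumps({
        "source": ["aws.glue"],
        "detail-type": ["Glue Job State Change"],
        "detail": {"state": ["FAILED"]},
    }),
)

# Route matched events to an SNS topic (placeholder ARN); the topic's
# resource policy must allow events.amazonaws.com to publish to it.
events.put_targets(
    Rule="glue-job-failure-alert",
    Targets=[{
        "Id": "etl-alerts-sns",
        "Arn": "arn:aws:sns:us-east-1:123456789012:etl-alerts",
    }],
)
```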
✅ Where It Shines:
- Daily batch pipelines
- SaaS-to-S3 integrations (Salesforce, Zendesk, Google Analytics, Slack, etc.)
- Light transformations (filtering, mapping, masking)
- Easy onboarding for non-engineers
⚠️ Where It Falls Short:
- Advanced joins across datasets
- Dynamic logic or custom business rules
- Real-time streaming or event processing
- Machine-learning-based transformations
For these complex cases, you’ll still need PySpark or code-based Glue jobs.
But for day-to-day marketing, sales, finance, or analytics pipelines, no-code ETL can save time, reduce errors, and make data engineering more collaborative.
Final Thoughts: No-Code ETL Is Not a Dream — It’s Already Here
With tools like Amazon AppFlow and AWS Glue Studio, data engineers can now build production-grade pipelines faster than ever — and in some cases, without writing a single line of code.
That doesn’t mean code is going away. But it means that not every data pipeline has to start with a script. You can focus your coding efforts where they matter most — and let visual tools handle the rest.
So next time you get a request like “Can we get cleaned customer data from Salesforce into S3 daily?” — consider answering it without opening your IDE.
Bonus: What Else Can You Try?
- Connect AppFlow + Slack to track live campaign feedback
- Use Glue Studio + Athena to power no-code dashboards (see the sketch below)
- Combine with Redshift for near-real-time reporting
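On the Athena idea: once the processed Parquet output is registered as a table (for example, via a Glue crawler), dashboards and ad hoc queries can hit it directly. A hedged sketch with boto3; the database, table, and results-bucket names are hypothetical:

```python
import boto3

athena = boto3.client("athena")

# Count active customers per region in the processed zone
athena.start_query_execution(
    QueryString="""
        SELECT region, COUNT(*) AS active_customers
        FROM customers_processed
        WHERE status = 'active'
        GROUP BY region
    """,
    QueryExecutionContext={"Database": "nocode_etl"},
    ResultConfiguration={"OutputLocation": "s3://nocode-etl-demo/athena-results/"},
)
```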