
Building a Batch Data Pipeline with Athena and MySQL | by 💡Mike Shakhomirov | Oct, 2023

In this story I will speak about one of the most popular ways to run data transformation tasks: batch data processing. This data pipeline design pattern becomes incredibly useful when we need to process data in chunks, which makes it very efficient for ETL jobs that require scheduling. I will demonstrate how it can be achieved by building a data transformation pipeline using MySQL and Athena. We will use infrastructure as code to deploy it in the cloud.

Imagine that you have just joined a company as a Data Engineer. Their data stack is modern, event-driven, cost-effective, flexible, and can scale easily as your data resources grow. External data sources and data pipelines in your data platform are managed by the data engineering team using a flexible environment setup with CI/CD GitHub integration.

As a data engineer you need to create a business intelligence dashboard that displays the geography of company revenue streams. Raw payment data is stored in the server database (MySQL). You want to build a batch pipeline that extracts data from that database daily, then uses AWS S3 to store data files and Athena to process them.

A data pipeline can be considered a sequence of data processing steps. Due to logical data flow connections between these stages, each stage generates an output that serves as an input for the following stage.

There is a data pipeline whenever there is data processing between points A and B.

Data pipelines may differ in their conceptual and logical nature. I previously wrote about it here [1].
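To make the idea of chained stages working on chunks more concrete, here is a minimal sketch in plain Python. It is not the pipeline built in this story: every function name and the sample rows are hypothetical, the in-memory list stands in for the daily MySQL extract, and the returned dictionary stands in for the result that would be stored in S3 and queried with Athena.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List

Row = Dict[str, object]


def chunked(rows: Iterable[Row], size: int) -> Iterator[List[Row]]:
    """Yield successive lists of at most `size` rows."""
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def extract() -> List[Row]:
    # Hypothetical stand-in for reading raw payment rows from MySQL.
    return [
        {"country": "US", "amount": 10.0},
        {"country": "GB", "amount": 5.0},
        {"country": "US", "amount": 2.5},
    ]


def transform(chunk: List[Row]) -> Dict[str, float]:
    # Aggregate revenue per country within a single chunk.
    totals: Dict[str, float] = {}
    for row in chunk:
        country = str(row["country"])
        totals[country] = totals.get(country, 0.0) + float(row["amount"])
    return totals


def load(partials: Iterable[Dict[str, float]]) -> Dict[str, float]:
    # Merge per-chunk totals into one result (assumed stand-in for the
    # files written to S3 and processed with Athena).
    merged: Dict[str, float] = {}
    for partial in partials:
        for country, amount in partial.items():
            merged[country] = merged.get(country, 0.0) + amount
    return merged


def run_pipeline(chunk_size: int = 2) -> Dict[str, float]:
    # Each stage's output is the next stage's input.
    return load(transform(chunk) for chunk in chunked(extract(), chunk_size))


if __name__ == "__main__":
    print(run_pipeline())
```

Because each chunk is aggregated on its own and the partial results are merged afterwards, the chunk size affects only how much data is held in memory at once, not the final totals.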



This post first appeared on VedVyas Articles; please read the original post: here
