Certification Exam Resources
I’ve put together a collection of resources I’ve used while preparing for various certification exams – covering everything from AWS to Databricks. The focus is mainly on data engineering and machine learning topics.
In this blog post, I’m sharing my experience taking the Databricks Generative AI Associate exam – from study notes to resources that made a difference. Whether you’re just starting your prep or looking for extra insights, this guide will help you find the right resources to get prepared.
Exploring ways to handle irregular, sudden bursts of files for data processing using an event-driven architecture on AWS. This blog post shows how to use S3 notifications with EventBridge to trigger a Glue Workflow whose trigger conditions are a number of events and a batch window.
I’m thrilled to present pytransflow, a Python library I developed in my free time. pytransflow simplifies record-level processing through transformation flows defined in YAML files. I hope you find this library engaging and that it sparks your interest to both use and contribute to it.
Exploring the implementation of Lua scripting for dynamically altering API requests in an Nginx Reverse Proxy. This investigation opens up possibilities to write and run dynamic content using Lua scripts directly within the Nginx server, making it a powerful tool for web applications.
Sharing my AWS Data Engineer Associate exam experience, along with resources, tips and tricks, and study notes. I hope this post will give you some insights and help you find resources to prepare for the exam.
A brief guide outlining the process of setting up and running S3 Batch Operations Jobs with Lambda integration.
Here, I present a compilation of notes and practical scenarios from my own experience that demonstrate effective use of pytest fixtures. These examples show how fixtures can refine and improve the structure of your test modules.
Exploring different strategies for fine-tuning the output file size in AWS Glue and consolidating small files during post-processing. By implementing these techniques, you’ll not only enhance the efficiency of Athena queries but also significantly reduce the cost associated with querying large da...
Exploring the functionality of AWS CloudWatch alarms, understanding their operation, configuration, and practical application within CDK applications. Learn to define and customize alarms, including adjusting periods, evaluation ranges, and handling missing data, to ensure robust monitoring and e...
The objective of this project is to develop a system enabling scientists to automate numerical calculations on remote clusters and build an internal database of calculation outcomes. It also involves training machine learning models on these calculations and seamlessly integrating them for numeri...
Discover the fundamentals of JWT authentication and its advantages within distributed systems and microservice architectures. Explore the integration of authentication into the FARM stack, consisting of FastAPI, React, and MongoDB, utilizing JSON Web Tokens (JWT).
Discover a comprehensive guide on configuring your local machine for Python projects. This guide provides an overview of the most commonly used tools throughout the development process.
Gain insight into libraries and compile OpenMPI, BLAS, LAPACK, ScaLAPACK, NetCDF, Flook, SIESTA, and other utilities from source. Understand the process of building these libraries to customize your environment effectively.
Sharing my experience, resources, tricks, and study notes from my AWS Data Analytics Specialty exam. I hope this post will give you some insights and help you find resources to prepare for the exam.
Exploring the PGAS paradigm and experimenting with coarrays in Fortran. Learning about the principles behind PGAS, Fortran coarrays, and their applications in parallel programming.
Delve into the fundamentals of systemd, covering dependencies, unit files, and service configuration. Explore the process of configuring custom applications as systemd services. Learn how to efficiently manage and run applications using systemd within your system.
Set up GitLab CI/CD locally for easier experimentation and testing. Investigate methods for creating nested parent-child pipelines and explore the process and advantages of implementing this approach. Learn how to streamline your development workflow with nested pipelines for better organization ...
Discover techniques for deploying custom models within Docker images using SageMaker and serverless inference. Explore the functionalities and benefits of each approach. Learn how to efficiently deploy your models for scalable and efficient inference.
Dive into the structure of popular Big Data file formats like Parquet, Avro, and ORC. Understand their unique features and advantages. Learn how these formats optimize data storage and processing.
Create a simple application that uses DynamoDB Streams, Lambda, and S3. Set it up locally for easy development, testing, and experimentation. This setup demonstrates how these AWS services can work together.
Understand the basics of seq2seq architecture and artificial neural networks (ANNs). Learn about multilingual models and their applications. Discover how to use these technologies to achieve scalable multilingual semantic search.
Discover how to handle and process DICOM files. Explore popular free and open-source libraries that can help you develop applications for efficient DICOM processing. These tools and libraries make managing medical images much easier and more straightforward.
Explore the fundamentals of the DICOM file format! This quick introduction covers the basics of DICOM’s structure, its essential uses, and tips for easily navigating its complex and abstract components.