Words Can Change the World!
Driving Insurance Industry Innovation with a Data Lake on AWS with Databricks Delta Lake
Architected a data lake to store over 600 GB of historical and incremental data, implementing change data capture (CDC) from relational databases and organizing the data into three layers (Raw, Staging, and Curated) to ensure robust governance and observability.
7 min read
Building an ETL pipeline with Airflow and ECS
Running an ETL pipeline on ECS with Airflow is a great way to scale your data processing. In this article, we will go through the steps to set up an ETL pipeline using Airflow and ECS.
6 min read
Evaluating Large Language Models on Code Generation
This study compares the Python code generation performance of three models (CodeT5, CodeGen, and GPT-3.5) on the Mostly Basic Python Problems (MBPP) dataset. The pass@k metric was the primary method of evaluation. CodeT5 and CodeGen were evaluated in a few-shot setting, while GPT-3.5 was evaluated in both zero-shot and few-shot settings.
20 min read
Discord notification using CloudWatch Alarms, SNS and AWS Lambda
4 min read
All Articles
Unveiling the Power of Multi-Layer Feed-Forward Networks in Text Classification
Nov 17, 2023
Exploring the Efficacy of Simple Models and Feature Engineering Techniques in Text Classification
Oct 10, 2023
Going Bastion-less: Accessing Private EC2 instance with Session Manager
Nov 2, 2020
Automating Lambda modules deployment with GitLab CI
Aug 9, 2020
Building an ApiGateway-SQS-Lambda integration using Terraform
Jun 22, 2020
Text Classifier with Multiple Outputs and Multiple Losses in Keras
May 9, 2020