Pinned · Rajesh in AWS Tip
Process small json blob files using firehose and lambda
We have blob files generated from different devices arriving in S3 at a certain time of day. Each of these files ranges from…
3 min read · Aug 26, 2022

Rajesh in AWS Tip
Process small json blob files using firehose and lambda using batch setup-Part-2
In our previous blog post, we explored the topic of utilizing Lambda Firehose for handling small blob files. If you haven’t had a chance to…
5 min read · Sep 18, 2023

Rajesh in AWS Tip
Connect to aws document-db cluster from mongodb-compass
We know AWS DocumentDB runs inside a private VPC and does not support a public endpoint, which means we can’t connect directly to an…
3 min read · Dec 4, 2022

Rajesh
Merging small parquet files in aws lambda
I was working on a use case where we needed to capture logs from a data science model, so we were getting many small files from Kinesis…
2 min read · Oct 20, 2021

Rajesh
Dockerize Spark Jobs with Databricks Container Services
Many times, after changing code in Spark jobs, a developer who wants to test the changes inside a Databricks cluster needs to follow quite a number…
2 min read · Jul 16, 2021

Rajesh
Implementation of cdc in spark using delta file
Many times when I work with Kafka I feel tempted to use it to store the data, but it should never be used as a datastore; instead we should…
3 min read · May 24, 2021

Rajesh
Implement SCD Type 2 via Spark Data Frames
While working on data pipeline projects, a programmer most of the time deals with slowly changing dimension data.
3 min read · May 7, 2021