Unfolding the universe of possibilities.

Dancing with the stars of binary realms.

Large Language Models: TinyBERT — Distilling BERT for NLP

Unlocking the power of Transformer distillation in LLMs. In recent years, the evolution of large language models has skyrocketed. BERT became one of the most popular and efficient models allowing to solve a wide

Collecting Data with Apache Airflow on a Raspberry Pi

A Raspberry Pi is All You Need Continue reading on Towards Data Science »

Detection of Multicollinearity in Data sets using Statistical Testing.

Detecting multicollinearity in data sets is an important but challenging step.

PyrOSM: working with Open Street Map data

Efficient geospatial manipulations for OSM map data. If you’ve worked with OSM data before, you know it’s not the easiest to extract. OSM data can be huge, and finding performant solutions for what you want to

Python for Data Engineers

Advanced ETL techniques for beginners

Nine Rules to Formally Validate Rust Algorithms with Dafny (Part 2)

Lessons from Verifying the range-set-blaze Crate. By Carl M. Kadie and Divyanshu Ranjan. This is Part 2 of an article formally verifying a Rust algorithm using Dafny. We look at rules 7 to 9: 7. Port your Real Algorithm to Dafny. 8. Validate

CountVectorizer to Extract Features from Texts in Python, in Detail

Everything you need to know to use CountVectorizer efficiently in Sklearn

5 Ideas to Foster Data Scientists/Analysts Engagement Without Suffocating in Meetings

The author shares strategies they have implemented to strike this balance successfully.

Understanding Retention with Gradio

How to leverage web applications for analytics. I remember the moment when I built my first web application. It was around eight years ago, and I was a rather junior analyst and was convinced that BI tools

The Untold Side of RAG: Addressing Its Challenges in Domain-Specific Searches

Using hybrid search, hierarchical ranking, and instructor embedding to address similar domain-specific documents in our RAG setup