Are you a Data Engineer? A Data Scientist? Want to build a career in Data Engineering? Love Open Source?
Great!!
But you can still read this even if you don't fall under the above categories.
I am eight years into the information technology industry, but where did I start?
I have worked with data since the beginning of my career, when I started as an intern at a CTRM startup. I quite liked the job of writing Python scripts to scrape data and automate workflows. Things started getting tough when I entered the world of ETL tools (there were just too many vendors, and that hasn't changed much to date).
The last couple of years have been tougher, with "Cloud" and "Open Source" playing a vital role in architecture decisions. Data now has to go through various phases: ingestion, profiling, quality, validation, transformation, warehouse, data lake, visualization, monitoring, analysis, observability, and so on. The data tools themselves have been evolving to cater to these different phases.
The list of open source data tools is growing ✈️ each day. Many popular open source data tools are used by or known to most of us. But there are also lesser-known, function-specific yet powerful tools that become the go-to choice for a specific problem in the field of data. Personally, I struggled to settle on a tool of choice, because I had to juggle various sources to understand each one and form a view of many tools before I could make a decision.
So I am bringing myself back to the start and presenting this blog, "Moolaa", which translates to "Source" in my local language, Kannada. Here I will introduce you to some cool, useful, and innovative open source data projects and the use cases where they fit best.
I plan to introduce at least two new projects a month, along with a comparison of tools for a specific functionality.
Note: I will not include a product until I have used it myself.
But I am open to suggestions and happy to try any new and useful products you have come across.
Here I take a pause...
Call it a preface to the many learnings ahead. With the best inputs and practices picked up from here, let's step towards a deeper knowledge of Data Engineering.
LEARN, SHARE AND GROW