Blog

Become a Story Telling Ninja: Present Data Science Models to Stakeholders

A Data Science Ninja

Over the last few years, I have presented many projects involving Data Science models (Natural Language Processing, to be precise) to various stakeholders and leadership teams. While it’s important to convey the technical details, the results, and all the hard work you have put into building the Data Science models, visualizations, etc., what matters more is how you convey them! In this blog, we will talk about the art of storytelling!

Continue reading “Become a Story Telling Ninja: Present Data Science Models to Stakeholders”

Machine Learning Roadmap: An Effective Guidebook for Learning Machine Learning

Some time back, I wrote about why I started mentoring and the fields in which I am interested in mentoring. Not long ago, I broadened that scope to include Machine Learning. Hence, I have prepared a Machine Learning Roadmap that is quite hands-on, nudges you to pursue your curiosity, and is meant to be less boring and more intriguing, which should reduce the chance of dropping off compared with the video lecture courses out there. The roadmap is designed for you to learn Machine Learning step by step. It equips you to learn and gain hands-on experience in areas like data cleaning, data visualization, exploratory data analysis, data modeling, etc.

In the blog, I will walk you through the Machine Learning Roadmap, show you how to download it, and encourage you to write or talk to me if you need help. In the end, you can give me feedback on how the Machine Learning Roadmap helped you and how it can be further improved!

Continue reading “Machine Learning Roadmap: An Effective Guidebook for Learning Machine Learning”

Evaluating Effectiveness of Mentorship: Open Sourcing My Framework

A while ago, I wrote about opening up my calendar for mentorship. Soon, quite a few people talked to me about career switches, interviews for university admissions, life and career opportunities in Singapore, etc. I eventually asked all of them to score my mentoring. While the conversations are definitely very subjective, I decided that the scoring could be objective. I have decided to open source the evaluation criteria that I use for gauging how effective I am as a mentor. I will also publish the scores that I receive and be transparent about where I am lagging and where I am doing well.

Disclaimer: Even though I realize that feedback is very important, it is a voluntary exercise for the mentees, and I do not nudge anyone repeatedly to fill in my evaluation. Thus, I will keep updating the chart as and when I get more responses.

You will find the evaluation criteria structured in the following way:

  1. Whether I answered all the questions satisfactorily, to-the-point, and in a structured way?
  2. Questions around:
    • Ease of approach
    • Content Expertise
    • Clear and comprehensive speaking
    • Any bias or prejudice while mentoring
    • Professional integrity
  3. Did I refer to any third-party material or a subject-matter expert?
  4. Whether I am worthy of being referred?
  5. Overall score and my ROTI (KPI)
  6. Subjective Feedback
Continue reading “Evaluating Effectiveness of Mentorship: Open Sourcing My Framework”

What’s This About?

I have opened up my calendar for anyone to book mentoring hours for career, university applications and other guidance. Mentoring is something that I had been doing informally, accepting requests at meetups and via LinkedIn. I am proud to say that I have so far guided more than 15 people with their dilemmas over SOP, University Selection, Career Progression, Life & Career opportunities in Singapore, and a few other topics. In general, I am open to discussing and learning about Data Science, Product Management, and Startups. If there’s something I can’t help you with, I will most definitely connect you with someone who can help and has the required know-how. So, feel free to book an appointment with Shubhanshu Gupta using SetMore.

Why Am I Doing This?

It’s an inexplicable feeling to have someone who can guide you: someone who has been through the exact same phase you are going through, who has subject-matter expertise, who can provide a different perspective, or who has connections in the area you are looking to break into.

Believe me when I say that I have had the good fortune of having that ‘someone’ (from here on, the ‘someone’ will be addressed as ‘mentor’) in various phases of my life. Thus, this is my way of giving back to the community which blessed me with wonderful people who mentored me.

Some General Notes

If we have never met or talked before, please send me a brief introductory email or a note over LinkedIn. My contact details are on the Contact page.

I live in Singapore, and the timings in the booking calendar are in Singapore time. So, please take note of that and adjust for your own time zone when booking meeting hours. Having said that, I have absolutely no problem considering requests for a change in timings, as per mutual convenience.

Wish to Mentor?

I have recently open sourced the criteria that I use to evaluate the effectiveness of my mentoring abilities. If you feel that you too would like to provide mentorship, please reach out to me. Nothing would be better than diversifying the portfolio of mentors on this portal.

In case you have any feedback for me, please feel free to reach out.

Become a Web Analytics Ninja: Analyze Bounce Rate Across Different Visitor Segments

Web Analytics

This post is a digression from the other data science blogs that I have written in the past and, more so, from the work that I do in my day-to-day job. Well, I don’t mean digression in a negative sense. I enjoyed it and learnt so much that I implemented many of the strategies on my own website. In this post, I will discuss how I did a deep dive on a one-line problem statement from my client: “The bounce rate has gone up since last few months from what it was before, Why?” That may seem trivial to investigate and analyze, but the lack of detail and granularity made the problem statement very broad and open-ended. A lack of clarity also makes it pretty easy to hit a roadblock very early in the process, especially when you don’t know where to start. फ़िक्र न करें (Fear not)! You will see a structured way to approach this kind of problem statement.
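To make "bounce rate across visitor segments" concrete, here is a minimal sketch in Pandas. It assumes the common definition of a bounce as a session with a single page view; the `sessions` table, its column names, and its values are hypothetical and are not taken from the client data discussed in the post.

```python
import pandas as pd

# Hypothetical session log: column names and values are illustrative only.
sessions = pd.DataFrame({
    "segment": ["organic", "organic", "organic", "paid", "paid", "referral"],
    "pages_viewed": [1, 3, 1, 1, 5, 2],
})

# Assumption: a session "bounces" when the visitor views exactly one page.
sessions["bounced"] = sessions["pages_viewed"] == 1

# Bounce rate per segment = bounced sessions / all sessions in that segment.
# The mean of a boolean column is the share of True values.
bounce_rate = sessions.groupby("segment")["bounced"].mean()

print(bounce_rate)
```

Splitting the single headline number by segment like this is often the first step in finding out which group of visitors is driving a change.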

Continue reading “Become a Web Analytics Ninja: Analyze Bounce Rate Across Different Visitor Segments”

Speed Up Pandas Dataframe Apply Function to Create a New Column

Pandas Library

Data cleaning is an essential step in preparing your data for analysis. While cleaning the data, every now and then, there’s a need to create a new column in the Pandas dataframe. It’s usually computed by a function which manipulates an existing column. A common way to achieve that is by using the apply() function. I want to address a couple of bottlenecks here:

  • Pandas: The Pandas library runs on a single thread and doesn’t parallelize the task. Thus, if you are doing lots of computation or data manipulation on your Pandas dataframe, it can be pretty slow and can quickly become a bottleneck.
  • Apply(): The Pandas apply() function is slow! It does not take advantage of vectorization and acts as just another loop. It returns a new Series or dataframe object, which carries significant overhead.

So now, you may ask, what to do and what to use? I am going to share 4 techniques that are alternatives to the apply() function and that improve the performance of operations on a Pandas dataframe.
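As an illustration of the kind of alternative meant here, the sketch below compares apply() with one vectorized approach, `numpy.where`. This is not necessarily one of the 4 techniques covered in the full post; the dataframe, the column names, and the threshold of 100 are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the column name and threshold are illustrative only.
df = pd.DataFrame({"price": [120.0, 80.0, 45.5, 300.0]})


def label_price(price):
    """Element-wise rule: the kind of function usually passed to apply()."""
    return "high" if price >= 100 else "low"


# Baseline: apply() calls the Python function once per element.
df["label_apply"] = df["price"].apply(label_price)

# Vectorized alternative: the condition is evaluated on the whole column at once.
df["label_vectorized"] = np.where(df["price"] >= 100, "high", "low")

# Both columns should hold the same labels.
same = bool((df["label_apply"] == df["label_vectorized"]).all())
print(df)
print("identical:", same)
```

On a four-row frame the difference is invisible; the gap tends to show up once the column has many thousands of rows, so it is worth timing both versions on your own data.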

Continue reading “Speed Up Pandas Dataframe Apply Function to Create a New Column”

Collocations in NLP using NLTK Library

Collocation in NLTK

Collocations are phrases or expressions containing multiple words that are highly likely to co-occur. For example – ‘social media’, ‘school holiday’, ‘machine learning’, ‘Universal Studios Singapore’, etc.
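To show what "highly likely to co-occur" can mean in practice, here is a small sketch that scores word pairs with pointwise mutual information (PMI). NLTK provides collocation finders for this (for example `BigramCollocationFinder` in `nltk.collocations`); the version below deliberately uses only the standard library so the scoring is visible, and it is not the code from the full post. The toy sentence, the frequency cut-off of 2, and the helper name `pmi` are assumptions made for illustration.

```python
import math
from collections import Counter

# Toy corpus, already tokenized by whitespace (illustrative only).
tokens = (
    "i love machine learning and social media . "
    "machine learning drives social media . "
    "i love the school holiday"
).split()

unigrams = Counter(tokens)
bigrams = Counter(zip(tokens, tokens[1:]))
n_tokens = len(tokens)
n_bigrams = n_tokens - 1


def pmi(pair):
    """Pointwise mutual information: log2( P(w1, w2) / (P(w1) * P(w2)) )."""
    w1, w2 = pair
    p_pair = bigrams[pair] / n_bigrams
    p_w1 = unigrams[w1] / n_tokens
    p_w2 = unigrams[w2] / n_tokens
    return math.log2(p_pair / (p_w1 * p_w2))


# Assumption: ignore pairs seen fewer than 2 times, since PMI is unreliable for rare pairs.
candidates = [pair for pair, count in bigrams.items() if count >= 2]
ranked = sorted(candidates, key=pmi, reverse=True)

for pair in ranked:
    print(pair, round(pmi(pair), 2))
```

Pairs such as ('machine', 'learning') and ('social', 'media') survive the cut-off, while one-off neighbours like ('learning', 'and') are dropped.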

Continue reading “Collocations in NLP using NLTK Library”

Salads

For quite some time now, I have had a new love for Salads. No, not the mundane ones like the one below:

Cucumber Tomato Salad

These are quite boring, and make you feel like you are punishing yourself with a so-called healthy diet. In fact, I never felt full eating just these, and thus I mostly ate Indian vegetarian food at the office. I craved vegetarian/vegan food from other cuisines, but most of my experiments led to horrible (I got chicken instead of tofu at a famous Thai place) or tasteless disasters. However, restaurants in the Central Business District (popularly known as CBD) area in Singapore gave me a whole new penchant for salads. The best part about salads: you get Asian, European, Mexican, and probably 100 more varieties. Super filling, super tasty, and you can get them vegetarian or vegan, as you like! I am going to share photos of some of the best salads I have had, and an easy recipe for you to try at home.

Continue reading “Salads”

Saturday Kids: Code in the Community Experience

Kids of 8-10 years of age are incredibly smart and ride high on the curve of curiosity and learning. Thus, it’s equally challenging to teach such kids. Did I just write challenging? Did I not mention that I feel a strange pull for anything challenging? Jokes apart, in June I came across an opportunity to teach Python/Scratch to kids in Singapore: a 10-week Code in the Community program run by Saturday Kids in collaboration with Google. This post is an account of my experience and learnings throughout these 10 weeks with Saturday Kids.

Continue reading “Saturday Kids: Code in the Community Experience”

Time Series Analysis using Pandas

A time series is a series of data points ordered in time. Pretty intuitive, isn’t it? Time series analysis helps businesses analyze past data, predict trends, study seasonality, and serve numerous other use cases. Some examples of time series analysis in our day-to-day lives include:

  • Measuring weather
  • Measuring number of taxi rides
  • Stock prediction

In this blog, we will be dealing with stock market data and will be using Python 3, Pandas and Matplotlib.
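As a taste of what such an analysis looks like, here is a minimal sketch of three common Pandas time series operations: daily returns, a rolling mean, and weekly resampling. The prices are made up and stand in for real stock data; the variable names and the 3-day window are assumptions for illustration, not the post's actual dataset or settings.

```python
import pandas as pd

# Synthetic closing prices on 10 consecutive business days (values are made up).
dates = pd.date_range("2023-01-02", periods=10, freq="B")
prices = pd.Series(
    [100, 102, 101, 105, 107, 106, 110, 108, 111, 115],
    index=dates,
    name="close",
    dtype="float64",
)

# Day-over-day percentage change; the first value has no predecessor and is NaN.
daily_return = prices.pct_change()

# 3-day rolling mean smooths short-term noise; the first two values are NaN.
rolling_mean = prices.rolling(window=3).mean()

# Downsample to weekly frequency, keeping the last close of each week.
weekly_last = prices.resample("W").last()

print(rolling_mean)
print(weekly_last)
```

With Matplotlib installed, calling `prices.plot()` and `rolling_mean.plot()` on the same axes is a quick way to see the smoothing effect.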

Continue reading “Time Series Analysis using Pandas”