To answer a question, you need knowledge about its subject. Suppose I ask you, “Which Indian sweet does Pranav Gurung like?” You will first ask, “Who is Pranav Gurung?” He is a student at SSSIHL. OK, fine. So how will you find out which sweet he likes? You will perhaps ask him. But what if you don’t have that luxury, because he is busy working on some coffee experiment in his bioscience lab?

Then you will have to settle for the second-best option. Suppose I tell you that there was a feast in the hostel (SSSIHL) where many different kinds of sweets were distributed, and each person was asked his name and the sweet he wanted to eat. Fantastic! Now we can just look up Pranav’s name in the list and find out which sweet he likes!

But the problem is that only half the hostel came to the feast, as it was right after the sports meet and everyone was tired, including Pranav, who spends most of his life asleep.

So we don’t have a record that says “Pranav Gurung chose […] sweet”. But we do have records for the other people who did choose a sweet. If we can find the similarities among the people who like a certain kind of sweet, and see which of those groups Pranav is most likely to fall into, we may be able to answer the question with some level of confidence.

But 300 people attended the feast, so doing this manually will definitely not cut it. In such a case, we can perhaps just use a classification technique.

So, did you notice? I used the term “classification technique” only in the apparently final statement of this silly story. Before that, it was all just a cock-and-bull story about who is who, what is what, and so on. But the moment we know exactly what we have and what we wish to achieve, experience spins out a solution from our repository of data science skills.
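To make the idea a little more concrete, here is a minimal sketch of one possible classification technique, k-nearest neighbours, applied to the feast. Everything in it is an assumption made for illustration: the story never says which attributes of each student we know, so the two features (years spent in the hostel, spoons of sugar taken in tea), the handful of records, the sweets named, and Pranav's own values are all invented. The "confidence" it reports is simply the share of the nearest neighbours who voted for the winning sweet.

```python
import math
from collections import Counter

# Hypothetical feast records: (features, sweet chosen).
# Features are (years_in_hostel, sugar_spoons_in_tea); both are invented
# for this sketch and do not come from the story.
feast_records = [
    ((1.0, 3.0), "gulab jamun"),
    ((1.5, 2.5), "gulab jamun"),
    ((2.0, 3.0), "gulab jamun"),
    ((4.0, 0.5), "kaju katli"),
    ((4.5, 1.0), "kaju katli"),
    ((5.0, 0.0), "kaju katli"),
]


def predict_sweet(records, query, k=3):
    """Guess the sweet for `query` by majority vote among its k most
    similar attendees, where similarity is Euclidean distance."""
    nearest = sorted(records, key=lambda rec: math.dist(rec[0], query))[:k]
    votes = Counter(sweet for _, sweet in nearest)
    sweet, count = votes.most_common(1)[0]
    # Vote share among the k neighbours, used as a rough confidence.
    return sweet, count / k


# Pranav did not attend, so we only know his (hypothetical) features.
pranav = (1.2, 2.8)
sweet, confidence = predict_sweet(feast_records, pranav)
print(f"Pranav probably likes {sweet} (vote share {confidence:.2f})")
```

With 300 real attendees the shape of the solution stays the same; only the records and the choice of features change, and choosing features that actually relate to taste is the hard part.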