So you are told to analyse some dataset. The first two questions you must ask are: where is the data, and what is the data about?
Let us get into the data that we are concerned about in this course.
Movies and shows: the cast, the crew, the ratings, the production date, the works. IMDb holds almost everything about every movie in the world, and it makes a subset of that data available for personal and non-commercial use.
The data lives here: datasets.imdbws.com. In this case the dataset location is an HTTP URL that links to several dataset files, such as datasets.imdbws.com/name.basics.tsv.gz
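To get a first feel for one of these files, a minimal Python sketch could look like the one below. It assumes the files are served over HTTPS and are gzip-compressed, tab-separated text with a header row, as the .tsv.gz extension suggests. The names `dataset_url`, `read_tsv_gz` and `peek_remote` are made up for this illustration and are not part of any IMDb tooling.

```python
import csv
import gzip
import urllib.request

# Assumption: the dataset host is reachable over HTTPS.
BASE_URL = "https://datasets.imdbws.com/"


def dataset_url(filename):
    """Build the full URL of one dataset file, e.g. name.basics.tsv.gz."""
    return BASE_URL + filename


def read_tsv_gz(stream, limit=5):
    """Read the header and the first `limit` rows of a gzipped TSV.

    `stream` is any binary file-like object (an open file, an HTTP
    response, an in-memory buffer).
    """
    with gzip.open(stream, mode="rt", encoding="utf-8", newline="") as text:
        # QUOTE_NONE: treat quote characters as ordinary data.
        reader = csv.reader(text, delimiter="\t", quoting=csv.QUOTE_NONE)
        header = next(reader)
        rows = []
        for row in reader:
            if len(rows) >= limit:
                break
            rows.append(row)
    return header, rows


def peek_remote(filename, limit=5):
    """Stream a dataset file from the site and return its first rows."""
    with urllib.request.urlopen(dataset_url(filename)) as response:
        return read_tsv_gz(response, limit)
```

Calling `peek_remote("name.basics.tsv.gz")` would return the column names and the first few rows. Because the file is decompressed as it streams, the sketch never has to hold the whole file in memory.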
We now know where the data is and have a high-level idea of what it is about. At this stage, however, we need a clear understanding of each element in the dataset, which means reading the documentation. IMDb provides good documentation for the dataset above at www.imdb.com/interfaces/