LOADING

Wait For a Moment...

CS50 AI Project0

These are the project tasks affiliated with Harvard CS50. Check it out.

Degrees

Task description: Write a program that determines how many 
“degrees of separation” apart two actors are.

# for example, when gets these inputs below, results would be like that
$ python degrees.py large
Loading data...
Data loaded.
Name: Emma Watson
Name: Jennifer Lawrence
3 degrees of separation.
1: Emma Watson and Brendan Gleeson starred in Harry Potter and the Order of the Phoenix
2: Brendan Gleeson and Michael Fassbender starred in Trespass Against Us
3: Michael Fassbender and Jennifer Lawrence starred in X-Men: First Class

Background

This problem is originated from Six Degrees of Kevin Bacon game, saying that anyone in Hollywood film industry can be connected to Kevin Bacon within 6 steps, where each step consists of finding a film that two actors both starred in.

Actually, there maybe exist multiple ways to connect two actors, but in this problem, it just considers the shortest path. In detail, its purpose is to choose a sequence of movies with least length to connect two stars. For example, if the shortest path between Jennifer Lawrence and Tom Hanks is 2: Jennifer Lawrence is connected to Kevin Bacon by both starring in “X-Men: First Class”, and Kevin Bacon is connected to Tom Hanks by both starring in “Apollo 13”.

What is the State and the Action?

Here, each person can be regarded as a state, and a movie can be regarded as action. In other words, a movie could connect one person to another one. And the problem is to find the shortest path, so breadth-first search is the best option.

Understanding

Directory: Large.people.csv Each person with a unique id has his name and birth. stars.csv Each row is a pair of a person_id value and movie_id value, indicating person with person_id stars in the movie with movie_id. movies.csv It contains movie_id, titlt and the year that the movies was released.

Directory: Small. The same as Large, but with small database.

degrees.py

Variables:

  • names(dict) | name -> person_ids {maybe people have the same name}
  • people(dict) | person_id -> name, birth, movies(set)
  • movies(dict) | movie_id -> title, year, stars(set)

Functions:

  • load_data: loads data from people.csv, stars.csv and movies.csv

  • main: print the result of one of the shortest path (if it has), or return None when there exists no shortest path.

  • shortest_path: returns the shortest list of (movies, person_id) pairs

  • person_id_for_name: returns person_id of the person’s name

  • neighbor_for_person: returns (movies, person_id) pairs that relates to the given person

Use BFS to find the shortest path

There are two points that we need to pay attention to. First, we cannot visit the node that we have visited, this is because maybe it could cause circle making this loop never end and it absolutely is not the shortest path when we revisit one node. Second, when we backtrack, the order of the result is the opposite, just reverse it.

def shortest_path(source, target):
    """
    Returns the shortest list of (movie_id, person_id) pairs
    that connect the source to the target.

    If no possible path, returns None.
    """
    Q = QueueFrontier()
    root = Node(state=source, parent=None, action=None)
    target_leave = Node(state=None, parent=None, action=None)
    Q.add(root)

    _find = False
    path = list()
    unordered_map = dict()
    while (not Q.empty()) and (not _find):
        top_Node = Q.remove()
        if top_Node.state not in unordered_map:
            unordered_map[top_Node.state] = 1
        # find the top node's neighbors
        neighbors = neighbors_for_person(top_Node.state)
        for neighbor in neighbors:
            movie_id, person_id = neighbor
            if person_id not in unordered_map:
                child = Node(state=person_id, parent=top_Node, action=movie_id)
                if (child.state == target):
                    target_leave = child
                    _find = True
                    break
                Q.add(child)
    if target_leave.state == None:
        return None
    else:
        while target_leave.parent != None:
            # append ("movie_title", "person_name")
            path.append((target_leave.action, target_leave.state))
            target_leave = target_leave.parent
    path.reverse()
    return path