#!/usr/bin/env python
# coding: utf-8

# # Overview
# 
# This week is about getting familiar with networks, and we'll focus on four main elements
# 
# * Basic mathematical description of networks
# * The `NetworkX` library 
# * Matplotlib, binning, and plotting degree distributions
# * Random networks

# # Part 1: Basic mathematical description of networks
# 
# This week, let's start with some lecturing. I love networks, so I'll take some time time today to tell you about them.
# 
# > **_Video Lecture_**. Start by watching the "History of Networks". 
# 

# In[4]:


from IPython.display import YouTubeVideo
YouTubeVideo("qjM9yMarl70",width=800, height=450)


# > **_Video Lecture_**. Then check out a few comments on "Network Notation". 

# In[5]:


YouTubeVideo("MMziC5xktHs",width=800, height=450)


# > _Reading_. We'll be reading the textbook _Network Science_ (NS) by Laszlo Barabasi. You can download the whole 
# > thing for free [**here**](http://barabasi.com/networksciencebook/). You can also [download all chapters as a single file](https://www.dropbox.com/s/j1jy86lznz6sz1e/book.pdf?dl=0)
# > 
# > * Read chapter 1\.
# > * Read chapter 2\.
# > 

# > _Exercises_  
# > _Chapter 1_ (Don't forget that you should be answering these in an IPython notebook.) 
# > 
# > * List three different real networks and state the nodes and links for each of them.
# > * Tell us of the network you are personally most interested in. Address the following questions:
# >   * What are its nodes and links? 
# >   * How large is it? 
# >   * Can be mapped out? 
# >   * Why do you care about it? 
# > * In your view what would be the area where network science could have the biggest impact in the next decade? Explain your answer - and base it on the text in the book. 
# > 
# > _Chapter 2_
# > 
# > * Section 2.5 states that real networks are sparse. Can you think of a real network where each node has _many_ connections? Is that network still sparse? If yes, can you explain why?
# > 
# > There are more questions on Chapter 2 below.
# > 

# # Part 2: The awesome `NetworkX` library
# 
# `NetworkX` should already be installed as part of your _Anaconda_ Python distribution. But you don't know how to use it yet. The best way to get familiar is to work through a tutorial. That's what the next exercise is about
# 
# > _Exercises_:
# > 
# > * Go to the `NetworkX` project's [tutorial page](https://networkx.github.io/documentation/stable/tutorial/index.html). The goal of this exercise is to create your own `notebook` that contains the entire tutorial. You're free to add your own (e.g. shorter) comments in place of the ones in the official tutorial - and change the code to make it your own where ever it makes sense. 
# > * Go to NS Section 2.12: Homework, then
# >   * Write the solution exercise **2.1** (the 'Königsberg Problem') from NS in your `notebook`. 
# >   * Solve exercise **2.3** ('Graph representation') from NS using `NetworkX` in your `notebook`. (You don't have to solve the last sub-question about *cycles of length 4* ... but I'll be impressed if you do it).
# >   * Solve exercise **2.5** ('Bipartite Networks') from NS using `NetworkX` in your `notebook`.
# > 

# > **_Video Lecture_**: Once again, it's time to stop working for a couple of minutes to hear me talk about plotting with `NetworkX`. 
# > 

# In[6]:


YouTubeVideo("iDlb9On_TDQ",width=800, height=450)


# # Part 3: Plotting degree distributions
# 
# As always we'll learn about degree-distribution plotting by creating a `notebook` and trying it out. 
# > 
# > _Exercises_:
# > 
# > Begin by importing the right packages. Start by importing `matplotlib.pyplot` (for plotting), `numpy` (for binning and other stuff), 
# > `random` (for generating random numbers), and `networkx` (for generating networks.)
# > 
# > * Binning real numbers
# >   * Let's do a gentle start and use the `random` library generate 5000 data points from a Gaussian distribution with $\mu = 2$ and $\sigma = 0.125$.
# >   * Now, let's use `numpy.histogram` to bin those number into 10 bins. What does the `numpy.histogram` function return? Do the two arrays have the same length?
# >   * Then we use `matplotlib.pyplot.plot` to plot the binned data. You will have to deal with the fact that the counts- and bin-arrays have different lengths. Explain how you deal with this problem and why.
# > * Binning integers
# >   * But binning real numbers into a fixed number of bins is easy when `numpy.histogram` does all the work and finds the right bin boundaries for you. 
# > Now we'll generate a bunch of integers and set the bin boundaries manually. This time, let's grab data from a Poisson distribution. As it turns out 
# > `numpy` also has some convenient random number generators. Use `numpy.random.poisson` to generate 5000 numbers drawn from a Poisson distribution 
# > characterized by $\lambda = 10$. Find the maximum and minimum value of your 5000 random numbers. 
# >   * Instead of simplify specifying the number of bins for `numpy.histogram`, let's specify the bins we want using a vector. 
# > Create a vector $v$
# > that results in a binning that puts each integer value in its own bin and where the first bin contains the minimum number you found above, 
# > and the last bin contains the maximum number. 
# > Use the vector by setting `numpy.histogram`'s `bin` parameter as `bin = ` $v$. 
# > What is the sum over bin counts? Explain how the binning-vectors first and last element 
# > relates to the min and max from the Poisson distribution.
# >   * Now, use a bar chart (`matplotlib.pyplot.bar`) to plot the distribution 
# > * Binning and plotting degree distributions. 
# >   * Let's generate the Erdös-Renyi (ER) network which has a degree distribution that matches the Poisson distribution above. 
# > First we have to figure out which values the ER parameters (_N_ and _p_) should assume. It's easy to see that $N = 5000$, but how do you find $p$? 
# > **Hint**: The parameter $\lambda$ in the Poisson distribution corresponds to the average degree, so you have to find a $p$ that results in an average 
# > degree, $k = 10$. And you know that $\langle k \rangle = p (N-1)$, which will give you $p$. Note that Python by default returns the result of divisions as the most precise of the datatypes involved (for instance, try computing `1/2` and `1.0/2.0` in your notebook). If you want division to always give you a decimal number, you can enter `from __future__ import division` at the beginning of your notebook.
# >   * Now, use `networkx` to create the graph and extract the degree distribution.
# >   * Finally, create a nice bar plot of the degree distribution, including axes labels and a plot title. Make sure that it looks like the Poisson distribution you plotted above.
# > 

# # Part 4: Random networks
# 
# > **_Video Lecture_**. Now it's time to relax and watch a few minutes of info on _Random Networks_. 

# In[7]:


YouTubeVideo("c_SbQCzgqb0",width=800, height=450)


# > _Reading_. Read section 3.1-3.7 (the most important part is 3.1-3.4) of Chapter 3 of _Network Science_. You can find the entire book [**here**](http://barabasi.com/networksciencebook/).
# 

# > _Exercises_ (should be completed in a `notebook`):
# > 
# > * Work through NS exercise 3.1 ('Erdős-Rényi Networks'). The exercise can be found in Section 3.11: Homework.
# > * Paths. Plot a random network with 200 nodes and an average degree of 1.5\. (I suggest using `networkx.draw` and reading [the documentation](https://networkx.github.io/documentation/networkx-1.10/reference/drawing.html) carefully to get an overview of all the options and what they look like. For example, you may want to shrink the node size).
# >   * Extract the Giant Connected Component, GCC. (Hint. You can use `networkx.connected_component_subgraphs`)
# >   * Choose a node at random from the GCC. (Hint: You may want to try `random.choice`.)
# >   * Find all nodes that are precisely 2 steps away from that node. (Hint. I suggest `networkx.single_source_shortest_path_length`)
# >   * Plot the GCC with the following choice of colors. Starting node _black_ (`"#000000"`). The nodes 2 steps away _red_ (`"#ff0000"`). All other nodes _blue_ (`"#A0CBE2"`). Again, I suggest using `networkx.draw()` and reading [the documentation](https://networkx.github.io/documentation/networkx-1.10/reference/drawing.html) carefully find out how to color individual nodes.
# > 
# 
# ![Sune's Network](https://raw.githubusercontent.com/suneman/socialgraphs2017/master/files/edge_colormap.png)

# In[ ]: