Friday, March 1, 2019

[ Py DS ] Ch4 - Visualization with Matplotlib (Part1)

Source From Here 


Preface 
We’ll now take an in-depth look at the Matplotlib tool for visualization in Python. Matplotlib is a multiplatform data visualization library built on NumPy arrays, and designed to work with the broader SciPy stack. It was conceived by John Hunter in 2002, originally as a patch to IPython for enabling interactive MATLAB-style plotting via gnuplot from the IPython command line. IPython’s creator, Fernando Perez, was at the time scrambling to finish his PhD, and let John know he wouldn’t have time to review the patch for several months. John took this as a cue to set out on his own, and the Matplotlib package was born, with version 0.1 released in 2003. It received an early boost when it was adopted as the plotting package of choice of the Space Telescope Science Institute (the folks behind the Hubble Telescope), which financially supported Matplotlib’s development and greatly expanded its capabilities. 

One of Matplotlib’s most important features is its ability to play well with many operating systems and graphics backends. Matplotlib supports dozens of backends and output types, which means you can count on it to work regardless of which operating system you are using or which output format you wish. This cross-platform, everything-to-everyone approach has been one of the great strengths of Matplotlib. It has led to a large userbase, which in turn has led to an active developer base and Matplotlib’s powerful tools and ubiquity within the scientific Python world. 

In recent years, however, the interface and style of Matplotlib have begun to show their age. Newer tools like ggplot and ggvis in the R language, along with web visualization toolkits based on D3.js and HTML5 canvas, often make Matplotlib feel clunky and old-fashioned. Still, I’m of the opinion that we cannot ignore Matplotlib’s strength as a well-tested, cross-platform graphics engine. Recent Matplotlib versions make it relatively easy to set new global plotting styles (see “Customizing Matplotlib: Configurations and Stylesheets”), and people have been developing new packages that build on its powerful internals to drive Matplotlib via cleaner, more modern APIs. For example, Seaborn (discussed in “Visualization with Seaborn”), ggplot, HoloViews, Altair, and even Pandas itself can be used as wrappers around Matplotlib’s API. Even with wrappers like these, it is still often useful to dive into Matplotlib’s syntax to adjust the final plot output. For this reason, I believe that Matplotlib itself will remain a vital piece of the data visualization stack, even if new tools mean the community gradually moves away from using the Matplotlib API directly. 

General Matplotlib Tips 
Before we dive into the details of creating visualizations with Matplotlib, there are a few useful things you should know about using the package. 

Importing matplotlib 
Just as we use the np shorthand for NumPy and the pd shorthand for Pandas, we will use some standard shorthands for Matplotlib imports: 
    import matplotlib as mpl
    import matplotlib.pyplot as plt
The plt interface is what we will use most often, as we’ll see throughout this chapter. 

Setting Styles 
We will use the plt.style directive to choose appropriate aesthetic styles for our figures. Here we will set the classic style, which ensures that the plots we create use the classic Matplotlib style: 
    plt.style.use('classic')
Throughout this section, we will adjust this style as needed. Note that the stylesheets used here are supported as of Matplotlib version 1.5; if you are using an earlier version of Matplotlib, only the default style is available. For more information on stylesheets, see “Customizing Matplotlib with style sheets and rcParams”.
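As a small illustration of the style machinery, the sketch below lists the stylesheets that ship with the installed Matplotlib and applies the classic style only temporarily. The Agg backend line and the toy data are assumptions made so the snippet runs anywhere, including without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt

# Names of the stylesheets available in this Matplotlib installation
available_styles = sorted(plt.style.available)
print(available_styles[:5])

# Apply 'classic' only inside the with-block; the previous settings
# are restored when the block exits
with plt.style.context('classic'):
    fig, ax = plt.subplots()
    ax.plot([0, 1, 2], [0, 1, 4])  # toy data, purely illustrative
```

Using plt.style.context() instead of plt.style.use() is handy when only one figure should get a particular look.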

show() or No show()? How to Display Your Plots 
A visualization you can’t see won’t be of much use, but just how you view your Matplotlib plots depends on the context. The best use of Matplotlib differs depending on how you are using it; roughly, the three applicable contexts are using Matplotlib in a script, in an IPython terminal, or in an IPython notebook. 

Plotting from a script 
If you are using Matplotlib from within a script, the function plt.show() is your friend. plt.show() starts an event loop, looks for all currently active figure objects, and opens one or more interactive windows that display your figure or figures. So, for example, you may have a file called myplot.py containing the following: 
- myplot.py 
    import matplotlib.pyplot as plt
    import numpy as np

    x = np.linspace(0, 10, 100)  # 100 evenly spaced points from 0 to 10
    plt.plot(x, np.sin(x))
    plt.plot(x, np.cos(x))
    plt.show()
You can then run this script from the command-line prompt, which will result in a window opening with your figure displayed: 


The plt.show() command does a lot under the hood, as it must interact with your system’s interactive graphical backend. The details of this operation can vary greatly from system to system and even installation to installation, but Matplotlib does its best to hide all these details from you. 

One thing to be aware of: the plt.show() command should be used only once per Python session, and is most often seen at the very end of the script. Multiple show() commands can lead to unpredictable backend-dependent behavior, and should mostly be avoided. 

Plotting from an IPython shell 
It can be very convenient to use Matplotlib interactively within an IPython shell (see Chapter 1). IPython is built to work well with Matplotlib if you specify Matplotlib mode. To enable this mode, you can use the %matplotlib magic command after starting ipython: 
    In [1]: %matplotlib
    Using matplotlib backend: TkAgg

    In [2]: import matplotlib.pyplot as plt
At this point, any plt plot command will cause a figure window to open, and further commands can be run to update the plot. Some changes (such as modifying properties of lines that are already drawn) will not draw automatically; to force an update, use plt.draw(). Using plt.show() in Matplotlib mode is not required. 
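To make the plt.draw() remark concrete, here is a minimal sketch that changes properties of a line after it has been drawn and then requests a redraw. The variable names and the Agg backend are assumptions; in an interactive IPython session you would skip the backend line and watch the open window update.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)
line, = plt.plot(x, np.sin(x))  # plt.plot returns a list of Line2D objects

# Modify properties of the line that is already on the axes
line.set_linestyle('--')
line.set_color('red')

# Ask Matplotlib to redraw the current figure with the new properties
plt.draw()
```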

Plotting from an IPython notebook 
The IPython notebook is a browser-based interactive data analysis tool that can combine narrative, code, graphics, HTML elements, and much more into a single executable document. Plotting interactively within an IPython notebook can be done with the %matplotlib command, and works in a similar way to the IPython shell. In the IPython notebook, you also have the option of embedding graphics directly in the notebook, with two possible options: 
• %matplotlib notebook will lead to interactive plots embedded within the notebook
• %matplotlib inline will lead to static images of your plot embedded in the notebook

For this book, we will generally opt for %matplotlib inline.

Figure 4-1. Basic plotting example 
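A minimal sketch of code that would produce a figure like Figure 4-1, and that defines the fig object used in the next section. The data and line styles are assumptions; in a notebook you would run %matplotlib inline first rather than selecting the Agg backend.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)

fig = plt.figure()             # keep a reference so the figure can be saved later
plt.plot(x, np.sin(x), '-')    # solid sine curve
plt.plot(x, np.cos(x), '--')   # dashed cosine curve
```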

Saving Figures to File 
One nice feature of Matplotlib is the ability to save figures in a wide variety of formats. You can save a figure using the savefig() command. For example, to save the previous figure as a PNG file, you can run this: 
    In [21]: fig.savefig('my_figure.png')
We now have a file called my_figure.png in the current working directory. To confirm that it contains what we think it contains, let’s use the IPython Image object to display the contents of this file (Figure 4-2): 


In savefig(), the file format is inferred from the extension of the given filename. Depending on what backends you have installed, many different file formats are available. You can find the list of supported file types for your system by using the following method of the figure canvas object: 
    In [5]: fig.canvas.get_supported_filetypes()
    Out[5]:
    {'ps': 'Postscript',
     'eps': 'Encapsulated Postscript',
     'pdf': 'Portable Document Format',
     'pgf': 'PGF code for LaTeX',
     'png': 'Portable Network Graphics',
     'raw': 'Raw RGBA bitmap',
     'rgba': 'Raw RGBA bitmap',
     'svg': 'Scalable Vector Graphics',
     'svgz': 'Scalable Vector Graphics'}
Note that when saving your figure, it’s not necessary to use plt.show() or related commands discussed earlier. 
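The sketch below pulls the pieces of this section together: it saves one figure under two different extensions and checks which formats the current backend supports. The temporary directory, the file names, and the Agg backend are assumptions chosen so the example leaves nothing behind in your working directory.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)
fig = plt.figure()
plt.plot(x, np.sin(x))

# The output format is inferred from the file extension
out_dir = tempfile.mkdtemp()
png_path = os.path.join(out_dir, 'my_figure.png')
pdf_path = os.path.join(out_dir, 'my_figure.pdf')
fig.savefig(png_path)
fig.savefig(pdf_path)

# Formats supported by the current backend (keys are file extensions)
supported = fig.canvas.get_supported_filetypes()
print(sorted(supported))

# In a notebook, the saved PNG could then be displayed with:
#   from IPython.display import Image
#   Image(filename=png_path)
```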

Two Interfaces for the Price of One 
A potentially confusing feature of Matplotlib is its dual interfaces: a convenient MATLAB-style state-based interface, and a more powerful object-oriented interface. We’ll quickly highlight the differences between the two here. 

MATLAB-style interface 
Matplotlib was originally written as a Python alternative for MATLAB users, and much of its syntax reflects that fact. The MATLAB-style tools are contained in the pyplot (plt) interface. For example, the following code will probably look quite familiar to MATLAB users (Figure 4-3): 

Figure 4-3. Subplots using the MATLAB-style interface 
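A minimal sketch of MATLAB-style code that produces a two-panel figure like Figure 4-3. The data and the Agg backend are assumptions; the point is that each plt call acts on whichever panel was selected most recently.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)

fig = plt.figure()        # create a plot figure

# Create the first of two panels and make it the current axes
plt.subplot(2, 1, 1)      # (rows, columns, panel number)
plt.plot(x, np.sin(x))

# Create the second panel and make it the current axes
plt.subplot(2, 1, 2)
plt.plot(x, np.cos(x))
```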

It’s important to note that this interface is stateful: it keeps track of the “current” figure and axes, which are where all plt commands are applied. You can get a reference to these using the plt.gcf() (get current figure) and plt.gca() (get current axes) routines. 

While this stateful interface is fast and convenient for simple plots, it is easy to run into problems. For example, once the second panel is created, how can we go back and add something to the first? This is possible within the MATLAB-style interface, but a bit clunky. Fortunately, there is a better way. 
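One way to go back to an earlier panel in the stateful interface is to keep the object returned by plt.subplot() and re-select it with plt.sca() (set current axes). This is only a sketch under assumed variable names, and it also shows plt.gcf() and plt.gca() in action.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)

fig = plt.figure()
ax_top = plt.subplot(2, 1, 1)
plt.plot(x, np.sin(x))
ax_bottom = plt.subplot(2, 1, 2)
plt.plot(x, np.cos(x))

# After creating the second panel, it is the "current" axes
print(plt.gca() is ax_bottom)

# Make the first panel current again, then add another line to it
plt.sca(ax_top)
plt.plot(x, -np.sin(x))
print(plt.gcf() is fig)
```

Having to track and switch the current axes by hand is exactly the bookkeeping that the object-oriented interface avoids.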

Object-oriented interface 
The object-oriented interface is available for these more complicated situations, and for when you want more control over your figure. Rather than depending on some notion of an “active” figure or axes, in the object-oriented interface the plotting functions are methods of explicit Figure and Axes objects. To re-create the previous plot using this style of plotting, you might do the following (Figure 4-4): 
    # First create a grid of plots
    # ax will be an array of two Axes objects
    fig, ax = plt.subplots(2)

    # Call plot() method on the appropriate object
    ax[0].plot(x, np.sin(x))
    ax[1].plot(x, np.cos(x));
Figure 4-4. Subplots using the object-oriented interface 

For simpler plots, the choice of which style to use is largely a matter of preference, but the object-oriented approach can become a necessity as plots become more complicated. Throughout this chapter, we will switch between the MATLAB-style and object-oriented interfaces, depending on what is most convenient. In most cases, the difference is as small as switching plt.plot() to ax.plot(), but there are a few gotchas that we will highlight as they come up in the following sections. 

Simple Line Plots 
Perhaps the simplest of all plots is the visualization of a single function y = f(x) . Here we will take a first look at creating a simple plot of this type. As with all the following sections, we’ll start by setting up the notebook for plotting and importing the functions we will use: 
    %matplotlib inline
    import matplotlib.pyplot as plt
    # On Matplotlib 3.6 and later this style is named 'seaborn-v0_8-whitegrid'
    plt.style.use('seaborn-whitegrid')
    import numpy as np
For all Matplotlib plots, we start by creating a figure and an axes. In their simplest form, a figure and axes can be created as follows (Figure 4-5): 
    fig = plt.figure()
    ax = plt.axes()
Figure 4-5. An empty gridded axes 

In Matplotlib, the figure (an instance of the class plt.Figure) can be thought of as a single container that contains all the objects representing axes, graphics, text, and labels. The axes (an instance of the class plt.Axes) is what we see above: a bounding box with ticks and labels, which will eventually contain the plot elements that make up our visualization. Throughout this book, we’ll commonly use the variable name fig to refer to a figure instance, and ax to refer to an axes instance or group of axes instances. 
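The container relationship described here can be checked directly. The sketch below is only an illustration under the assumption of a non-interactive backend; it imports Figure and Axes from their home modules rather than through plt.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
from matplotlib.axes import Axes
from matplotlib.figure import Figure

fig = plt.figure()
ax = plt.axes()

# fig is the container; ax is one of the objects it holds
print(isinstance(fig, Figure), isinstance(ax, Axes))
print(fig.axes)          # list of all axes attached to this figure
print(ax.figure is fig)  # each axes keeps a reference back to its figure
```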

Once we have created an axes, we can use the ax.plot function to plot some data. Let’s start with a simple sinusoid (Figure 4-6): 
    fig = plt.figure()
    ax = plt.axes()
    x = np.linspace(0, 10, 1000)  # 1000 evenly spaced points from 0 to 10
    ax.plot(x, np.sin(x));
Figure 4-6. A simple sinusoid 

Alternatively, we can use the pyplot (plt) interface and let the figure and axes be created for us in the background (Figure 4-7): 
Figure 4-7. A simple sinusoid via the pyplot interface 
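A minimal sketch of the implicit approach behind Figure 4-7: a single plt.plot() call creates the figure and axes when none exist yet. The plt.close('all') call and the Agg backend are assumptions added so the example starts from a clean state and runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np

plt.close('all')  # start with no open figures

x = np.linspace(0, 10, 1000)
plt.plot(x, np.sin(x))  # figure and axes are created behind the scenes

# The implicitly created objects can still be retrieved afterwards
fig = plt.gcf()
ax = plt.gca()
```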

If we want to create a single figure with multiple lines, we can simply call the plot function multiple times (Figure 4-8): 
Figure 4-8. Over-plotting multiple lines 
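Over-plotting as in Figure 4-8 can be sketched as follows; the choice of sine and cosine is an assumption, and any two functions of x would do. Each call adds a line to the same current axes.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np

plt.close('all')  # start with no open figures

x = np.linspace(0, 10, 1000)
plt.plot(x, np.sin(x))  # first line
plt.plot(x, np.cos(x))  # second line, drawn on the same axes

ax = plt.gca()
```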

That’s all there is to plotting simple functions in Matplotlib! We’ll now dive into some more details about how to control the appearance of the axes and lines. 

Adjusting the Plot: Line Colors and Styles 
The first adjustment you might wish to make to a plot is to control the line colors and styles. The plt.plot() function takes additional arguments that can be used to specify these. To adjust the color, you can use the color keyword, which accepts a string argument representing virtually any imaginable color. The color can be specified in a variety of ways (Figure 4-9): 
    plt.plot(x, np.sin(x - 0), color='blue')        # specify color by name
    plt.plot(x, np.sin(x - 1), color='g')           # short color code (rgbcmyk)
    plt.plot(x, np.sin(x - 2), color='0.75')        # grayscale between 0 and 1
    plt.plot(x, np.sin(x - 3), color='#FFDD44')     # hex code (RRGGBB from 00 to FF)
    plt.plot(x, np.sin(x - 4), color=(1.0,0.2,0.3)) # RGB tuple, values between 0 and 1
    plt.plot(x, np.sin(x - 5), color='chartreuse'); # all HTML color names supported

Figure 4-9. Controlling the color of plot elements 

If no color is specified, Matplotlib will automatically cycle through a set of default colors for multiple lines. Similarly, you can adjust the line style using the linestyle keyword (Figure 4-10): 
    plt.plot(x, x + 0, linestyle='solid')
    plt.plot(x, x + 1, linestyle='dashed')
    plt.plot(x, x + 2, linestyle='dashdot')
    plt.plot(x, x + 3, linestyle='dotted');
    # For short, you can use the following codes:
    plt.plot(x, x + 4, linestyle='-')  # solid
    plt.plot(x, x + 5, linestyle='--') # dashed
    plt.plot(x, x + 6, linestyle='-.') # dashdot
    plt.plot(x, x + 7, linestyle=':'); # dotted
Figure 4-10. Example of various line styles 

If you would like to be extremely terse, these linestyle and color codes can be combined into a single nonkeyword argument to the plt.plot() function (Figure 4-11): 
    plt.plot(x, x + 0, '-g')  # solid green
    plt.plot(x, x + 1, '--c') # dashed cyan
    plt.plot(x, x + 2, '-.k') # dashdot black
    plt.plot(x, x + 3, ':r'); # dotted red
Figure 4-11. Controlling colors and styles with the shorthand syntax 

These single-character color codes reflect the standard abbreviations in the RGB (Red/Green/Blue) and CMYK (Cyan/Magenta/Yellow/blacK) color systems, commonly used for digital color graphics. There are many other keyword arguments that can be used to fine-tune the appearance of the plot; for more details, I’d suggest viewing the docstring of the plt.plot() function using IPython’s help tools. 
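As a taste of those other keyword arguments, the sketch below sets line width, markers, and transparency in one call. The specific values are arbitrary choices for illustration, and the Agg backend is an assumption so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 30)

line, = plt.plot(
    x, np.sin(x),
    color='blue',
    linestyle='--',
    linewidth=2.5,   # line thickness in points
    marker='o',      # draw a circle at every data point
    markersize=4,
    alpha=0.8,       # slight transparency
)
```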

Adjusting the Plot: Axes Limits 
Matplotlib does a decent job of choosing default axes limits for your plot, but sometimes it’s nice to have finer control. The most basic way to adjust axis limits is to use the plt.xlim() and plt.ylim() methods (Figure 4-12): 
    plt.plot(x, np.sin(x))
    plt.xlim(-1, 11)
    plt.ylim(-1.5, 1.5);
Figure 4-12. Example of setting axis limits 

If for some reason you’d like either axis to be displayed in reverse, you can simply reverse the order of the arguments (Figure 4-13): 
    plt.plot(x, np.sin(x))
    plt.xlim(10, 0)
    plt.ylim(1.2, -1.2);
Figure 4-13. Example of reversing the y-axis 

A useful related method is plt.axis() (note here the potential confusion between axes with an e, and axis with an i). The plt.axis() method allows you to set the x and y limits with a single call, by passing a list that specifies [xmin, xmax, ymin, ymax] (Figure 4-14): 
    plt.plot(x, np.sin(x))
    plt.axis([-1, 11, -1.5, 1.5]);
Figure 4-14. Setting the axis limits with plt.axis 

The plt.axis() method goes even beyond this, allowing you to do things like automatically tighten the bounds around the current plot (Figure 4-15): 
    plt.plot(x, np.sin(x))
    plt.axis('tight');
Figure 4-15. Example of a “tight” layout 

It allows even higher-level specifications, such as ensuring an equal aspect ratio so that on your screen, one unit in x is equal to one unit in y (Figure 4-16): 
    plt.plot(x, np.sin(x))
    plt.axis('equal');
Figure 4-16. Example of an “equal” layout, with units matched to the output resolution 

For more information on axis limits and the other capabilities of the plt.axis() method, refer to the plt.axis() docstring. 
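The same functions that set limits can also read them back, which is a quick way to confirm what a call actually did. A minimal sketch, assuming the limits used earlier in this section and a non-interactive backend:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np

plt.close('all')  # start with no open figures

x = np.linspace(0, 10, 1000)
plt.plot(x, np.sin(x))

# Set all four limits at once: [xmin, xmax, ymin, ymax]
plt.axis([-1, 11, -1.5, 1.5])

# Called without arguments, these functions return the current limits
limits = plt.axis()
print(limits)
print(plt.xlim(), plt.ylim())
```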

Labeling Plots 
As the last piece of this section, we’ll briefly look at the labeling of plots: titles, axis labels, and simple legends. Titles and axis labels are the simplest such labels; there are methods that can be used to quickly set them (Figure 4-17): 
    plt.plot(x, np.sin(x))
    plt.title("A Sine Curve")
    plt.xlabel("x")
    plt.ylabel("sin(x)");
Figure 4-17. Examples of axis labels and title 

You can adjust the position, size, and style of these labels using optional arguments to the function. For more information, see the Matplotlib documentation and the docstrings of each of these functions (plt.title, plt.xlabel, and plt.ylabel). 
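A few of those optional arguments in action, as a hedged sketch: loc positions the title, while the remaining keywords are text properties passed through to the label. The particular sizes and colors are arbitrary assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np

plt.close('all')  # start with no open figures

x = np.linspace(0, 10, 1000)
plt.plot(x, np.sin(x))

# Each function returns the Text object it created
title = plt.title("A Sine Curve", loc='left', fontsize=16)  # left-aligned, larger title
xlabel = plt.xlabel("x", fontsize=12, style='italic')
ylabel = plt.ylabel("sin(x)", fontsize=12, color='gray')
```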

When multiple lines are being shown within a single axes, it can be useful to create a plot legend that labels each line type. Again, Matplotlib has a built-in way of quickly creating such a legend. It is done via the (you guessed it) plt.legend() method. Though there are several valid ways of using this, I find it easiest to specify the label of each line using the label keyword of the plot function (Figure 4-18): 
    plt.plot(x, np.sin(x), '-g', label='sin(x)')
    plt.plot(x, np.cos(x), ':b', label='cos(x)')
    plt.axis('equal')
    plt.legend();
Figure 4-18. Plot legend example 

As you can see, the plt.legend() function keeps track of the line style and color, and matches these with the correct label. More information on specifying and formatting plot legends can be found in the plt.legend() docstring; additionally, we will cover some more advanced legend options in “Customizing Plot Legends”. 
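As a brief preview of legend formatting, the sketch below places the legend explicitly, removes its frame, and lays the entries out in two columns. The option values are illustrative assumptions, not recommendations.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np

plt.close('all')  # start with no open figures

x = np.linspace(0, 10, 1000)
plt.plot(x, np.sin(x), '-g', label='sin(x)')
plt.plot(x, np.cos(x), ':b', label='cos(x)')

# loc places the legend, frameon toggles the box, ncol sets the column count
legend = plt.legend(loc='upper right', frameon=False, ncol=2)

# The legend entries come from the label keyword of each plot call
labels = [text.get_text() for text in legend.get_texts()]
print(labels)
```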

Matplotlib Gotchas 
While most plt functions translate directly to ax methods (such as plt.plot() → ax.plot(), plt.legend() → ax.legend(), etc.), this is not the case for all commands. In particular, functions to set limits, labels, and titles are slightly modified. For transitioning between MATLAB-style functions and object-oriented methods, make the following changes: 
• plt.xlabel() → ax.set_xlabel()
• plt.ylabel() → ax.set_ylabel()
• plt.xlim() → ax.set_xlim()
• plt.ylim() → ax.set_ylim()
• plt.title() → ax.set_title()
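The translations in the list above can be sketched with the individual setter methods, one call per property. The data and values are assumptions carried over from the earlier examples in this section.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 1000)

fig, ax = plt.subplots()
ax.plot(x, np.sin(x), label='sin(x)')  # plt.plot()   -> ax.plot()

ax.set_xlabel('x')                     # plt.xlabel() -> ax.set_xlabel()
ax.set_ylabel('sin(x)')                # plt.ylabel() -> ax.set_ylabel()
ax.set_xlim(0, 10)                     # plt.xlim()   -> ax.set_xlim()
ax.set_ylim(-2, 2)                     # plt.ylim()   -> ax.set_ylim()
ax.set_title('A Simple Plot')          # plt.title()  -> ax.set_title()
ax.legend()                            # plt.legend() -> ax.legend()
```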

In the object-oriented interface to plotting, rather than calling these functions individually, it is often more convenient to use the ax.set() method to set all these properties at once (Figure 4-19): 
    ax = plt.axes()
    ax.plot(x, np.sin(x))
    ax.set(xlim=(0, 10), ylim=(-2, 2), xlabel='x', ylabel='sin(x)', title='A Simple Plot');
Figure 4-19. Example of using ax.set to set multiple properties at once
