Installing and Using Prophet

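The first part of this post explains how to set up a python/prophet environment in a virtual machine; the second part gives a brief introduction to using prophet.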

Preparing the Vagrant environment

Create an Ubuntu virtual machine:

mkdir ubuntu
cd ubuntu
vagrant init hashicorp/precise64
vagrant up

Check that the virtual machine is running properly by logging in:

vagrant ssh

Edit the Vagrantfile that was just generated in the current directory:

# -*- mode: ruby -*-
# vi: set ft=ruby :

Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/trusty64"
  config.vm.box_check_update = false
  config.vm.network "private_network", ip: "192.168.7.7"
  config.vm.synced_folder "/home/me/projects", "/shared/projects"
  config.vm.provider "virtualbox" do |vb|
      vb.memory = "4096"
  end
  config.vm.provision :shell, path: "bootstrap.sh"
end

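The following settings are the important ones:

  1. config.vm.network sets the VM's IP address; in principle any private (intranet) IP address will do.
  2. config.vm.synced_folder sets up a directory that the VM can read and write. Here /home/me/projects on the host is mounted at /shared/projects in the VM.
  3. vb.memory sets the VM's memory size in MB. fbprophet needs at least 2 GB, and with exactly 2 GB the installation may still fail, which is why 4096 is used here.
  4. config.vm.provision :shell, path: names the provisioning script that initializes the VM.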

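The provisioning script, bootstrap.sh, is shown below. It assumes you have already downloaded the Anaconda installer, placed it in the same directory as the Vagrantfile, and renamed it to Anaconda.sh. Vagrant shares that directory with the VM as /vagrant by default, which is why the script changes to /vagrant first.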

#!/usr/bin/env bash

echo "Installing anaconda ..."
conda_installer=Anaconda.sh
cd /vagrant

chmod +x $conda_installer
./$conda_installer -b -p /opt/anaconda

echo "Configuring path ..."
cat >> /home/vagrant/.bashrc << END
# add for anaconda install
export PATH=/opt/anaconda/bin:\$PATH
# set locale
export LC_ALL="en_US.utf8"
END

# Unlike inside the heredoc above, $PATH must expand here, so it is not escaped.
export PATH=/opt/anaconda/bin:$PATH


echo "Setting timezone ..."
cp /usr/share/zoneinfo/Asia/Hong_Kong /etc/localtime

echo "Installing tushare and fbprophet"
pip install tushare
pip install fbprophet

Run the provisioning script:

vagrant provision

Once the installation has finished, restart the VM and log in again:

vagrant reload
vagrant ssh
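
Once back inside the VM, it can be worth confirming that the two packages installed by bootstrap.sh are visible to the Anaconda Python before going further. The following is a minimal sketch and not part of the original setup; the helper name missing_packages is made up for illustration, and only the standard library is used.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that the current interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # tushare and fbprophet are the two packages installed by bootstrap.sh.
    missing = missing_packages(["tushare", "fbprophet"])
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("tushare and fbprophet can both be found")
```

If either package is reported as not installed, re-running vagrant provision and reading its output is the natural next step.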

Using prophet

First, log in to the Vagrant VM:

vagrant ssh

Start jupyter notebook:

mkdir /shared/projects/play_prophet
cd /shared/projects/play_prophet
jupyter notebook --no-browser --ip=192.168.7.7

After a successful start you will see output like the following:

[I 15:26:46.475 NotebookApp] Serving notebooks from local directory: /shared/projects/play_prophet
[I 15:26:46.476 NotebookApp] 0 active kernels 
[I 15:26:46.476 NotebookApp] The Jupyter Notebook is running at: http://192.168.7.7:8888/?token=1ed2eeef32e04bbf3cc5b06a0820e2fe58f3994ab91362eb
[I 15:26:46.477 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 15:26:46.477 NotebookApp] 
    
    Copy/paste this URL into your browser when you connect for the first time,
    to login with a token:
        http://192.168.7.7:8888/?token=1ed2eeef32e04bbf3cc5b06a0820e2fe58f3994ab91362eb

Now let's test how well fbprophet forecasts a stock's closing price:

import tushare as ts
import pandas as pd
import numpy as np
from fbprophet import Prophet 
import matplotlib.pyplot as plt

Use tushare to fetch the adjusted price history of an arbitrarily chosen stock:

hdata = ts.get_h_data('002041', start='2005-01-01')

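Clean up the data, keeping only the date and the closing price. The last five rows are left out of the training data so that they can be compared with the forecast later: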

hdata.sort_index(inplace=True)
hdata.reset_index(inplace=True)
hin = hdata[['date', 'close']]
hin = hin.iloc[0:-5]
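
The slice iloc[0:-5] drops the five most recent rows from the training data, and the comparison step later reads the same rows back with tail(5). As a rough illustration of that hold-out split, here is a sketch on a plain Python list; the function name split_holdout and the prices are invented for the example.

```python
def split_holdout(rows, n_holdout):
    """Split a chronologically sorted list into (training rows, held-out rows)."""
    if not 0 < n_holdout < len(rows):
        raise ValueError("n_holdout must be between 1 and len(rows) - 1")
    return rows[:-n_holdout], rows[-n_holdout:]


# Toy closing prices, oldest first (made-up numbers, not real quotes).
closes = [10.0, 10.2, 10.1, 10.4, 10.6, 10.5, 10.9, 11.0]
train, holdout = split_holdout(closes, 5)

print(train)    # the rows a model would be fitted on
print(holdout)  # the rows kept back for the later comparison
```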

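Rename the columns as prophet requires (ds for the date, y for the value to forecast), then take the logarithm of the closing price: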

hin = hin.rename(columns={'date':'ds', 'close': 'y'})
hin['y'] = np.log(hin['y'])
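
The post does not say why the price is log-transformed. One common motivation is that a model fitted in log space works with relative rather than absolute price changes. Whatever the reason, the forecast then comes back in log space and has to be passed through exp to obtain prices again. The sketch below shows that round trip with the standard math module instead of numpy; the three prices are the first three actual closes from the results table later in this post.

```python
import math

# Closing prices copied from the results table later in this post.
closes = [17.34, 17.25, 16.14]

# Forward transform: the values a model would be fitted on.
log_closes = [math.log(c) for c in closes]

# Inverse transform: exp() undoes log(), recovering the original prices.
recovered = [math.exp(v) for v in log_closes]

# Adding a constant in log space multiplies the price by a constant factor,
# so a straight-line trend in log space is a steady percentage change in price.
one_percent_up = math.exp(log_closes[0] + math.log(1.01))

print(recovered)
print(one_percent_up)  # approximately closes[0] * 1.01
```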

Now fit the prophet model:

p = Prophet()
p.fit(hin)

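Build the dates to forecast and run the prediction. The freq argument takes a pandas offset alias (see the pandas documentation for the full list of values); 'B' means business day, so the forecast covers the 10 weekdays after the last training date: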

future = p.make_future_dataframe(periods=10, freq='B')
forecast = p.predict(future)
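
To make the effect of freq='B' concrete, the sketch below lists the 10 weekdays that follow a given date using only the standard library. It is an illustration of the idea, not how pandas or prophet implement it; the function name next_business_days is hypothetical, and 2017-03-22 is assumed to be the last training date because the first held-out row in the results table is 2017-03-23.

```python
from datetime import date, timedelta


def next_business_days(last_day, periods):
    """Return the `periods` weekdays (Monday to Friday) that follow `last_day`."""
    days = []
    current = last_day
    while len(days) < periods:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            days.append(current)
    return days


# Assumption: 2017-03-22 is the last date in the training data.
future_days = next_business_days(date(2017, 3, 22), 10)
for day in future_days:
    print(day.isoformat(), day.strftime("%a"))
```

Note that a weekday calendar like this knows nothing about exchange holidays, so some forecast dates may fall on days with no trading.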

p.plot(forecast)

[Figure: Prophet prediction]

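Compare the forecast with the actual values, converting the log-scale predictions back to prices with np.exp: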

pred = forecast.loc[:, ['yhat', 'ds', 'yhat_lower', 'yhat_upper']]
pred['predicted'] = np.exp(pred['yhat'])
pred['predicted_lower'] = np.exp(pred['yhat_lower'])
pred['predicted_upper'] = np.exp(pred['yhat_upper'])

predicted = pred.tail(10).rename(columns={'ds':'date'})[['date', 'predicted', 'predicted_lower', 'predicted_upper']]
real = hdata.tail(5)[['date','close']]
pd.merge(predicted, real, on=['date'])
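
pd.merge performs an inner join by default, so although predicted holds 10 forecast rows, only the dates that also appear in real end up in the result. The following sketch mimics that behaviour with plain dictionaries; the function name inner_join_on_date and all numbers are invented for illustration.

```python
def inner_join_on_date(predicted, actual):
    """Keep only the dates present in both mappings, in date order."""
    return [(d, predicted[d], actual[d]) for d in sorted(predicted) if d in actual]


# Made-up values keyed by ISO date strings.
predicted_by_date = {"2017-01-03": 10.1, "2017-01-04": 10.2, "2017-01-05": 10.3}
actual_by_date = {"2017-01-02": 9.9, "2017-01-03": 10.0}

joined = inner_join_on_date(predicted_by_date, actual_by_date)
print(joined)  # only dates found on both sides remain
```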

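As the table below shows, the forecast is within about 1% of the actual close on the first two days, but on the last three days it overshoots by roughly 7% to 11% as the price drops. The actual close does stay inside the predicted interval on all five days, though that interval is very wide.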

   date        predicted  predicted_lower  predicted_upper  close
0  2017-03-23  17.195041  14.301807        20.919123        17.34
1  2017-03-24  17.156144  14.284265        20.564156        17.25
2  2017-03-27  17.288668  14.453228        20.825827        16.14
3  2017-03-28  17.306230  14.184511        20.776071        15.77
4  2017-03-29  17.354448  14.364676        21.059795        15.62
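
To put numbers on the gap between forecast and actual values, the sketch below computes the percentage error for each row of the table and checks whether the actual close fell inside the predicted interval. The figures are copied from the table; the helper name pct_error is made up, and the choice of percentage error as the measure is an assumption rather than something the original analysis used.

```python
# (date, predicted, predicted_lower, predicted_upper, close), copied from the table.
rows = [
    ("2017-03-23", 17.195041, 14.301807, 20.919123, 17.34),
    ("2017-03-24", 17.156144, 14.284265, 20.564156, 17.25),
    ("2017-03-27", 17.288668, 14.453228, 20.825827, 16.14),
    ("2017-03-28", 17.306230, 14.184511, 20.776071, 15.77),
    ("2017-03-29", 17.354448, 14.364676, 21.059795, 15.62),
]


def pct_error(predicted, actual):
    """Signed error of the prediction as a percentage of the actual value."""
    return (predicted - actual) / actual * 100.0


for day, pred, lower, upper, close in rows:
    inside = lower <= close <= upper
    print(f"{day}  error {pct_error(pred, close):+6.2f}%  inside interval: {inside}")

# Mean absolute percentage error over the five held-out days.
mape = sum(abs(pct_error(p, c)) for _, p, _, _, c in rows) / len(rows)
print(f"mean absolute percentage error: {mape:.2f}%")
```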