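The first part of this article explains how to set up a Python/Prophet environment in a virtual machine; the second part briefly introduces how to use Prophet.
Preparing the Vagrant environment
Create an Ubuntu virtual machine: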
mkdir ubuntu
cd ubuntu
vagrant init hashicorp/precise64
vagrant up
Check that the virtual machine is running properly:
vagrant ssh
Edit the newly generated Vagrantfile in the current directory so that it reads:
# -*- mode: ruby -*-
# vi: set ft=ruby :
Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/trusty64"
  config.vm.box_check_update = false
  config.vm.network "private_network", ip: "192.168.7.7"
  config.vm.synced_folder "/home/me/projects", "/shared/projects"
  config.vm.provider "virtualbox" do |vb|
    vb.memory = "4096"
  end
  config.vm.provision :shell, path: "bootstrap.sh"
end
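The following settings are the most important:
config.vm.network
Sets the virtual machine's IP address. In principle, any private-network IP address will work.
config.vm.synced_folder
Sets the directory that the virtual machine can read and write.
vb.memory
Sets the memory size. fbprophet needs at least 2 GB of memory, and with exactly 2 GB the installation may still fail.
config.vm.provision :shell, path:
Sets the provisioning script.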
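Since the memory setting decides whether the fbprophet installation succeeds, it can be worth confirming from inside the guest how much memory the VM actually received. The following is a minimal sketch and not part of the original setup: the helper names parse_mem_total_kb and has_enough_memory are hypothetical, it assumes a Linux guest that exposes /proc/meminfo, and the 2 GB threshold is simply the minimum quoted above.

```python
import os
import re

# Minimum quoted in the text: fbprophet needs at least 2 GB of memory.
MIN_KB = 2 * 1024 * 1024


def parse_mem_total_kb(meminfo_text):
    """Return the MemTotal value (in kB) from /proc/meminfo-style text."""
    match = re.search(r"^MemTotal:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemTotal line not found")
    return int(match.group(1))


def has_enough_memory(meminfo_text, min_kb=MIN_KB):
    """True if the reported total memory is at least min_kb."""
    return parse_mem_total_kb(meminfo_text) >= min_kb


# Only runs on systems that expose /proc/meminfo (assumed: the Linux guest).
if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        text = f.read()
    print("MemTotal (kB):", parse_mem_total_kb(text))
    print("enough for fbprophet:", has_enough_memory(text))
```

MemTotal is always somewhat lower than the configured vb.memory because the kernel reserves part of the RAM, so a VM configured with exactly 2 GB will fail this check.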
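The provisioning script, bootstrap.sh, is shown below. Note that it assumes you have already downloaded the Anaconda installer, placed it in the same directory as the Vagrantfile, and renamed it to Anaconda.sh.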
#!/usr/bin/env bash
echo "Installing anaconda ..."
conda_installer=Anaconda.sh
cd /vagrant
chmod +x $conda_installer
./$conda_installer -b -p /opt/anaconda
echo "Configuring path ..."
cat >> /home/vagrant/.bashrc << END
# add for anaconda install
export PATH=/opt/anaconda/bin:\$PATH
# set locale
export LC_ALL="en_US.utf8"
END
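# make Anaconda's pip available for the rest of this script
# (no backslash here: outside the heredoc, \$PATH would be taken literally)
export PATH=/opt/anaconda/bin:$PATH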
echo "Setting timezone ..."
cp /usr/share/zoneinfo/Asia/Hong_Kong /etc/localtime
echo "Installing tushare and fbprophet"
pip install tushare
pip install fbprophet
Run the provisioning script:
vagrant provision
Once the installation has finished, restart the virtual machine and log in to its terminal again:
vagrant reload
vagrant ssh
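Before moving on, it may help to confirm that the provisioning really installed everything. The sketch below is an illustration rather than part of the original setup: find_missing is a hypothetical helper name, and it assumes the Anaconda installer ships Python 3.4 or later (needed for importlib.util.find_spec).

```python
import importlib.util
import sys


def find_missing(modules):
    """Return the top-level module names that cannot be located for import."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Packages the rest of this article relies on.
    required = ["tushare", "fbprophet", "pandas", "numpy", "matplotlib"]
    # The interpreter is expected to live under /opt/anaconda/bin.
    print("interpreter:", sys.executable)
    missing = find_missing(required)
    print("missing packages:", missing if missing else "none")
```

If the interpreter path is not under /opt/anaconda, the PATH line added to .bashrc has probably not taken effect in the current shell.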
Using Prophet
First, log in to the Vagrant virtual machine:
vagrant ssh
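Start Jupyter Notebook. If the server only listens on localhost and you want to open it from the host at the VM's private IP address, also pass --ip=0.0.0.0: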
mkdir /shared/projects/play_prophet
cd /shared/projects/play_prophet
jupyter notebook --no-browser
After a successful start, you will see output like the following:
[I 15:26:46.475 NotebookApp] Serving notebooks from local directory: /shared/projects/play_prophet
[I 15:26:46.476 NotebookApp] 0 active kernels
[I 15:26:46.476 NotebookApp] The Jupyter Notebook is running at: http://192.168.7.7:8888/?token=1ed2eeef32e04bbf3cc5b06a0820e2fe58f3994ab91362eb
[I 15:26:46.477 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 15:26:46.477 NotebookApp]
Copy/paste this URL into your browser when you connect for the first time,
to login with a token:
http://192.168.7.7:8888/?token=1ed2eeef32e04bbf3cc5b06a0820e2fe58f3994ab91362eb
Now let's see how well fbprophet predicts a stock's closing price:
import tushare as ts
import pandas as pd
import numpy as np
from fbprophet import Prophet
import matplotlib.pyplot as plt
Use tushare to fetch the price-adjusted historical data of an arbitrary stock:
hdata = ts.get_h_data('002041', start='2005-01-01')
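Clean up the data, keeping only the date and the closing price. The last five rows are held out so that they can be compared with the forecast later: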
hdata.sort_index(inplace=True)
hdata.reset_index(inplace=True)
hin = hdata[['date', 'close']]
hin = hin.iloc[0:-5]
Rename the columns as Prophet requires (ds for the date, y for the value) and take the logarithm of the price:
hin = hin.rename(columns={'date':'ds', 'close': 'y'})
hin['y'] = np.log(hin['y'])
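The preparation steps above can be collected into one small function and tried on synthetic data, which makes it easy to see that np.exp undoes the log transform. This is only a sketch: to_prophet_frame is a hypothetical helper name, and the prices below are made up rather than taken from tushare.

```python
import numpy as np
import pandas as pd


def to_prophet_frame(prices, holdout=5):
    """Drop the last `holdout` rows, rename to ds/y and log-transform y."""
    train = prices[["date", "close"]].iloc[:len(prices) - holdout]
    train = train.rename(columns={"date": "ds", "close": "y"})
    train["y"] = np.log(train["y"])
    return train


# Synthetic stand-in for the tushare data (made-up prices, not real quotes).
prices = pd.DataFrame({
    "date": pd.date_range("2017-03-01", periods=8, freq="B"),
    "close": [10.0, 10.5, 10.2, 10.8, 11.0, 11.3, 11.1, 11.6],
})
train = to_prophet_frame(prices)
print(train)
# np.exp reverses the log transform and recovers the original prices.
print(np.exp(train["y"]).round(2).tolist())
```

Fitting on the log of the price means the forecast comes back on the log scale as well, which is why the comparison step later applies np.exp to yhat and its bounds.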
Now fit the model and make the forecast with Prophet:
p = Prophet()
p.fit(hin)
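The meaning of the different freq values is described in the pandas documentation on offset aliases; 'B' stands for business days: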
future = p.make_future_dataframe(periods=10, freq='B')
forecast = p.predict(future)
p.plot(forecast)
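To see which dates a business-day frequency produces, the future part of the frame can be approximated with plain pandas. This is an illustration under assumptions, not Prophet's own code: future_business_days is a hypothetical helper, and 2017-03-22 is assumed to be the last training date, inferred from the comparison table at the end of the article.

```python
import pandas as pd


def future_business_days(last_date, periods):
    """Return `periods` business days strictly after `last_date`."""
    # Generate one extra stamp, then keep only the dates after last_date.
    dates = pd.date_range(start=last_date, periods=periods + 1, freq="B")
    return dates[dates > pd.Timestamp(last_date)][:periods]


future_dates = future_business_days("2017-03-22", 10)
print(future_dates.strftime("%Y-%m-%d (%a)").tolist())
```

Note that 'B' only skips Saturdays and Sundays. Exchange holidays are not excluded, so some of the generated dates may not be trading days.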
Compare the forecast with the actual values:
pred = forecast.loc[:, ['yhat', 'ds', 'yhat_lower', 'yhat_upper']]
pred['predicted'] = np.exp(pred['yhat'])
pred['predicted_lower'] = np.exp(pred['yhat_lower'])
pred['predicted_upper'] = np.exp(pred['yhat_upper'])
predicted = pred.tail(10).rename(columns={'ds':'date'})[['date', 'predicted', 'predicted_lower', 'predicted_upper']]
real = hdata.tail(5)[['date','close']]
pd.merge(predicted, real, on=['date'])
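As the output below shows, the predicted values still differ considerably from the actual ones: by 2017-03-29 the forecast is about 11% above the actual close.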
        date  predicted  predicted_lower  predicted_upper  close
0 2017-03-23  17.195041        14.301807        20.919123  17.34
1 2017-03-24  17.156144        14.284265        20.564156  17.25
2 2017-03-27  17.288668        14.453228        20.825827  16.14
3 2017-03-28  17.306230        14.184511        20.776071  15.77
4 2017-03-29  17.354448        14.364676        21.059795  15.62
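The size of the gap can be put into numbers. The sketch below uses only the five rows of the table above; mean_abs_pct_error and coverage are hypothetical helper names, and the two metrics are just one reasonable way to summarize the comparison.

```python
# Values copied from the table above.
predicted = [17.195041, 17.156144, 17.288668, 17.306230, 17.354448]
lower = [14.301807, 14.284265, 14.453228, 14.184511, 14.364676]
upper = [20.919123, 20.564156, 20.825827, 20.776071, 21.059795]
close = [17.34, 17.25, 16.14, 15.77, 15.62]


def mean_abs_pct_error(actual, forecast):
    """Mean of |forecast - actual| / actual, as a fraction."""
    return sum(abs(f - a) / a for a, f in zip(actual, forecast)) / len(actual)


def coverage(actual, lo, hi):
    """Fraction of actual values that fall inside [lo, hi]."""
    inside = sum(1 for a, l, h in zip(actual, lo, hi) if l <= a <= h)
    return inside / len(actual)


print("MAPE: {:.1%}".format(mean_abs_pct_error(close, predicted)))
print("interval coverage: {:.0%}".format(coverage(close, lower, upper)))
```

The mean absolute percentage error over these five days is roughly 6%, and the error grows with each day. All five closing prices do fall inside the predicted interval, but the interval is several yuan wide, so this says little about the accuracy of the point forecast.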