Cryptocurrency Price Prediction Using News and Social Media

In the past few years, a variety of experiments have used sentiment from Twitter to predict fluctuations in the price of Bitcoin. This study by three Stanford researchers sets itself apart by using news data in addition to social media data. Furthermore, the study analyses percent change in price rather than text sentiment, and it creates price predictions for Ethereum and Litecoin as well as Bitcoin.

The study used two data inputs. The first was the daily price data for the three coins, taken from Kaggle. The second came from a scraping tool built by the three Stanford researchers, which pulled about 34,000 cryptocurrency-related article headlines and tweets in total.

Supervised machine learning algorithms were used, with the data split 60%, 20%, and 20% into training, development, and test sets respectively. Both feature extraction and the classification model were implemented using the machine learning library scikit-learn. The classifier was a simple logistic regression model, and its output was aggregated for each day for each coin. The prediction for each coin on a given day was based on the majority of the binary labels (‘0’ indicated a decrease in price and ‘1’ indicated an increase in price).
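The split and the daily majority vote can be sketched in a few lines of plain Python. This is a minimal illustration, not the researchers' code: the function names are hypothetical, the split is assumed to preserve the original order of the data, and ties in the vote are assumed to count as an increase, since the summary does not say how either was handled. The logistic regression step itself is left out.

```python
from collections import defaultdict


def split_60_20_20(items):
    """Split a list into train/dev/test parts (60/20/20), keeping order.

    Assumption: an order-preserving split; the study may have shuffled.
    """
    n = len(items)
    train_end = n * 6 // 10  # integer arithmetic avoids float rounding
    dev_end = n * 8 // 10
    return items[:train_end], items[train_end:dev_end], items[dev_end:]


def daily_majority(predictions):
    """Aggregate per-text labels into one label per (coin, day).

    predictions: iterable of (coin, day, label) tuples, label in {0, 1},
    where 0 means a price decrease and 1 means a price increase.
    Assumption: a tie is resolved as 1 (increase).
    """
    votes = defaultdict(list)
    for coin, day, label in predictions:
        votes[(coin, day)].append(label)
    # Label is 1 when at least half of the votes for that day are 1.
    return {
        key: int(2 * sum(labels) >= len(labels))
        for key, labels in votes.items()
    }
```

Calling `daily_majority` on the classifier's output for every headline and tweet yields one up-or-down prediction per coin per day, which is the unit the results are reported in.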

For Bitcoin, the machine learning model predicted about half of the price increases correctly and, on average, was able to predict the price decreases of higher magnitude. For Ethereum, the model worked very well in predicting price increases: around 76% of those days were predicted accurately. On the other hand, only about 16% of price decreases were predicted correctly for this same coin. Overall, the model was able to predict larger price increases as well as decreases using non-technical data (news headlines and tweets).
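The percentages above appear to be per-class rates: of the days on which the price actually rose (or fell), the share that the model labeled correctly. Assuming that reading, a short sketch of the calculation follows; the function name is hypothetical and the calculation is not taken from the paper.

```python
def per_class_rate(actual, predicted):
    """Return, for each class, the share of actual days predicted correctly.

    actual, predicted: equal-length sequences of 0 (decrease) / 1 (increase).
    A class with no actual days maps to None.
    """
    rates = {}
    for cls in (0, 1):
        # Positions of the days whose true label is this class.
        idx = [i for i, a in enumerate(actual) if a == cls]
        if idx:
            rates[cls] = sum(predicted[i] == cls for i in idx) / len(idx)
        else:
            rates[cls] = None
    return rates
```

Reporting the two rates separately matters here: a model that leans toward predicting increases can score well on rising days while missing most falling days, which matches the pattern described for Ethereum.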

The researchers plan to build on these experiments to further improve the performance of the model. Specifically, they will investigate different strategies for labeling training data and integrate additional types of media, among other directions.

To read the full research paper by Connor Lamon, Eric Nielsen, and Eric Redondo, click here.

