Cryptocurrency Price Prediction Using News and Social Media

In the past few years, a variety of experiments have used sentiment from Twitter to predict fluctuations in the price of Bitcoin. This study by three Stanford researchers sets itself apart from other studies in that it uses news data in addition to social media data. Furthermore, the study analyzes percent change in price rather than text sentiment, and it creates price predictions for Ethereum and Litecoin as well as Bitcoin.

The study used two data inputs. The first was daily price data for the three coins, taken from Kaggle. The second came from a scraping tool built by the three Stanford researchers, which pulled about 34,000 cryptocurrency-related article headlines and tweets in total.

Supervised machine learning algorithms were used, with the data split 60%, 20% and 20% into train, development and test sets respectively. Both the feature extraction and the classification model were implemented using the machine learning library scikit-learn. The model was a simple logistic regression, and the classifier output was aggregated for each day for each coin. The prediction for each day would be based on the majority of binary labels (‘0’ indicated a decrease in price and ‘1’ indicated an increase in price).
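The pipeline described above can be sketched in a few lines of scikit-learn. This is only an illustration under stated assumptions, not the researchers' code: the toy headlines, the dates, the bag-of-words features (CountVectorizer), the helper name daily_majority and the tie-breaking rule are all hypothetical choices made for the example.

```python
from collections import defaultdict

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Toy stand-in data (hypothetical): one text, one date and one label per item.
# Label 1 = price increased, label 0 = price decreased.
up_words = ["rally", "surge", "adoption", "record", "gain"]
down_words = ["hack", "ban", "crash", "selloff", "fraud"]
texts, labels, dates = [], [], []
for i in range(10):
    day = f"2018-01-{i % 4 + 1:02d}"
    texts += [f"bitcoin {up_words[i % 5]} reported",
              f"bitcoin {down_words[i % 5]} reported"]
    labels += [1, 0]
    dates += [day, day]

# 60% train, then split the remaining 40% evenly into development and test.
X_train, X_rest, y_train, y_rest, d_train, d_rest = train_test_split(
    texts, labels, dates, test_size=0.4, random_state=0)
X_dev, X_test, y_dev, y_test, d_dev, d_test = train_test_split(
    X_rest, y_rest, d_rest, test_size=0.5, random_state=0)

# Bag-of-words features (an assumption) feeding a logistic regression.
model = make_pipeline(CountVectorizer(), LogisticRegression())
model.fit(X_train, y_train)


def daily_majority(item_dates, item_labels):
    """Aggregate per-item 0/1 labels into one label per day by majority vote.

    Ties are resolved toward 1 here; that rule is an assumption.
    """
    votes = defaultdict(list)
    for day, label in zip(item_dates, item_labels):
        votes[day].append(int(label))
    return {day: int(2 * sum(v) >= len(v)) for day, v in votes.items()}


# One predicted direction per day in the test set.
daily_prediction = daily_majority(d_test, model.predict(X_test))
print(daily_prediction)
```

In a real setting the development set would be used to compare feature and model choices before touching the test set; here it is only created to show the 60/20/20 split.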

For Bitcoin, the machine learning model was able to predict half of the price increases correctly and, on average, was able to predict the price decreases of higher magnitude. As for Ethereum, the model worked very well in predicting price increases, as around 76% of those days were predicted accurately. On the other hand, only about 16% of price decreases were predicted for this same instrument. Overall, the model was able to predict larger price increases as well as decreases using non-technical data.
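Figures such as "76% of price increases predicted" read most naturally as per-class recall: of the days on which the price actually rose, the share the model labeled as a rise. That reading is an assumption, and the labels below are made up for illustration; they are not the study's data.

```python
from sklearn.metrics import recall_score

# Hypothetical daily labels: 1 = price increase, 0 = price decrease.
actual    = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 1, 1, 0]

# Share of actual increases that were predicted as increases.
recall_up = recall_score(actual, predicted, pos_label=1)
# Share of actual decreases that were predicted as decreases.
recall_down = recall_score(actual, predicted, pos_label=0)

print(f"increases caught: {recall_up:.0%}, decreases caught: {recall_down:.0%}")
```

A model that leans toward predicting "increase" will show exactly this pattern: high recall on increases and low recall on decreases, which is why both numbers are worth reporting side by side.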

The researchers plan to expand on this experimentation to continue to improve the performance of the model. Specifically, they will be investigating different strategies for labeling training data, integrating additional types of media and much more.

To read the full research paper by Connor Lamon, Eric Nielsen and Eric Redondo, click here.

Want to get started on creating your own machine learning algorithm? Check out Intro to Machine Learning in Less than 50 lines of Code.


Get your free API token and start algo trading today! To generate your token:

  1. Register for a free practice account here.
  2. Once the platform has loaded, click on the account ID in the upper right corner.
  3. Click token management and generate your token.


Risk Warning: The FXCM Group does not guarantee accuracy and will not accept liability for any loss or damage which arise directly or indirectly from use of or reliance on information contained within the webinars. The FXCM Group may provide general commentary which is not intended as investment advice and must not be construed as such. FX/CFD trading carries a risk of losses in excess of your deposited funds and may not be suitable for all investors. Please ensure that you fully understand the risks involved.

Demo Account: Although demo accounts attempt to replicate real markets, they operate in a simulated market environment. As such, there are key differences that distinguish them from real accounts, including but not limited to the lack of dependence on real-time market liquidity, a delay in pricing, and the availability of some products which may not be tradable on live accounts. The operational capabilities when executing orders in a demo environment may result in atypically expedited transactions, lack of rejected orders, and/or the absence of slippage. There may be instances where margin requirements differ from those of live accounts as updates to demo accounts may not always coincide with those of real accounts.