To any future hackers: You don't need OCR. HQ has a simple WebSocket server that streams the questions and possible answers in real time. Set up an HTTP proxy on your phone to inspect the requests the app is making. You'll find lots of helpful stuff.
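For illustration, here is a minimal sketch of handling one streamed message once you have captured it. The message shape is purely an assumption: the field names `type`, `question`, `answers`, and `text` are hypothetical placeholders, so check what the proxy actually shows before relying on any of them.

```python
import json


def parse_question(raw_message):
    """Return (question, answers) if the message looks like a question, else None.

    The field names used here ("type", "question", "answers", "text") are
    hypothetical; the real message format has to be read off the proxy capture.
    """
    try:
        payload = json.loads(raw_message)
    except json.JSONDecodeError:
        # Not JSON at all (e.g. a keep-alive or garbage frame): ignore it.
        return None
    if not isinstance(payload, dict) or payload.get("type") != "question":
        return None
    question = payload.get("question")
    answers = [
        a.get("text") for a in payload.get("answers", []) if isinstance(a, dict)
    ]
    if not question or not answers:
        return None
    return question, answers
```

Anything that is not a question (chat, stats, malformed frames) comes back as `None`, so the caller can drop it and wait for the next message.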
Author here: At Mux, we experimented with using machine learning to predict HQ Trivia answers. We managed to get 80-90% accuracy across a dataset of around 500 questions.
The trickiest questions were relational questions (e.g. What's heavier, a pineapple or a Siamese cat?). Would appreciate any feedback on our approach (and happy to answer questions!).
I thought this was going to be a retrospective from the HQ Trivia team about how they were mediocre, given the scaling challenges and hiccups they are facing, and then solved it through ML!
This seems pretty misleading, since honestly 99% of the machine learning that goes on here happens when running the questions/answers through Google Search. There are probably millions of man-years of machine learning / information retrieval that have gone into Google Search.
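To make the point concrete, here is a rough sketch of how little is left to do once the search engine has returned its results. It assumes the result snippets have already been fetched as plain-text strings; the function names and the simple occurrence-count scoring are my own illustrative assumptions, not the approach described in the article.

```python
import re
from collections import Counter


def score_answers(snippets, answers):
    """Count case-insensitive whole-phrase occurrences of each answer in the snippets."""
    text = " ".join(snippets).lower()
    scores = Counter()
    for answer in answers:
        # Lookarounds keep "nice" from matching inside a longer word.
        pattern = r"(?<!\w)" + re.escape(answer.lower()) + r"(?!\w)"
        scores[answer] = len(re.findall(pattern, text))
    return scores


def best_answer(snippets, answers):
    """Pick the answer mentioned most often; ties go to the earliest answer."""
    scores = score_answers(snippets, answers)
    return max(answers, key=lambda a: scores[a])
```

A counter like this has no notion of relevance or ranking on its own; whatever accuracy it gets comes from the quality of the snippets it is handed.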
Then we find HQ eventually pivots to being a machine learning research platform once someone invents a perfectly scoring bot ;-)
Joking aside, I'd say HQ Trivia are getting savvier with the questions. A final question the other day was along the lines of "Which two female artists collectively have the same number of Grammys as Beyoncé?" with the answer being "Adele + Madonna", I believe.
Is there a dataset of past HQ questions and answers?