Pantip.com is the de facto discussion forum for everything in Thailand, and there is no close second. Its sub forums cover topics ranging from food and Thai television dramas to photography, finance, and politics. Think of it as Reddit, except that everyone, from your parents to your friends, occasionally goes on it. Pantip is the fourth most trafficked website in Thailand, after Google, Facebook, and YouTube, making it the most visited Thai website.
Rajdumnern, one of Pantip's sub forums, hosts the discussion of politics and current events in Thailand. Because of its active community and relatively anonymous membership, Rajdumnern is often in the crosshairs of political censorship. Having said that, Pantip is the only consistently available and collectable record of the general online (underground?) public's mood at each point in time, and I think in many ways it is the closest thing we have to a time machine: it lets us take a peek back into the past and relive the feelings Thai people experienced as a nation.
Thailand has been through a lot - experiencing 19 coups d'état since the transition from absolute monarchy to constitutional monarchy in 1932, with the most recent coup in May 2014. There is a saying that Thai people forget easily, meaning that as a nation we never learn from our past mistakes, and because of that, we are stuck in a cycle of political polarization. Going with the idea that Pantip provides a reflection of the past, I collected the posts from Rajdumnern and made a tool for everyone to explore the trends of different topics and get a brief idea of how things were over the past few years.
To collect the data, I built a scraper to gather every post on Rajdumnern created from December 2012, when Pantip upgraded the website, to the present day. I had wanted to collect the comments as well, but there were too many to be worth the effort and time - I had more than 1.6 million comments before giving up. The total number of posts came to more than 220,000.
The graph above shows the number of posts per day across the time frame mentioned earlier. You might notice that the spikes in the number of daily posts correspond to important political events in Thailand, and you would be right! For example, there were large numbers of posts between November '13 and June '14, corresponding to the political crisis in that period. Below the graph is the context window. Resizing it gives a more detailed view of the graph for the selected period, and also shows the top posts from that period, ranked by comment count and upvote count, in the table above.
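As a rough illustration of the two things the graph and table show, here is a minimal Python sketch of counting posts per day and picking the top posts inside a date window. The field names (`created`, `comments`, `upvotes`) and the sample rows are made up for the example; they are assumptions, not the actual schema of the scraped data or the code behind the graph.

```python
from collections import Counter
from datetime import date, datetime

# Hypothetical sample rows; the real scraped schema may differ.
sample_posts = [
    {"title": "A", "created": "2013-11-01T09:00:00", "comments": 12, "upvotes": 3},
    {"title": "B", "created": "2013-11-01T21:30:00", "comments": 40, "upvotes": 9},
    {"title": "C", "created": "2013-11-02T08:15:00", "comments": 5, "upvotes": 20},
]


def post_day(post):
    """Return the calendar date on which a post was created."""
    return datetime.fromisoformat(post["created"]).date()


def daily_post_counts(posts):
    """Map each calendar day to the number of posts created on it."""
    counts = Counter(post_day(p) for p in posts)
    return dict(sorted(counts.items()))


def top_posts(posts, start, end, by="comments", n=10):
    """Return up to n posts created in [start, end], highest `by` value first."""
    in_window = [p for p in posts if start <= post_day(p) <= end]
    return sorted(in_window, key=lambda p: p[by], reverse=True)[:n]


counts = daily_post_counts(sample_posts)
best = top_posts(sample_posts, date(2013, 11, 1), date(2013, 11, 2), by="upvotes", n=1)
```

Resizing the context window would then correspond to calling `top_posts` again with a new `start` and `end`.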
The red markers indicate spikes in the number of daily posts. These spikes are determined by an algorithm, so some of them may be missing or look a bit out of place. If you think about it, spikes are easy to pick out visually, but there isn't a clear quantitative definition of a spike. If you are interested in more of the technical details behind this project, I'll be writing them up here. To give more context to these spikes, we should try to understand the current events at each point in time. To do that, I grab ten news headlines from Google News with the keyword 'การเมือง' ('politics' in Thai) for the date of each spike. You can see the news to the right of the graph, and if you hover over the red markers, the news will change to the corresponding date.
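To make the "no clear quantitative definition" point concrete, here is one possible definition, sketched in Python. It is not necessarily the algorithm used for the red markers: it simply flags a day when its count exceeds the mean of the preceding `window` days by more than `k` standard deviations. The function name, `window`, `k`, and the sample counts are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def find_spikes(counts, window=7, k=3.0):
    """Return indices of days whose count is far above the trailing window.

    A day i is flagged when counts[i] > mean + k * std, where mean and std
    are computed over the `window` days immediately before it.
    """
    spikes = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        mu = mean(history)
        # Floor the spread at 1.0 so a perfectly flat history
        # does not turn a tiny bump into a spike.
        sigma = max(pstdev(history), 1.0)
        if counts[i] > mu + k * sigma:
            spikes.append(i)
    return spikes


# Made-up daily post counts with one obvious jump on day 7.
daily_counts = [100, 102, 98, 101, 99, 100, 103, 300, 101, 100]
spike_days = find_spikes(daily_counts)
```

Different choices of `window` and `k` move the markers around, which is one reason an automatic detector can miss a spike or place one where the eye would not. A large spike also inflates the statistics of the following windows, so a second jump soon after the first is harder to flag under this definition.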
Next, we can dive a little deeper to see how popular a topic is over time. In this case, we are interested in the proportion of posts per day that contain a certain keyword. Basically, any post with the keyword in its title or body is counted, and that count is divided by the total number of posts from that day. Like before, the spikes and news are also shown here. Notice that there is an input box below where you can put in any keyword you want to query. Play around and see if you can find an interesting keyword or trend!
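The proportion described above can be sketched in a few lines of Python. This is a minimal illustration under assumed field names (`created`, `title`, `body`) and made-up sample posts, not the code behind the graph. It uses plain substring matching, which sidesteps word segmentation, since Thai is written without spaces between words.

```python
from collections import defaultdict
from datetime import date, datetime

# Hypothetical sample rows; the real scraped schema may differ.
sample_posts = [
    {"created": "2014-05-22T10:00:00", "title": "การเมืองวันนี้", "body": "..."},
    {"created": "2014-05-22T11:00:00", "title": "ข่าว", "body": "คุยเรื่องการเมือง"},
    {"created": "2014-05-22T12:00:00", "title": "อาหาร", "body": "ก๋วยเตี๋ยว"},
    {"created": "2014-05-23T09:00:00", "title": "อาหาร", "body": "ข้าวมันไก่"},
]


def keyword_share_by_day(posts, keyword):
    """Map each day to the fraction of that day's posts containing `keyword`."""
    totals = defaultdict(int)  # posts per day
    hits = defaultdict(int)    # posts per day mentioning the keyword
    for post in posts:
        day = datetime.fromisoformat(post["created"]).date()
        totals[day] += 1
        # A post counts once if the keyword appears in its title or body.
        if keyword in post["title"] or keyword in post["body"]:
            hits[day] += 1
    return {day: hits[day] / totals[day] for day in sorted(totals)}


share = keyword_share_by_day(sample_posts, "การเมือง")
```

Dividing by the day's total rather than plotting raw counts keeps busy days from dominating: a keyword only rises on the graph when it takes up a larger share of the conversation. Calling the function once per keyword gives series that can be compared side by side.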
Now you might be wondering how popular different topics are relative to each other. As before, we look at the proportion of each keyword over time. We'll be looking at different keywords side by side, and like before you can put in your own keywords.
Let's see how recent political figures fare against each other.
Thai politics is pretty volatile, and 'hairy' if you look at the major characters. Professor Darren Zook at Berkeley, during his lecture on Thailand, said Thai politics is like a Korean drama, and I couldn't agree more! If you are interested, here's a great resource for becoming familiar with Thai politics.
I present to you the colors of Thai politics. In Thailand, the color you wear may be used to infer your political affiliation. It's not that big of an issue now, but there was definitely a period when you had to think twice about the color you wore for the day. No joke! There are the red 'แดง' shirts and the yellow 'เหลือง' shirts. The yellow shirts were the original movement that opposed Thaksin and his party. After the 2006 coup came the red shirts, who are basically the supporters of Thaksin, Yingluck, and the Puea Thai Party. Since then other colors have come and gone, usually in opposition to the red shirts. I also put in multi-color 'สลิ่ม' as a keyword, referring to the various colors that come and go; the term normally carries a negative connotation.
We are just scratching the surface of how we can explore and analyze the data from Pantip. I had thought about applying natural language processing and doing some sentiment analysis, but that is very difficult due to the structure of the Thai language (please let me know if you know a good way to do this). I think the data as it stands is too rudimentary to support fancier analytical methods for talking about Thai politics. This is an exciting start; there are many other ideas to explore! I'll be sure to collect more data from Pantip, so we can talk about something other than politics. Stay tuned!