Solving A Real-Life Problem With Search Engines

In an interview with BBC World (see Chinese bloggers debate Google), I said:

I wish somebody would take the position of the typical Chinese internet user. If one is going to advocate a boycott, I would like the criteria to be the material improvement in the life of the typical Chinese internet user.  I think talk of boycotting Google is a bad idea. People in China will not appreciate that because these are esoteric issues for them.
 
There are a number of search engines and there are many different ways of searching. People want more choice. Don't tell them they are free by advocating a boycott.  I conducted a little test. I searched for mention of the circumstances under which a supplement called Bingdian (Freezing Point) was recently banned in China. The editor of this supplement had written a letter of complaint.  Any mention of this on the local Baidu search engine has disappeared. In fact, when you put a banned search term in, the engine shuts down. If you put in a term like June 4 [the date of the Tiananmen Square massacre] the result is "Not Found". And then you can't search again for 30 minutes. It's a very upsetting experience.  But with Google.cn there are different ways of finding things. You can try any number of subtle combinations. Google gives you more opportunities to triangulate.

In like manner, Keso asked: 

当你不得不在有原则的与经过阉割的之间做一个选择的时候,你会选择哪一个?

[When you are forced to make a choice between the principled "No" and the castrated "Yes", which one would you rather choose?]

In the following, I am going to run through an exercise about a real-life problem (as opposed to "toy" problems with general terms such as "freedom" or "democracy").  I suggest that if you really want to find out about freedom or democracy, the Internet is not a good place.  You will get bits and pieces of contradictory information and you will have a hard time sorting it out or systematizing it.  Instead, go to a bookstore, where any number of decent books present the issues in a systematic way.

The Internet's most usual function is to track down the details of special topics or suddenly breaking events that you just got wind of and want more information about.  The following is an example based upon a major breaking event of international significance -- the shutdown of China Youth Daily's Freezing Point Weekly Magazine.  The Chinese media have gone silent under orders, and "Freezing Point (冰点)" is a sensitive term at blogs and forums.  The notice from the Central Propaganda Department referred specifically to the essay titled "Modernization and History Textbooks" by Yuan Weishi published on January 11, 2006 (see History Textbooks in China).

Suppose I want to locate the original essay through a search engine so that I can read the full text and think about it.  I don't want some western newspaper's one-paragraph summary, and I would rather think for myself.

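Here are the four keywords: 现代化与历史教科书 (the essay title, "Modernization and History Textbooks"), 袁伟时 (Yuan Weishi, the author), 中国青年报 (China Youth Daily, the newspaper) and 冰点 (Freezing Point, the weekly magazine).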

The searches below were done by me and other contributors inside China.  These results were obtained at one moment in time, so you may see something quite different at a different time or place.  What do you expect?  This is China.  Some bureaucrat may have read this blog post and immediately taken action!  I resent being paranoid but that does not mean it isn't true ...


Exercise #1: Input the title of the essay "现代化与历史教科书" into Baidu, Google.cn and Google.com

Baidu (22,400 results in 0.050 seconds).  The top result goes directly to the original China Youth Daily page and it still works.

Google.cn (3,270 results in 0.13 seconds)  The top result goes to a copy of the article posted at People.com.cn (the website for People's Daily) and it still works.

Google.com (233,000 results in 0.11 seconds from Beijing).  The top result goes directly to the original China Youth Daily page.

Google.com (244,000 results in 0.32 seconds from Hong Kong; technical note: even though I entered the simplified characters, an automatic conversion based upon my preferences took place and the search was conducted in traditional as well as simplified characters)  The top result goes to People.com.cn and the next result goes to the original China Youth Daily page.

In conclusion, the original essay can be obtained by any of these search engines if the essay title is used (but look for the update at the bottom of the page!!!).


Exercise #2:  Suppose I don't know the title of the essay itself.  I only know the name of the author (Yuan Weishi), the newspaper (China Youth Daily) and the weekly magazine (Freezing Point).  So I input the three terms together 袁伟时+中国青年报+冰点.

Baidu (1,540 pages in 0.048 seconds)  The listings are mostly critiques of the essay, but the original essay cannot be located in this manner.

Google.cn (237 pages in 0.13 seconds)  These are critiques of Yuan's essays plus a lot of comments on some other unrelated matter.  Actually, the most commonly cited critique (就义和团运动的一些史实与袁伟时先生商榷) is very detailed and scholarly and not an "angry young people" assault at all.  I even want to see a response from Yuan because the rebuttal is very persuasive!  It was not about theory; it cited what specific individuals did at specific places at specific times according to specific historical documents (for example, about when the Boxers actually began to attack train stations).  Well, either the documents were faked or else there were other contradictory documents -- I'd love to know, because the answer not only explains the case but also illuminates how to read such documents.  There are also lessons here regarding the Boxers (or the Communists, for that matter), as the cited documents showed that the Boxers were not a homogeneous group; instead, there were different sects within and they did not always agree with each other on objectives, strategies and tactics.

Google.com (1,310 pages in 0.44 seconds when executed in Hong Kong)  The top result is an anti-Communist newspaper, the second result is an attack on the author for being a Chinese traitor and the third result is a China Times (Taiwan) piece on the Freezing Point shutdown.  The fourth result (ChinaEWeekly) carries a copy of the original essay. 

Google.com (0 results when executed in Beijing, China)  Why zero?  If the same list was obtained as from Hong Kong, then the top result alone is sufficient to set off alarm bells somewhere.

In conclusion, for the pragmatic person who is oriented towards obtaining results, the original essay cannot be found from inside China with this combination of keywords.


So what are the lessons?

Observation #1: The 'truth' is out there -- there are many copies of that essay out there.  Some are official publications (e.g. China Youth Daily, People.com, the portals, etc); some are forum posts; some are blog posts; they are inside and outside of China.  Even this blog carries a copy of the essay.  You just have to find one of the copies.

Observation #2: You will get different results depending on your choice of keywords.

Observation #3: You will get different results depending on your choice of search engines, because they have different coverage, ranking algorithms, commercial advertisements and filtering rules.

So the strategies are obvious.

Strategy #1: Choose your keywords smartly.  Given the highly publicized brouhaha, "Yuan Weishi"+"Freezing Point" is probably toxic whereas "Modernization"+"History Textbook" has to be acceptable.  I mean, they couldn't ban "modernization" or "history" or "textbook" from the Internet, could they?  And if you don't know the title, you can get it easily from those critiques -- they will list the title of the 'shameless' essay written by the 'Chinese traitor' (see the top Baidu result), and then you are on your way.

Strategy #2: Start off with your favorite search engine.  If you can't get anywhere, use another one and then another one if you want to get the job done.  The more choices, the better your odds.

A regular user of search engines, inside or outside of China, should know all this already.
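For readers who like to see things written down, the two strategies above can be sketched as a small loop: try each keyword combination on each search engine in turn and stop at the first result that looks like what you want.  This is only an illustration.  The function name find_first_hit, the two stand-in engines and the example.org addresses are all made up here, and no real search engine is contacted.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

# A "search engine" here is any callable that takes a query string and
# returns a list of result URLs. These are stand-ins; a real engine would
# need an HTTP client or an official API, which this sketch does not assume.
SearchEngine = Callable[[str], List[str]]


def find_first_hit(
    queries: Iterable[str],
    engines: Dict[str, SearchEngine],
    is_wanted: Callable[[str], bool],
) -> Optional[Tuple[str, str, str]]:
    """Try every query on every engine, in order, until a result passes is_wanted.

    Returns (engine_name, query, url) for the first hit, or None if nothing matched.
    """
    for query in queries:                      # Strategy #1: vary the keywords
        for name, search in engines.items():   # Strategy #2: vary the engine
            try:
                results = search(query)
            except Exception:
                # A blocked or failing engine is treated as "no results".
                continue
            for url in results:
                if is_wanted(url):
                    return name, query, url
    return None


# Hypothetical engines with invented behaviour, for demonstration only.
def strict_engine(query: str) -> List[str]:
    if "冰点" in query:
        raise ConnectionError("sensitive term: connection reset")
    return []


def lenient_engine(query: str) -> List[str]:
    if query == "现代化与历史教科书":
        return ["http://example.org/critique", "http://example.org/original-essay"]
    return []


hit = find_first_hit(
    queries=["袁伟时 中国青年报 冰点", "现代化与历史教科书"],
    engines={"strict": strict_engine, "lenient": lenient_engine},
    is_wanted=lambda url: url.endswith("original-essay"),
)
print(hit)
```

In this made-up run, the three-term query gets nowhere on either engine, and the essay title then succeeds on the second engine.  The order of the queries and engines is simply the order in which you would rather try them.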


Update # 1:  A friend from Anhui got these two results one day later.  The essay title no longer brings up the original essay at either China Youth Daily or People.com.cn.  Things have changed in a matter of days.  I have this unverified theory -- within the first 48 hours of a suddenly breaking event, everything is out there, and you should save everything that you regard as valuable, because the censors take that much time to realize what is happening and then react ... I cannot prove this, but you should take this as your operating standard.

(Google.cn)  (45,900 pages in 0.15 seconds)  You will like the irony ... the top result is a Chinese Wikipedia entry on "Problems With Japanese History Textbooks"!  And Wikipedia is not even accessible from China!

(Google.cn) (1,600 pages in 0.20 seconds) As before, there are only reports about the shutdown but not the original essay.

Update # 2 (email from Beijing):

While searching without a proxy server from Beijing, receiving search results that link to dajiyuan.com, peacehall.com, and other dissident sites is insufficient to trigger the "This document contains no data." response. For example, searching google.com for Gao Zhisheng (高智晟) also brings up a full page of links that are inaccessible from within China: Epoch Times, Renminbao, Boxun, Radio Free Asia.

That's exactly what is so nice about having google.com. It still indexes pages that are inaccessible from inside the GFW. Once you have the link, you can use a proxy server to get there.

What gets me a "This document contains no data" error? Well, from here, not Jiang Zemin, Bloody case of Shanwei, Zhao Ziyang, June 4th, Falun Gong (all in Chinese)... So, basically nothing. Reproducing your friend's Freezing Point search on google.com still brings up the Epoch Times and a bunch of other blocked links.  So, it could be due to regional variation on Chinese Internet censorship. That's not unheard of; blogspot.com was temporarily unblocked only in Beijing last fall. It could also be that "no data" was a crude temporary stopgap, and when the sky did not come crashing down because of Freezing Point, someone decided to open it back up again.