184 year-old Indian library goes digital, including 444 yr-old book on Alexander (nextbigwhat.com)
172 points by jayadevan on Feb 22, 2013 | 33 comments


As a history amateur, it is painful to force myself never to probe into south indian history [of which I've heard so much, growing up there] simply because I've learned by experience that it only leads to frustration and disappointment at the lack of information and research. This work is incredible.


What "lack of history and research"? This field is full of amazing works. Three great examples (there are many more): http://books.google.com/books/about/Symbols_of_substance.htm... http://books.google.com/books/about/Languages_and_Nations.ht... http://ukcatalogue.oup.com/product/9780198063124.do


Those three books talk about the 1500-1700 Nayaka rule, languages in 1700-1800 and the third book's summary says it talks about the 12th-14th centuries. That doesn't qualify as evidence of anything. Here is what I'd pay to learn about -

1) Origins, dating and translations of the literature - the sangam lit, the andal songs, etc.

2) Exactly what the fuck was happening in south india from 500 BC to 500 AD? There are Roman coins found in south india and south indian coins found in Rome. One of the Ptolemies refers to a pandian king. Pliny the Elder complains about the amount of money spent on indian goods by the roman people. Apparently the romans also had a huge space reserved for indian peppers. One of the chinese explorers has some descriptions of a port in tamil nadu. All this more or less sums up what we know about this period. I know I have to cite sources but it will take a lot of time to hunt them down and I'll do it over the next few days. What else was happening? How did the people dress? Did they know about the greek and the roman ideals?

3) The pallava kingdom - Why were they so into sculptures? Why did they practice so much religious tolerance unlike the later cholas, for example? How did they come to power?

4) The chola kingdom - are the names of the kings really all that we have about the early cholas who ruled around jesus's time? How come they went out of power and then came back into power after five centuries? What were the pandyas doing when the cholas were in power? Were they hunted? Why were aditya karikala's murderers pardoned by raja raja chola? Is it truly about their caste? We have a shitload of kalvettus (stone inscriptions) from this time - can we translate them all and upload them online, please?

5) Do we have anything at all to go by in terms of the food they ate at any period in history? How did the language change over the two millennia?

Edit: There simply aren't that many books published about south indian history. There isn't all that much digging up either. There is one Nilakanta Sastry who is cited by everyone who talks of Cholas, but his books have been out of print for decades. In comparison, we have hundreds and hundreds of books published on every conceivable aspect of a number of other civilizations - about the changes in english, greek and latin over the years, the mayans, the rise and fall of rome, histories of the various european monarchies, the aztecs, etc. [all of which interest me hugely].


The problem is something like this: the government of India is currently totally uninterested in projects of this nature. What we do currently is treat development work as the sole priority and work towards that.

Indians have never shown interest in quality historical research. Heck, we don't even respect the symbols of history we have amidst us. We either demolish archaeological evidence (much of it was demolished during construction of the new airport road in Bangalore) or use the sites as places where lovers hang around to escape from their parents. Just look at the state of historical monuments and how badly they have been maintained; if it were not for some tourism value, even those would have long vanished.

Apart from that, much of the historical research comes from archaeological work carried out during the pre-independence British era or immediately after it. The Archaeological Survey of India is a joke, and much of the historical research is left to curious university professors who are severely underfunded.

I think even if research is carried out now, we are only likely to find out half the truth, and much of the story will have to be reconstructed by piecing things together.


but they still care about our heritage. isn't that why all the cities with 'English' names were rechristened? :p


> isn't that why all the cities with 'English' names were rechristened? :p

Interesting choice of words there - 'rechristened'. In any case, many of the names were reverted back to their original forms (not just cities, but things like surnames too).


The Keralans are pretty keen on the theory that Chinese tai qi and gong fu traditions derived from an ancient Keralan martial arts tradition, and do seem to have some evidence. There are also a lot of Chinese-style bamboo fishing nets still in use up the coast of Kerala and in to Karnataka. The only other place I've seen those is Zhejiang, north of Shanghai near Yangzhou, the old southern terminus of China's Grand Canal.

Also, the great era of Cambodia seems to have been founded by a family from south India, which is backed up by art historic evidence and the earliest Chinese diplomatic materials. The same goes for the Cham kingdom of Vietnam. That's a hell of a long way to sail, and further than the great Buddhist Borobudur monument on Java, Indonesia, which was also built by south Indians.

Another area of India I'm ultra curious about is Assam ... as everyone's going to Burma, Assam is clearly more interesting! (Written from Yunnan, just east of there, in southwest China)


Not adding anything useful to this, but the questions you've raised make me want to get into this - I've been playing around with this idea of capturing the oral histories that abound in the smaller towns in India and putting them up online for people to cross reference.


Well, ancient South India is indeed full of riddles (some of which will probably never be solved), but there is good research going on about it, e.g. a recent, very interesting study: http://www.ejvs.laurasianacademy.com/SouthernRec.pdf (PDF) and a lot of work on the Sangam age by the Institut Français de Pondichéry (http://www.ifpindia.org/-Indology-.html).


I agree! I spent three months moving through the south early last year and already found some apparent explanations in the library's books on some of the inscriptions I photographed at some ancient Jain sites in Tamil Nadu, Kerala and Karnataka. Some of these are geometric and may be of interest to HN readers... http://bit.ly/120T4GI and http://bit.ly/Yk4ukt for example.


The cynic in me wants to say, "Just wait until descendants of Alexander the Great sue you for copyright infringement."

I think this effort is great. I hope more and more history can be put on line and made accessible. I am a firm believer that knowing the past is the only way to know where you are going.


"How dare you digitize books from the library of Alexandria. That's theft!"

No idea what I'm trying to say here, by the way.


You can easily download the books from here: http://statelibrary.kerala.gov.in/rarebooks/site_media/


Hailing from Kerala, really happy to see this happening. This is indeed a great achievement, and hope there will be more initiatives of such kind which will help to bring together the vast amount of information scattered around in the sub continent.


While I commend them for digitizing, I'm somewhat disappointed that they are just scanned copies instead of selectable/searchable text. I wish they made them more accessible for reference by truly converting them to text.


This is something I know about.

If you waited until stuff was selectable/searchable (i.e. transcribed) then you'd never get these types of documents online. The scanning part of this process is tough considering that either a) very decent equipment must be bought, staff must be trained, extreme care must be taken handling old documents, all this costs money and takes time; b) the scanning must be outsourced because the archive doesn't have the competencies or tech and this is done at a cost.

Archives regard documents in their special collections as assets belonging to the institution. There is a resistance to putting them online. Once they are online they can be 'stolen'. If you want to have the whole lot transcribed first it could take decades because of the sheer volume of documents and the lack of researchers and archivists (and it's not really the job of an archivist to transcribe stuff, merely to synopsize for a descriptive list and catalogue). For instance, University College London (UCL), a liberal institution that counts Jeremy Bentham as its spiritual founder, wanted to get all of Bentham's documents online. Except the guy was a prolific correspondent. It was taking them years and years to transcribe the Bentham archive and they had only gotten (I don't remember exactly but something like) 2% through it. In the end they crowd-sourced the task[1] by scanning everything, putting the scans in a wiki and letting everyone on the net have a go at transcribing the documents - they are 94% done now. And that's just one historical figure. Take into account that documents need to be semantically marked up using standards like TEI[2] (Text Encoding Initiative; an XML format) and that researchers in these areas are not known for their techie skills and wouldn't know a programming language if it came up and bit them on the bum and you can see ...

Finally, the institution may never have done research on the documents in the archive and may want to vet everything before it goes online, or may be reluctant to 'give away' its jewels. There is a serious tension between enabling global research and respecting the 'property' of the archive. This is something that needs to be dealt with now, and it is what I'm a part of; at the moment we call it the digital humanities.

Hope that gives you an overview. Well done to the Kerala State Central Library.

[1] http://blogs.ucl.ac.uk/transcribe-bentham/

[2] http://www.tei-c.org/index.xml


This is something I know about, too, having once supervised this lab (in its previous incarnation as part of UVA's eText center): http://www.digitalcurationservices.org/. We used a pair of PhaseOne P40 scanning backs[1] with Hasselblad prime lenses, mounted aiming directly downward at a custom tabletop with adjustable book mounts. We used standard studio lighting (not strobes). Software-wise, for books and manuscripts we created batch jobs in Photoshop to minimally post-process the images (adjust levels & contrast, rename files, not much else). For 3D artifacts, everything was custom & manual. Files were scanned into a pair of Mac Pros for processing, then burned in duplicate to archival CD for filing and shipping to the Indian company[2] that had been contracted to transcribe and encode (SGML at that time) the text. We chose manual transcription over OCR because it gave us roughly 99.9% accuracy versus 95% accuracy (at that time -- 1997-1999), and Apex only charged $.03/pg. That rate bought us two transcribers independently encoding each page, with their work then compared against each other for error correction. In general we were quite happy with them.

Not including the staff, I think there was about $75k in hardware and about $5k/yr in consumables (mostly repairing book stands, buying CDs and replacing CD burners we wore out).

If anyone is curious, here are two projects I worked on: Early American Fiction (1789-1875): http://etext.lib.virginia.edu/eaf/ Walt Whitman Leaves of Grass archive (a whole bunch of versions): http://etext.lib.virginia.edu/whitman/

Note that these were both created 12-15yrs ago and offered both high quality scans and searchable text, and even basic comparison (split screen view of two texts: http://etext.lib.virginia.edu/whitman/whitframe2.html).

Subsequently, UVA's library has joined TEI and I'm sure things are much more modern now, but I wanted to provide a little more flavor to what you posted, with some more examples. Obviously, manuscripts in any language are time consuming to transcribe. They are often in poor condition and handwriting can be downright illegible, and don't get me started on issues with accurately transcribing original authors' own grammatical and spelling mistakes! Argh!

[1] http://www.phaseone.com/en/Camera-Systems/P-Series.aspx [2] http://www.apexglobal.in/apextranscription/index.htm
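
For anyone wondering what the "two transcribers compared against each other" step might look like in practice, here is a minimal sketch in Python. It is not what Apex or the lab actually ran (the comment above doesn't say); the function name `disagreements` and the sample strings are purely illustrative. The idea is only that wherever two independent transcriptions differ, at least one of them is wrong, so those spots get flagged for a human to check.

```python
import difflib


def disagreements(first: str, second: str) -> list[tuple[str, str]]:
    """Return (first_version, second_version) pairs where two
    independent transcriptions of the same page differ, word by word."""
    a_words = first.split()
    b_words = second.split()
    # autojunk=False keeps difflib from treating frequent words as noise.
    matcher = difflib.SequenceMatcher(None, a_words, b_words, autojunk=False)
    flagged = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # One side may be empty if a transcriber dropped or added words.
            flagged.append((" ".join(a_words[i1:i2]), " ".join(b_words[j1:j2])))
    return flagged


# Hypothetical sample: the second keyer mistyped one word.
keyer_a = "I celebrate myself and sing myself"
keyer_b = "I celebrate myself and sing mysclf"
for left, right in disagreements(keyer_a, keyer_b):
    print(f"check: {left!r} vs {right!r}")
```

Errors that both transcribers make identically slip through, which is presumably part of why the accuracy quoted is 99.9% rather than 100%.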


That's extremely interesting. Thanks for the detailed description. How many pages were your transcribers able to get through and at what rate? I've heard about the Whitman archive (it's often cited and quite famous I think). Are you still in the field? You know about Nines (http://www.nines.org/ for those that don't) I presume :)


It can always be transcribed later, the main thing is to get it digitized.


I concur; what has been done is a step in the right direction. Once it's digitized, even scanned copies can be transcribed via some technology (maybe OCR?) in the future, if not now.


Not sure about the state of OCR in other languages, especially obscure Indian ones. Might be cheaper to just have someone straight transcribe/translate these.


Google could probably help with that if they give them access to all those books, too.


Does anyone know what archival format the library is using?


The city's name is Thiruvananthapuram not Trivandrum.


Trivandrum is what the English called it. Bombay = Mumbai, same way. Let's not nitpick :)


Glad to see it online, but a 444 year old book on Alexander the Great is still ~2,000 years behind when he lived.


> 3,28,268

> 1,84,321

That's an odd way to represent a number.


It's the way numbers are written in India. 1,00,000 ( = 100,000 ) is 1 lakh and 1,00,00,000 ( = 10,000,000 ) is 1 crore. Millions and billions are not used a lot, but most people know what they are. See http://en.wikipedia.org/wiki/South_Asian_numbering_system
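
If it helps, the grouping rule can be sketched in a few lines of Python: the last three digits form one group, and everything to the left of them is grouped in twos. This is just an illustration of the rule as described above; `format_indian` is a made-up helper name, not a standard library function.

```python
def format_indian(n: int) -> str:
    """Format an integer with Indian-style digit grouping,
    e.g. 328268 -> '3,28,268'."""
    sign = "-" if n < 0 else ""
    digits = str(abs(n))
    if len(digits) <= 3:
        return sign + digits
    # The rightmost three digits are one group (hundreds, tens, ones).
    head, tail = digits[:-3], digits[-3:]
    groups = []
    # Everything to the left is split into pairs, working right to left.
    while len(head) > 2:
        groups.insert(0, head[-2:])
        head = head[:-2]
    groups.insert(0, head)
    return sign + ",".join(groups + [tail])


print(format_indian(328268))    # 3,28,268
print(format_indian(100000))    # 1,00,000 (1 lakh)
print(format_indian(10000000))  # 1,00,00,000 (1 crore)
```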


To add to what others have said, the Indian number system is used outside of India in a few other Asian countries too.


Indian number system.

3,28,268 = 3 lakhs 28 thousand and 268

1,84,321 = 1 lakh 84 thousand and 321


Yes, in simple terms:

1 Lakh = 100 Thousand

1 Crore = 10 Million

A lakh sits below a million; a crore falls between a million and a billion.

Digit by digit, for 1,000,000,000:

1 --> billion

0 --> 10 crore / 100 million

0 --> crore

0 --> million

0 --> lakh

0 --> ten thousand

0 --> thousand

0 --> hundred

0 --> tens

0 --> ones


The modern decimal numbering system was invented in India. Without it we'd be using Roman numerals. What we call "Arabic numerals" are called "Indian numerals" in Arabic-speaking countries because that's where the Arabs got them from!


I think the library and the books survived because the Kochi-Thiruvithankur area was never under foreign rule.



