Board: BudaTech ◎ Buddhist Text Digitization Discussion   Moderator: HeavenChow
From: b83050@cctwin.ee.ntu.edu.tw (Post Gateway), Board: BudaTech
Subject: The Buddhist Scriptures on CD-ROM
Origin: received by the Lion's Roar station (Fri Mar 29 17:16:50 1996)

THE BUDDHIST SCRIPTURES ON CD-ROM

Dr. Supachai Tangwongsan
Dr. Damras Wongsawang
Miss Jiraporn Kiatpibool

INTRODUCTION

Many people have said, "This software would be another brave new world". It is the first of its kind in the world of international Buddhist studies. Mahidol University Computing Center (MUCC) is very proud to present the world's first complete digital edition of the Buddhist scriptures: the Tipitaka, a collection of scriptures representing the collected teachings and sayings of the Buddha, together with its commentary, the Atthakatha.

The Tipitaka is important as the root and basic reference for all teachings and explanations of Buddhism; as the standard against which teachings presented as Buddhism are measured; as a record of the beliefs, religions, traditions and events of many centuries past; and as an invaluable source of reference material for other fields of knowledge.

Since the commencement of the Buddhist era over 2500 years ago, there have been continuous efforts to preserve and maintain the Tipitaka so that it remains a religious heritage for coming generations. The media used to preserve its contents have depended on the technology of the age: first the method known as "Mukhapatha", that is, memorization and transmission by word of mouth; later engraved stone, leaf, cloth, paper and so on. With more complex technological advancement, computer storage devices have been used to hold the contents, e.g. hard disk, optical disk and finally CD-ROM, a disc 120 mm wide and 1.25 mm thick. What is more, a single CD-ROM carries the entire Thai and Romanized Pali text of the Tipitaka (45 volumes), the Atthakatha (55 volumes) and special scriptures (15 volumes), totalling more than 450 million characters.
The CD-ROM is small, light, relatively cheap and needs very little care. Data on a CD-ROM is also safe from viruses, which are a problem with other computer media. BUDSIR (BUDdhist Scriptures Information Retrieval) on CD-ROM, as the University has named it, is the first of its kind and will be available globally, making the study and research of Buddhism virtually boundless. MUCC also developed the BUDSIR program, which aids the search and retrieval of the contents of the digital edition of the scriptures and their commentary.

Development of BUDSIR began as a project to produce a computerized version of the Tipitaka in honour of His Majesty the King's Ratchamangklaphisek Ceremony (the celebration of the Longest Royal Enthronement Anniversary) and the celebration of His Majesty the King's 60th birthday. BUDSIR II, the first Romanized version of the Tipitaka, was developed in September 1989, providing another channel through which the study of Buddhism is accessible to the international community. BUDSIR III was developed in April 1990, allowing more complex search queries based on Boolean algebra. His Majesty the King Bhumibhol Adulyadej The Great continued to support the computerization of the Buddhist scriptures and their commentary, and BUDSIR IV was developed in November 1991, covering 45 volumes of the Tipitaka and 70 volumes of the Atthakatha and related scriptures. BUDSIR IV includes both the Thai and Romanized Pali versions of the scriptures and is thus the most complete. BUDSIR IV was designed to store the scriptures on a hard disk, which was found to be prone to virus attacks that often caused loss of information. BUDSIR on CD-ROM was therefore developed, and was completed in July 1994.
BUDSIR's internal structure uses mature and efficient information retrieval techniques usually found in large databases, and is designed for ease of use by users of all levels of competence.

OBJECTIVES

Anyone pursuing a particular subject in the Tipitaka and Atthakatha, which contain tremendous amounts of information, must overcome not only the barrier of the Pali language but also the overwhelming amount of information scattered under a variety of headings within a volume. Hence it is extremely difficult to retrieve the information in question accurately and exhaustively. The entire Tipitaka and Atthakatha have therefore been stored in digital form, so that any research that needs access to this huge database is greatly facilitated.

BUDSIR is unique in its accuracy, speed and completeness. It can retrieve any word (including compounds), phrase or stretch of text found in the Buddhist scriptures. Moreover, this digital edition can search the Tipitaka and Atthakatha simultaneously, showing the results in two separate windows so that they can be studied and compared.

THE DATA CONTENTS

The Buddhist scriptures included in the digital Tipitaka and Atthakatha consist of 115 volumes, or 50,189 pages of text. The data can be divided into two groups as follows:

1. The Pali Tipitaka in Thai script, Siamrattha version: 45 volumes with a total of 24 million characters. After computerized transliteration into Romanized script, the size becomes 31 million characters.

2. The Atthakatha, commentary and other important scriptures: 70 volumes with a total of 37 million characters, comprising:
   a. The Atthakatha: 55 volumes,
   b. The texts used in Thai monastic Pali examinations and two essential scriptures, the Milindapanha and the Bhikkhu Patimokkha-Pali.
   After computerized transliteration into Romanized script, the size increases to 47 million characters.
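
As a quick check on the figures in the list above, the following minimal sketch tallies the two groups. The numbers are the ones quoted in the list; the variable names are illustrative only, and the growth ratios are derived here rather than stated by the authors.

```python
# Figures quoted in the data-contents list (character counts in millions).
groups = {
    "Tipitaka": {"volumes": 45, "thai": 24, "roman": 31},
    "Atthakatha and other scriptures": {"volumes": 70, "thai": 37, "roman": 47},
}

total_volumes = sum(g["volumes"] for g in groups.values())
total_thai = sum(g["thai"] for g in groups.values())
total_roman = sum(g["roman"] for g in groups.values())

# Growth factor of each group after transliteration into Roman script.
growth = {name: round(g["roman"] / g["thai"], 2) for name, g in groups.items()}

print(total_volumes, "volumes")
print(total_thai, "million characters in Thai script")
print(total_roman, "million characters in Roman script")
print(growth)
```

The volume counts add up to the 115 volumes quoted above, and both groups grow by a little under 30 percent when romanized.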
The data was prepared with a Pali text editor developed by the MUCC. The data from each volume was entered twice and verified by a computer program which pinpointed any discrepancies between the two versions; these were then corrected until the two versions were identical. This was done by eighty typists, each working at a rate of thirty Pali words a minute, or on average 15 pages a day.

THE BUDSIR DATABASE

The database structure of the Digital Edition of Buddhist Scriptures is essentially an inverted file similar to that of the STAIRS system on the IBM mainframe. The system is composed of three main groups of data files: (1) the Text-block file, (2) the Dictionary file, and (3) the Inverted file.

The Text-block file is a computerized collection of all the data from the 115 printed volumes of the Tipitaka and Atthakatha.

The Dictionary file is a collection of all lexical items found in the Tipitaka and Atthakatha. The lexical items are arranged in a B-tree structure, with pointers cross-referring to the hierarchical levels of the tree.

The Inverted file is a list of occurrences of all the words found in the Text-block file. Each word is cross-referenced from the Dictionary file. The occurrence code consists of the volume number, page number, line number, word number and, when applicable, a flag indicating the last word of the line or the page. This facilitates data management in searching, particularly searching for adjacent words, including searching via Boolean operators in a future version.

BUDSIR IV - FEATURES SUMMARY

1. Inherent B-Tree Architecture
   The B-tree is known as a highly efficient structure for heavily accessed databases, and BUDSIR is built on this architecture.

2. Several Efficient Search Methodologies
   BUDSIR features two efficient search methods. The user can launch a search by word/phrase keyword or by volume/page/item indicator.

3. Dual Windows Display
   BUDSIR displays the Tipitaka and the Atthakatha independently in separate windows. The user can freely select which text to display in which window.

4. Works in a Graphical Environment
   BUDSIR runs entirely in graphics display mode; there is no need to modify the video graphics adapter to display the characters.

5. Pull-Down Menus and Mouse Support
   Any feature can be accessed using a hot-key, the pull-down menus or a mouse.

6. Printing
   BUDSIR supports every de facto standard 9-pin and 24-pin dot-matrix printer, as well as the HP LaserJet and compatible printers.

7. Saving a Scripture Passage to Disk
   BUDSIR allows the user to save any passage displayed on the screen to disk for private use. The text file saved by BUDSIR can be edited with general text editors.

HARDWARE REQUIREMENTS

To perform well, BUDSIR needs equipment meeting the following specifications:

1. An IBM PC, AT or PS/2 computer, or a true compatible, with an Intel 80386 or 80486 microprocessor,
2. At least 2 megabytes of RAM,
3. A Super VGA color graphics adapter and a matching monitor,
4. A standard CD-ROM drive for reading data on a CD-ROM,
5. A hard disk drive with a capacity of not less than 5 MB for BUDSIR's temporary working area,
6. A keyboard and a Microsoft-compatible mouse,
7. A floppy disk drive,
8. A printer,
9. MS-DOS version 5 or higher.

For Macintosh users, BUDSIR IV can also run on Macintosh computers (e.g. Mac II, LC, Classic, Quadra, Power PC) with the SoftWindows (or SoftAT or SoftPC) emulator program and OS version 7.0 or higher.

___________________________________________________________

Authors' address: Mahidol University Computing Center, Faculty of Science, Rama VI Rd., Bangkok 10400, THAILAND
Tel: (662) 247-0333, FAX: (662) 246-7308, Email: budsir@mahidol.ac.th
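
The double-entry check described under THE DATA CONTENTS (each volume keyed twice, then compared by program) might look roughly like the sketch below. This is only an illustration in Python, not MUCC's verification program: the function name `find_discrepancies` and the sample lines are hypothetical, and it assumes both typists keep the line layout of the printed page.

```python
from itertools import zip_longest


def find_discrepancies(first_keying, second_keying):
    """Return (line_no, word_no, word_a, word_b) for every mismatch.

    Both arguments are lists of text lines typed independently from the
    same printed page. A word missing from one keying is reported as None.
    """
    problems = []
    paired_lines = zip_longest(first_keying, second_keying, fillvalue="")
    for line_no, (line_a, line_b) in enumerate(paired_lines, start=1):
        paired_words = zip_longest(line_a.split(), line_b.split())
        for word_no, (word_a, word_b) in enumerate(paired_words, start=1):
            if word_a != word_b:
                problems.append((line_no, word_no, word_a, word_b))
    return problems


# Hypothetical sample: the second typist dropped one letter in line 2.
typist_1 = ["namo tassa bhagavato", "arahato sammasambuddhassa"]
typist_2 = ["namo tassa bhagavato", "arahato sammasambudhassa"]

for problem in find_discrepancies(typist_1, typist_2):
    print(problem)
```

Reporting the line and word position of each mismatch, rather than only a pass/fail result, is what lets a corrector go straight to the doubtful word; the cycle repeats until the function returns an empty list.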
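
The three-file layout described under THE BUDSIR DATABASE can be illustrated with a small in-memory sketch. Everything here is an assumption made for illustration and not BUDSIR's actual file format: a sorted Python list with binary search stands in for the B-tree dictionary, the names `Occurrence`, `build_index`, `lookup`, `phrase_search` and `pages_with_all` are hypothetical, and only the last-word-of-line flag is modelled, so a phrase that crosses a page break is not found.

```python
from bisect import bisect_left
from collections import defaultdict
from typing import NamedTuple


class Occurrence(NamedTuple):
    volume: int
    page: int
    line: int
    word: int            # position of the word within its line
    last_in_line: bool   # flag: this is the last word of the line


def build_index(text_blocks):
    """text_blocks maps (volume, page) to a list of text lines."""
    inverted = defaultdict(list)
    for (volume, page), lines in sorted(text_blocks.items()):
        for line_no, line in enumerate(lines, start=1):
            words = line.split()
            for word_no, w in enumerate(words, start=1):
                occ = Occurrence(volume, page, line_no, word_no,
                                 word_no == len(words))
                inverted[w].append(occ)
    dictionary = sorted(inverted)  # sorted list standing in for the B-tree
    return dictionary, inverted


def lookup(dictionary, inverted, word):
    """Find a word in the dictionary, then follow it to its occurrences."""
    i = bisect_left(dictionary, word)
    if i < len(dictionary) and dictionary[i] == word:
        return inverted[word]
    return []


def follows(a, b):
    """True if occurrence b is the word immediately after occurrence a."""
    if (a.volume, a.page) != (b.volume, b.page):
        return False
    if a.last_in_line:
        return b.line == a.line + 1 and b.word == 1
    return b.line == a.line and b.word == a.word + 1


def phrase_search(dictionary, inverted, first, second):
    """Occurrences of `first` that are directly followed by `second`."""
    seconds = lookup(dictionary, inverted, second)
    return [a for a in lookup(dictionary, inverted, first)
            if any(follows(a, b) for b in seconds)]


def pages_with_all(dictionary, inverted, *words):
    """Boolean AND: the (volume, page) pairs that contain every word."""
    page_sets = [{(o.volume, o.page) for o in lookup(dictionary, inverted, w)}
                 for w in words]
    return set.intersection(*page_sets) if page_sets else set()


# Hypothetical two-page text-block file.
text_blocks = {
    (1, 1): ["namo tassa", "bhagavato arahato"],
    (1, 2): ["tassa bhagavato"],
}
dictionary, inverted = build_index(text_blocks)
print(phrase_search(dictionary, inverted, "tassa", "bhagavato"))
print(pages_with_all(dictionary, inverted, "namo", "bhagavato"))
```

The end-of-line flag is what allows the adjacency test to match "tassa bhagavato" on page 1 even though the two words sit on different lines, without consulting the text-block file for line lengths.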
卍 National Taiwan University Lion's Roar Buddhist Studies Site http://buddhaspace.org